Assembly mismatch between Windows and Linux LLVM (probably caused by sort algorithm)

To whom it may concern,

I’m running some test cases (A and B) on Linux LLVM (built on Ubuntu 16.04) and Windows LLVM (built with Visual Studio 2015). Both are LLVM 4.0.0 and were built from the same source code, but I got different assembly files (A_Linux != A_Windows, B_Linux = B_Windows). For privacy reasons I cannot share my test cases here, sorry.

I compared the debug information and found the root cause in MachinePipeliner.cpp: the node order differed after the sort algorithm ran.

There are two std::sort calls in this file. When I replaced them with ‘std::stable_sort’, the assembly files for A became the same, but the assembly files for B then differed (A_Linux = A_Windows, B_Linux != B_Windows).

I cannot figure out the reason. Could you give me some advice on why this happens and which sort algorithm I should use to get exactly the same assembly files?

Best regards,

Ruobin

Hi Ruobin,

I have had similar problems in the past, and they are generally caused by a hash value being used to determine the relative ordering between two (or more) values which are otherwise considered equal. Since the implementation of the hashing differs between VC++ and ‘libstdc++’ or libc++, this can result in apparently non-deterministic ordering.

In my case the problem occurred when sorting BBs during scheduling, but it is likely that you are seeing something similar. I don’t remember the exact details, but I think I resolved it by using the BB# when my ordering test indicated that the values were equal.
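The tie-break described above can be sketched roughly as follows. This is only an illustration, not code from LLVM: `Block`, `Number`, `Priority`, `blockLess` and `sortedNumbers` are hypothetical names, with `Number` standing in for the BB#.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a basic block; not an LLVM type.
struct Block {
  int Number;    // stable, unique ID (plays the role of the BB#)
  int Priority;  // primary sort key; several blocks may share a value
};

// Compare by Priority first, then by Number. Because Number is unique,
// two distinct blocks never compare as equivalent, so the sorted order
// does not depend on how a particular std::sort implementation happens
// to arrange equivalent elements.
bool blockLess(const Block &A, const Block &B) {
  return std::tie(A.Priority, A.Number) < std::tie(B.Priority, B.Number);
}

// Sort a copy of the blocks and return their numbers in sorted order.
std::vector<int> sortedNumbers(std::vector<Block> Blocks) {
  std::sort(Blocks.begin(), Blocks.end(), blockLess);
  std::vector<int> Out;
  for (const Block &B : Blocks)
    Out.push_back(B.Number);
  return Out;
}
```

With a comparator like this, the result is the same whatever order the blocks arrive in, because the comparator defines a strict total order over them.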

If it is any consolation, this does not generally result in “wrong” code, just “different” code.

MartinO

Hi MartinO,

Thanks for your answer, but I still cannot figure out why std::stable_sort generated two different results. As far as I know, std::stable_sort guarantees that the order of equivalent elements is preserved, regardless of whether VC++ or libstdc++ is used.

Ruobin.
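One possible explanation, which nothing in this thread confirms for MachinePipeliner.cpp, is that the input to the sort already differs between the two platforms. std::stable_sort keeps equivalent elements in their input order, so a different input order gives a different output order. A minimal sketch, with `Node`, `Key` and `orderAfterStableSort` as hypothetical names:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical node type; Key is the only field the comparator looks at.
struct Node {
  std::string Name;
  int Key;
};

// Stable-sort a copy of the nodes by Key and return the names in order.
// Nodes with equal Key keep the relative order they had on input.
std::string orderAfterStableSort(std::vector<Node> Nodes) {
  std::stable_sort(Nodes.begin(), Nodes.end(),
                   [](const Node &A, const Node &B) { return A.Key < B.Key; });
  std::string Out;
  for (const Node &N : Nodes)
    Out += N.Name;
  return Out;
}
```

Here the input a, b, c (keys 1, 1, 0) sorts to "cab", while b, a, c with the same keys sorts to "cba". Both results are correct stable sorts; they differ only because the inputs did.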

If LLVM uses containers of pointers, then be aware that the relative ordering of pointers is completely undefined. It will vary between Linux and Windows, and it will even vary from run to run on the same system due to address space layout randomization (ASLR).
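To illustrate the difference, here is a small sketch that sorts the same container of pointers two ways. `Value`, `Id`, `sortByAddress` and `sortById` are hypothetical names, and `Id` is assumed to be a unique number assigned in a deterministic way (for example, creation order).

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical object with a stable, unique ID.
struct Value {
  int Id;
};

// Address-based order: the result depends on where each object happens
// to live in memory, so it can change between platforms and runs.
// std::less is used because it gives a total order even for pointers
// to unrelated objects.
void sortByAddress(std::vector<const Value *> &Vs) {
  std::sort(Vs.begin(), Vs.end(), std::less<const Value *>());
}

// ID-based order: the result depends only on the objects' contents,
// so it is the same regardless of memory layout.
void sortById(std::vector<const Value *> &Vs) {
  std::sort(Vs.begin(), Vs.end(),
            [](const Value *A, const Value *B) { return A->Id < B->Id; });
}
```

Any output that is produced by walking the container after `sortByAddress` inherits the layout dependence; after `sortById` it does not.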

I have noticed with past releases (3.7?) that LLVM bitcode output is not ASLR-stable. I am not sure if this is still the case. As MartinO had remarked, this is not necessarily “wrong” in that the code generated is still functionally correct, but it may complicate testing or other applications such as incremental compilation.

Interestingly, even though the bitcode is not stable, I have also noticed that if you write out assembly from the bitcode, then the assembly output from two different runs WILL be identical, even if the bitcode is not.

I noticed this while attempting to use a hash over the bitcode as a gating condition for incremental compilation. (Don’t recompile if hash has not changed since the last time bitcode was generated.) Using bitcode directly resulted in gratuitous recompilation due to the ASLR-instability. I subsequently used a hash over my own intermediate code format, which I ensured was ASLR stable, and that works just fine.
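A gating check of this kind might look roughly like the sketch below. It is not the code I actually used: `fnv1a` and `needsRecompile` are hypothetical names, the FNV-1a hash is swapped in purely for illustration, and the intermediate format is assumed to be available as a byte string that does not depend on memory addresses.

```cpp
#include <cstdint>
#include <string>

// 64-bit FNV-1a hash over the bytes of Data.
std::uint64_t fnv1a(const std::string &Data) {
  std::uint64_t H = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char C : Data) {
    H ^= C;
    H *= 1099511628211ULL;  // FNV prime
  }
  return H;
}

// Returns true when the serialized intermediate code no longer matches
// the hash recorded at the previous build. StableIR is assumed to be an
// address-independent serialization; if it embedded pointer values or
// pointer-ordered data, the hash would change from run to run.
bool needsRecompile(const std::string &StableIR, std::uint64_t LastHash) {
  return fnv1a(StableIR) != LastHash;
}
```

The check is only as good as the serialization fed into it: the hash function itself is deterministic, so any spurious recompilation comes from instability in the input.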