Inverted scheduling of grouped "store" operations?

Whenever possible, the WASM backend inverts the order of store operations. As far as I understand, this happens at the instruction selection (ISel) and scheduling stage.

Is there any reason why this behavior should be considered correct?

Here is an example:

This behavior is correct as long as the stores are not atomic or volatile, since the order is not observed by the function itself, and there are no guarantees about observability by any other thread or external system.
Whether it’s optimal is another question :) I would expect better cache behavior if the stores happened in order of increasing address, for example. Is this causing a particular issue for you?
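To make the distinction concrete, here is a minimal sketch in C. The struct `Quad` and the functions `fill_plain` and `fill_volatile` are hypothetical names made up for illustration; they are not the code from the original question. The first function uses plain stores, which the compiler is free to emit in any order; the second uses volatile stores, which have to be emitted in program order.

```c
#include <stdint.h>

/* Hypothetical example type: four adjacent 32-bit fields. */
struct Quad {
    uint32_t a, b, c, d;
};

/* Plain stores: the function itself never reads the fields back, and no
 * other thread is guaranteed to observe the intermediate states, so the
 * backend may emit these four stores in any order (including d, c, b, a). */
void fill_plain(struct Quad *q) {
    q->a = 1;
    q->b = 2;
    q->c = 3;
    q->d = 4;
}

/* Volatile stores: each access is an observable side effect, so the
 * compiler must keep them in program order (a, b, c, d). */
void fill_volatile(volatile struct Quad *q) {
    q->a = 1;
    q->b = 2;
    q->c = 3;
    q->d = 4;
}
```

Both functions leave the struct in the same final state; only the order of the emitted store instructions may differ. One way to compare them is to compile to assembly (for example with something like `clang --target=wasm32 -O2 -S`) and look at the order of the stores in each function.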

Yes, I didn’t mean that it could cause an error in the program, but if there is no reason behind this behavior, it’s hard for me to consider it correct. I still don’t understand whether this behavior is intentional or not.

Unfortunately, I don’t have detailed information about how JIT compilers in browsers handle this code. But my simple benchmarks (with code similar to the above) show that this transformation leads to performance degradation.

If you can confirm that this is not intentional behavior, then I can gather more information on how it affects the performance of the programs.

As far as I know, this is not done intentionally by the Wasm backend specifically (perhaps it’s an artifact of bottom-up scheduling of the DAG?). I’m not sure about the more general case. If you compile for other architectures without fancy SIMD instructions, the behavior doesn’t seem consistent (e.g. if you compile for i386 with -mno-sse you get the movs in ascending order, but if you compile for arm32 you get a sort of interleaving).

Yes, I debugged this case in llc and it is not directly related to WebAssembly. It happens in ScheduleDAGRRList.cpp/ScheduleDAGSDNodes.cpp. I just don’t have enough knowledge of the LLVM codebase to understand why it does that.