It looks like the DeadStoreElimination optimization doesn't work across BasicBlock boundaries. The project I'm working on (https://github.com/trailofbits/mcsema), would tremendously benefit from even simple cross-block DSE.
There was a patch to do non-local DSE few years ago (http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028751.html), but seems that the patch was never merged.
Is there an existing way to do cross-block DSE?
Was there something wrong with the original non-local DSE patch that it wasn't merged?
If you have specific small examples which aren’t getting caught, I’d be interested in seeing test cases (as bugs.) EarlyCSE handles limited cases here. GVN probably could, but doesn’t seem to. (Which surprises me.) I suspect it got lost in review. It looks like there were some quality issues with the original implementation and the cycle never got closed.
I’ve attached two simplified examples: does_optimize.ll and does_not_optimize.ll.
In both examples, the load/store of %STACK_BASEVAL and %STACK_LIMITVAL should be removed, since the values are never used. The difference is that in does_optimize.ll, the store is in the same block as the load, and in does_not_optimize.dll, the store is in a different block.
When optimizing does_optimize.ll, the DeadStoreElimination pass removes the store as as “Remove Store of Load from same pointer” (http://llvm.org/docs/doxygen/html/DeadStoreElimination_8cpp_source.html#l00510).
When optimizing does_not_optimize.ll, the store is not removed, because the corresponding load is in a different block.
In both cases, we load a value and then store in back into the same location, so both load and store should be optimized out.
does_not_optimize.ll (1.99 KB)
does_optimize.ll (1.99 KB)
To do it “right”, you’d have to build SSU and do PRE in the reverse direction, which GVN doesn’t do. The patch posted actually does this “right” (building SSU and using it), though i haven’t looked through it to see if it just does SSU building and usage, or actually does PRE. Either way, it’s a good enough start
For normal cross-block cases, GVN does not value number loads or stores, so it will never discover this, unless it can do so through the load’s direct dependency.
IMHO, cleaning up the SSU based patch and getting it in would be the right way to go. It may require some post-dominance speedups and cleanups to get it fast enough and right enough (I know everyone hates the post-dominator tree, but there is no other way to do proper store sinking)