Enabling EarlyCSE w/ MemorySSA by default

Hi All,

I'd like to switch the function simplification run of EarlyCSE to use MemorySSA by default. This change has been in tree for a while and I've done benchmark and self-hosting testing on both aarch64 and x86, but I wanted to give other folks with different tests/targets a chance to try this out before I flip the switch. I've added an option '-enable-earlycse-memssa' (or '-enable-npm-gvn-hoist' if you're using the new pass manager) that you can be used to test this change out.

Thanks,

Not that I don't trust your self-hosting test strategy, but other
passes (e.g. GVNHoist or NewGVN) showed us that's not quite enough to
declare victory. I'll be much more confident if we can do at least a
round of fuzzing.

Thanks,

+1
If possible, it definitely helps to fuzz it.

Sure, we are happy to start some testing to see what we may help find.

For what it’s worth, I just ran this on PowerPC and a double bootstrap with lit and lnt tests passes everything.

I fixed the only bug that has been reported (there were others, but
turned out to be issues in other passes).
Zhendong, did you find anything else? If not, maybe let's wait another
week to give people another chance to test and then decide whether
to flip the switch.

Sounds good to me.

EarlyCSE w/ MemorySSA has been enabled by default as of r306477

Can you share you compile-time and memory footprint measurements at least for CTMark? For a new pass/feature it would be great to share this with the community before you commit. Or did I miss them?

Thanks
Gerolf

Our bots detected the following compile-time slowdowns between r306475 and r306478 (I’d suspect that r306477 is causing it).

O3/LTO:
CTMark/lencod/lencod 2.3%
CTMark/tramp3d-v4/tramp3d-v4 2.1%
CTMark/SPASS/SPASS 1.8%
CTMark/sqlite3/sqlite3 1.6%
CTMark/ClamAV/clamscan 1.3%
CTMark/mafft/pairlocalalign 0.9%
CTMark/kimwitu++/kc 0.9%
CTMark/Bullet/bullet 0.9%
CTMark/7zip/7zip-benchmark 0.8%

Os:
CTMark/SPASS/SPASS 1.2%
CTMark/tramp3d-v4/tramp3d-v4 1.1%
CTMark/sqlite3/sqlite3 0.9%

Thanks,
Michael

To try to explain this:
Originally, gvnhoist was enabled by default.
When that was happening, doing earlycse w/memoryssa would not have resulted in any meaningful slowdown in ctmark because they were next to each other in the pipeline, and both updated memoryssa.

Once gvnhoist was disabled again, now this appears as if it is a regression.

(and it is, just not from “the plan of record”, just from “the current state”).

IE At the time it had been decided that gvnhoist was worth the X% compile time cost. using MemorySSA in EarlyCSE did not have any overhead above that. Now that GVNHoist is disabled again, …

The percentages above are what one can expect to pay to use MemorySSA in the compiler.
We actually make all the percentages back (and then some) when NewGVN is on and GVN is off, and you can expect compile time to decrease as more things are moved to be MemorySSA based. In fact, had EarlyCSE used Memdep, i’m sure it’d be a compile time win already.

But previously, it used a simple O(1) store tracking scheme.
So the above is the cost of “building MemorySSA at all”.