I’m working on the S3D benchmark and have noticed that CodeGenPrepare::OptimizeMemoryInst(…) in lib/Transforms/Scalar/CodeGenPrepare.cpp isn’t scaling well. A whopping 99.3% of the compilation time for one file is spent in this function. The user time for this compilation is 3166 seconds with stock LLVM 3.1. If I disable the calls to OptimizeMemoryInst(…), the compilation time drops to 76 seconds.
It appears that OptimizeMemoryInst(…) was tuned for compile time in the past, but unfortunately this particular case was not covered. S3D has lots of matrix element references. For these references, our proprietary optimizer has already sunk the addressing code into the same basic block as the reference. OptimizeMemoryInst(…) ends up returning early with no changes made for almost every load/store in this particular file of S3D.
Before I try to tackle this issue, I would like to find out if CodeGenPrepare::OptimizeMemoryInst(…) has attracted anyone’s attention already. If so, any insight into this problem would be appreciated.
If it helps, it’s clear that the lion’s share of the time in OptimizeMemoryInst(…) is actually spent in AddressingModeMatcher::MatchAddr(…), matching the addressing mode of the loads/stores.
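
To make the pattern concrete, here is a toy model in plain C++ of what I believe is happening. It is not LLVM code: Block, Inst, matchAddr, optimizeMemoryInst, and demoAllLocal are all hypothetical stand-ins, and the assumption that the matcher runs in full before the locality check is my reading of the behavior, not something I have confirmed line by line. The point is only that the matching cost is paid even when the function then reports that there is nothing to change.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins; none of these are LLVM types.
struct Block { int id; };

struct Inst {
  const Block *parent;            // block that contains this instruction
  std::vector<const Inst *> ops;  // operands that are themselves instructions
};

unsigned long matchSteps = 0;  // counts the work done by the toy matcher

// Hypothetical stand-in for the address matcher: walks the whole address
// expression and records every instruction it visits as "folded".
void matchAddr(const Inst *addr, std::vector<const Inst *> &folded) {
  ++matchSteps;
  folded.push_back(addr);
  for (const Inst *op : addr->ops)
    matchAddr(op, folded);
}

// Hypothetical stand-in for the per-access optimization. Returns true only
// if some folded instruction lives outside the block of the memory access.
bool optimizeMemoryInst(const Inst *addr, const Block *useBlock) {
  std::vector<const Inst *> folded;
  matchAddr(addr, folded);  // the expensive part, always paid (assumption)
  for (const Inst *i : folded)
    if (i->parent != useBlock)
      return true;          // there would be something to sink
  return false;             // everything is already local: no change
}

// Builds a three-instruction address chain that lives entirely in one block
// and returns whether the toy pass reports a change.
bool demoAllLocal() {
  Block bb{0};
  Inst idx{&bb, {}};
  Inst mul{&bb, {&idx}};
  Inst add{&bb, {&mul}};
  return optimizeMemoryInst(&add, &bb);
}
```

In this model, demoAllLocal() reports no change, yet matchSteps still grows by one per instruction in the address expression. If the real code behaves similarly, any fix would need to either make the match cheaper or avoid it when the result cannot matter.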