FYI -- potential compile time regression on boost spirit with r152737 and/or r152752

Just wanted to drop folks a note in case they started investigating issues…

Eric let me know that he was seeing a significant compile time regression (3x!!!) for O2 builds of Boost spirit on the nightly testers. The really weird thing is that this was only happening for the ARM-targeted build. =/ Very strange, and it makes it more likely that there is a smoking gun of “oh, oops”.

I strongly suspect one (or both) of the inliner changes I made as they were specifically targeting C++-y template and header-based code. I’ll be looking into these first thing in the morning, and I’ll revert if there isn’t any quick fix so that bots get back on their feet.

On the flip side, there seem to be some significant performance improvements for other benchmarks. =/ Hopefully the compile time issues can be sorted out reasonably.
-Chandler

Just a brief follow-up, mostly relaying my findings from IRC:

I looked in depth at loop_unroll to see why it slowed down. The inliner ran for 4.2% of the time, 2.7% of which was spent actually doing inlines. So the cost analysis is not hurting us here.

However, we are spending quite a bit of time in the optimizations I expect to benefit from better inlining: InstCombine, LSR, and GVN.
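To make that concrete, here is a rough sketch of the kind of header-style code where more inlining hands extra work to the scalar passes. The names (Scale, apply_twice, scaled) are made up for illustration and are not taken from loop_unroll or Boost; which pass ends up doing the folding is an assumption on my part.

```cpp
// Hypothetical header-style helpers; all names here are illustrative only.
template <typename T>
struct Scale {
  T factor;
  // Tiny call operator, typical of functor-heavy template code.
  T operator()(T x) const { return x * factor; }
};

// Generic wrapper that calls its functor twice.
template <typename F, typename T>
T apply_twice(F f, T x) {
  return f(f(x));
}

// Until apply_twice and Scale::operator() are inlined into this function,
// the optimizer only sees opaque calls. After inlining, the body is
// (x * 2) * 2, which later scalar passes are free to simplify.
int scaled(int x) {
  return apply_twice(Scale<int>{2}, x);
}
```

Calling scaled(3) yields 12 either way; the difference is only in how much straight-line code the later passes get to chew on, which is presumably where the extra compile time goes.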

And, thankfully, we’re getting significant runtime improvements from the time spent in these optimizers, so they aren’t going off the deep end; they’re actually simplifying code (if perhaps not as quickly as we’d like).

The conclusion seems to be that these patches are fine, and we just need to keep pressure on the scalar optimization passes to run as efficiently as possible. The improved inlining costs us compile time but seems to pay handsomely at runtime.

If folks have other significant compile-time regressions, I would be interested in having repro instructions. =]

Last but not least, the refactoring to do inline cost analysis per-callsite may actually make the analysis faster in several situations. As I’m going through this I’m finding lots of inefficiencies in the current design that should be fixed along the way.

-Chandler