llvm (the middle-end) is getting slower, December edition


First of all, sorry for the long mail.
Inspired by the excellent analysis Rui did for lld, I decided to do
the same for llvm.
I’m personally very interested in build time in LTO configurations,
with particular attention to the time spent in the optimizer.

From our own offline regression testing, one of the biggest culprits in our experience is InstCombine’s known-bits calculation. A number of new known-bits checks have been added over the past few years (e.g., to infer nuw, nsw, etc. on various instructions), and the cost adds up quickly because it is paid even when InstCombine does nothing: the queries are issued while visiting every relevant instruction, whether or not a transformation fires.
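
To make the cost concrete, here is a minimal toy sketch (not LLVM’s actual code; Inst, KnownBits, and the depth cutoff are simplified stand-ins for llvm::Instruction, llvm::KnownBits, and ValueTracking’s recursion limit): each query recursively re-walks the operand graph with no memoization across queries.

    #include <cstdint>

    struct KnownBits {            // toy analogue of llvm::KnownBits
      uint64_t Zero = 0;          // bits known to be 0
      uint64_t One  = 0;          // bits known to be 1
    };

    struct Inst {                 // toy stand-in for llvm::Instruction
      enum Kind { Const, And, Or } K;
      uint64_t Imm = 0;           // payload when K == Const
      const Inst *Ops[2] = {nullptr, nullptr};
    };

    KnownBits computeKnownBits(const Inst *I, unsigned Depth = 0) {
      KnownBits KB;
      if (!I || Depth > 6)        // depth cutoff, like ValueTracking’s
        return KB;                // nothing known
      switch (I->K) {
      case Inst::Const:
        KB.One  = I->Imm;
        KB.Zero = ~I->Imm;
        break;
      case Inst::And: {           // a bit is 0 if it is 0 in either operand
        KnownBits L = computeKnownBits(I->Ops[0], Depth + 1);
        KnownBits R = computeKnownBits(I->Ops[1], Depth + 1);
        KB.Zero = L.Zero | R.Zero;
        KB.One  = L.One & R.One;
        break;
      }
      case Inst::Or: {            // a bit is 1 if it is 1 in either operand
        KnownBits L = computeKnownBits(I->Ops[0], Depth + 1);
        KnownBits R = computeKnownBits(I->Ops[1], Depth + 1);
        KB.One  = L.One | R.One;
        KB.Zero = L.Zero & R.Zero;
        break;
      }
      }
      return KB;
    }

Issuing one query per visited instruction repeats the same sub-walks over and over: with branching factor b and depth cutoff d, visiting N instructions is on the order of N * b^d work, done whether or not any transformation ends up firing.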

FWIW, I’ve started working on a patch to add a cache for InstCombine’s (ValueTracking’s) known-bits calculation. I hope to have it ready for posting soon.
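
For illustration, a cache along those lines might look like the hypothetical sketch below (reusing the toy Inst/KnownBits types from above; this is an assumption about the general shape of such a patch, not the patch itself). The memoization is the easy part; the delicate part is invalidation, since InstCombine mutates and replaces values as it runs:

    #include <unordered_map>

    struct KnownBitsCache {       // hypothetical; not the actual patch
      std::unordered_map<const Inst *, KnownBits> Map;

      KnownBits get(const Inst *I) {
        auto It = Map.find(I);
        if (It != Map.end())
          return It->second;      // hit: the recursive walk is skipped
        KnownBits KB = computeKnownBits(I);
        Map.emplace(I, KB);
        return KB;
      }

      // The hard part: an entry must be dropped whenever I, or anything
      // that transitively feeds it, is mutated or replaced; otherwise
      // the cache hands back stale facts.
      void invalidate(const Inst *I) { Map.erase(I); }
    };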

-Hal

That sounds great! Last time I looked into compile time, ~10 months ago, I also saw computeKnownBits as the biggest performance problem. Little things like the load/store optimization calling computeKnownBits in an attempt to improve the alignment predictions on loads and stores lead to many nodes getting queried over and over again. Feel free to add me as a reviewer!
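
The alignment case Matthias describes follows a simple pattern: the number of low pointer bits known to be zero gives a provable power-of-two alignment, so every load/store visit can trigger a fresh recursive known-bits query on its pointer operand. A hedged sketch, again reusing the toy types above (inferAlignment is a hypothetical name, and the cap of 29 just mirrors LLVM’s historical maximum alignment of 2^29):

    unsigned inferAlignment(const Inst *PtrBits) {
      KnownBits KB = computeKnownBits(PtrBits);   // full recursive walk per call
      unsigned TrailZ = 0;                        // count low bits provably zero
      while (TrailZ < 29 && ((KB.Zero >> TrailZ) & 1))
        ++TrailZ;
      return 1u << TrailZ;                        // 2^TrailZ-byte alignment
    }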

- Matthias