I’d like to check if there are any thoughts on speculatively creating IR and throwing it away if it is not needed. We already do this in a couple of places, for example:
- LoopIdiomRecognize: expands SCEV expressions and removes them again if unused
- NewGVN: constructs IR instructions outside of basic blocks, applies simplifications to them.
I’d like to add another use-case of that pattern, which probably triggers quite a bit more frequently and also speculatively constructs larger snippets of IR. My goal is to teach LoopVectorize to generate runtime checks more aggressively when their overhead is small compared to the loop they guard. Currently there is a hard-coded cut-off on the maximum number of runtime checks to generate, which is too pessimistic for loops with large bodies and/or large trip counts.
My plan is instead to:
- generate the IR for the runtime checks after cost-modeling,
- then use LoopVectorize’s cost-model to estimate the overhead of the runtime checks,
- skip vectorization if the overhead is too big.
If vectorization is skipped, the IR for the runtime checks is removed again (using SCEVExpanderCleaner, like in LoopIdiomRecognize). The big advantage of this solution is that we can easily compare the cost of the checks and the cost of the loop using the same metric (LV’s cost-model).
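To make the flow above concrete, here is a rough, self-contained toy model in C++. It does not use any LLVM APIs: ToyFunction, SpeculativeIRGuard and tryVectorizeWithChecks are hypothetical names made up for illustration, the guard only loosely mirrors the idea behind SCEVExpanderCleaner, and the profitability condition (checks plus vector loop must be cheaper than the scalar loop) is just an assumed placeholder for whatever threshold the cost-model ends up using.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a function body: just a list of instruction names.
struct ToyFunction {
  std::vector<std::string> Insts;
};

// Hypothetical RAII guard: remembers how many instructions existed before
// speculation started and erases everything appended since then, unless
// markResultUsed() was called.
class SpeculativeIRGuard {
  ToyFunction &F;
  std::size_t SizeBefore;
  bool ResultUsed = false;

public:
  explicit SpeculativeIRGuard(ToyFunction &F)
      : F(F), SizeBefore(F.Insts.size()) {}

  void markResultUsed() { ResultUsed = true; }

  ~SpeculativeIRGuard() {
    // Speculation is only expected to append instructions.
    assert(F.Insts.size() >= SizeBefore);
    if (!ResultUsed)
      F.Insts.resize(SizeBefore); // throw the speculative IR away again
  }
};

// Returns true if the toy loop is "vectorized" with runtime checks.
// All costs are totals in the same (assumed) cost-model units.
bool tryVectorizeWithChecks(ToyFunction &F, unsigned NumChecks,
                            unsigned CostPerCheck, unsigned ScalarLoopCost,
                            unsigned VectorLoopCost) {
  SpeculativeIRGuard Guard(F);

  // Step 1: speculatively generate the runtime checks.
  for (unsigned I = 0; I < NumChecks; ++I)
    F.Insts.push_back("memcheck." + std::to_string(I));

  // Step 2: estimate the overhead of the checks that were generated.
  unsigned CheckCost = NumChecks * CostPerCheck;

  // Step 3: skip vectorization if the overhead is too big (assumed rule).
  if (CheckCost + VectorLoopCost >= ScalarLoopCost)
    return false; // Guard's destructor removes the checks.

  Guard.markResultUsed(); // keep the checks
  return true;
}
```

With a function that starts with one instruction, calling tryVectorizeWithChecks(F, 4, 10, 100, 80) rejects the checks and leaves F unchanged, while tryVectorizeWithChecks(F, 4, 10, 1000, 400) keeps the four generated checks. The point of the guard is that every early exit cleans up automatically, so the function can honestly report that nothing changed.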
This approach of course means we need to make sure we remove the generated IR if it is not needed. We already have most of the necessary tools (like SCEVExpanderCleaner) and verifier support to detect cases where a pass claims to not modify IR, but did make changes.
Given that this might push the envelope of the pattern a bit, I’d like to check if there are any concerns with using this pattern more widely.
ps: If you are curious what the code looks like, please take a look at