I’ve run into what appears to be a bug in ScalarEvolution, but I’m not sure if it is instead caused by me missing an implicit assumption between LSR and SCEV.
This issue is caused by the change D18001, which is an attempt to increase SCEV-inserted instruction re-use by picking a more canonical insertion position in the case where a new insert position block is not found.
The problem I have run into with this change is that it causes an insert position instruction to be chosen which is later hoisted into a different block by SCEVExpander::hoistIVInc(). This in turn makes the SCEVExpander Builder InsertPt inconsistent (the instruction and block end up not matching), which leads to an invalid instruction being inserted (with its parent != its containing block).
Does anyone more familiar with the design of the LSR/SCEV interaction have an opinion on what the right approach to fixing this is? The two approaches I’ve considered are: 1) avoid choosing an insert point that SCEV will move or 2) make SCEVExpander handle the case where it moves an instruction that is an insert point. I’ve attempted (2), but it gets ugly pretty fast since there are quite a few different insert points and builders saved/restored. (1) also seems tricky, since it essentially means LSR will have to predict what SCEVExpander is going to do and work around this case.
Any thoughts would be most appreciated,
How was it handled before your change? Were we just lucky that insert points always matched?
Would it be possible to move all the logic for finding a proper insertion into SCEVExpander? I think that is ultimately the best solution for this, but I have to admit I looked into this code a while ago.
I neglected to mention one other important detail. The larger goal I’m working towards is getting loop invariant outputs sunk outside of loops after LSR. Without doing the sinking, this shortcoming in LSR/SCEV often gets fixed by later passes (e.g. ISel) combining the redundant instructions when they are in the same block. This isn’t always the case though, as I do see some scattered code improvements with the original patch in isolation.
It seems like it might be possible to move all of the insertion point calculation into SCEV since LSR doesn’t seem to use the insert point other than calculating the initial value and then passing it in/out of Rewriter.exandCodeFor(), though I would appreciate hearing from someone more experienced in the interaction between these two passes before going that route.
Firstly, I think it will be great if you have a small reproducer of
this issue (which I can make fail after re-applying D18001 to ToT).
I’ve run into what appears to be a bug in ScalarEvolution, but I’m not sure
if it is instead caused by me missing an implicit assumption between LSR and
This issue is caused by the change D18001, which is an attempt to increase
SCEV-inserted instruction re-use by picking a more canonical insertion
position in the case where a new insert position block is not found.
The problem I have run into with this change is that it causes an insert
position instruction to be chosen which is later hoisted into a different
block by SCEVExpander::hoistIVInc(). This in turn makes the SCEVExpander
When is the insert position moved? IOW, what is the "timeline" of
There are two calls to hoistIVInc in SCEVExpander, in
getAddRecExprPHILiterally and in replaceCongruentIVs -- which one is
the problematic case?
Builder InsertPt inconsistent (the instruction and block end up not
matching), which leads to an invalid instruction being inserted (with its
parent != its containing block).
Thanks for your interest. I've attached a reduced reproducer of this issue. You can trigger it by running the following opt command after applying my recently updated patch D18480 (note the change in review number since Phabricator wouldn't let me re-open the original review):
$ opt -loop-reduce lsr-scev-hoist-insertpt.ll
Below is a call stack at the point where the insertion point is hoisted into a different block. The line numbers are a little off because of some debugging code I've added, but they should be pretty close.
The IR instruction that gets picked as an insertion point and gets hoisted is:
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
The BaseReg SCEV being expanded when this happens is:
2 in llvm::SCEVExpander::hoistIVInc of llvm/lib/Analysis/ScalarEvolutionExpander.cpp:940
3 in llvm::SCEVExpander::getAddRecExprPHILiterally of llvm/lib/Analysis/ScalarEvolutionExpander.cpp:1120
4 in llvm::SCEVExpander::expandAddRecExprLiterally of llvm/lib/Analysis/ScalarEvolutionExpander.cpp:1294
5 in llvm::SCEVExpander::visitAddRecExpr of llvm/lib/Analysis/ScalarEvolutionExpander.cpp:1382
6 in llvm::SCEVVisitor<llvm::SCEVExpander, llvm::Value*>::visit of llvm/include/llvm/Analysis/ScalarEvolutionExpressions.h:482
7 in llvm::SCEVExpander::expand of llvm/lib/Analysis/ScalarEvolutionExpander.cpp:1687
8 in llvm::SCEVExpander::expandCodeFor of llvm/lib/Analysis/ScalarEvolutionExpander.cpp:1599
9 in llvm::SCEVExpander::expandCodeFor of llvm/lib/Analysis/ScalarEvolutionExpander.cpp:1594
10 in (anonymous namespace)::LSRInstance::Expand of llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:4498
11 in (anonymous namespace)::LSRInstance::Rewrite of llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:4725
12 in (anonymous namespace)::LSRInstance::ImplementSolution of llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:4775
13 in (anonymous namespace)::LSRInstance::LSRInstance of llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:4897
14 in (anonymous namespace)::LoopStrengthReduce::runOnLoop of llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp:5022
lsr-scev-hoist-insertpt.ll (1.54 KB)
Thanks for the detailed reproduction instruction -- they were very
helpful, and I can now see the bug.
Generally, I think (2) is the better solution. It should be up to
SCEVExpander to keep its state consistent. The fact that we're trying
to optimize instructions as we insert them makes that trickier, but I
still think that is the right invariant to have.
I don't see why (2) would be particularly ugly though (perhaps I'm not
being sufficiently imaginative ). I think
IRBuilder::SetInsertPoint(Instruction *) is safe, since it always sets
BB and IP in sync with each other. So I think we basically need to
sync things at restoreIP() (which is called only once in the file) and
at places that potentially move insertion points (like hoistIVInc)
using something like
"Builder.SetInsertPoint(&*Builder.GetInsertPoint())". Are there any
other places that need to be fixed too?
We should also add a LSRInserter that asserts that the basic block
we're inserting into is the same as the insertee's parent.
I know it's been a while, but I finally got back to this and have made an attempt at fixing this issue w/ SCEV using approach (2) as previously discussed:
Let me know what you think when you get a chance.