SCEV determines the inner loop induction variable to be loop-invariant at the scope of the outer loop

Dear all,

For the following IR which is essentially a doubly nested loop, if we get the SCEV expression for the inner loop induction variable, i.e., %j.018, at the scope of the outer loop using getSCEVAtScope(), the result is: 9. That is a constant, or loop-invariant.

However, %j.018 does keep changing within the scope of the outer loop since it is the induction variable of the inner loop, so it is not straightforward to me why %j.018 is considered a loop-invariant. Something like “{0,+,1}<%for.body4>” would make more sense to me.

I’m wondering if I can get any comments on that?

Best regards,

Congzhe

%j.018 is variant in the innermost loop (for.body4), but after
existing that loop, it will have the value before leaving the loop. If
you intent to use %j.018 in the innermost loop, you need to call
getSCEVAtScope() with the innermost loop as the scope. getSCEVAtScope
with a scope outside the loop (or NULL) will try to derive the exit
value.

Michael

Hi Michael,

Thanks for the reply! As you’ve seen, my purpose is indeed to use getSCEVAtScope(V, L) with V being an instruction/value in the inner loop and L being the outer loop.

  • I can somewhat see that if V is the induction variable for the inner loop, then getSCEVAtScope(V, L) tries to derive the backedgeTakenCount of the inner loop. Is it always correct?

  • I’m also wondering if V is not the induction variable for the inner loop but some other value inside the inner loop (so this is a more general situation), what would be the expected behavior of getSCEVAtScope(V, L)? It would be ideal if you could let me know the logic in SCEV regarding this situation.

Thanks again,
Congzhe

Hi Congzhe,

Could you clarify the first question? Whose correctness is in question? backedgeTakenCount() is supposed to always be correct of course. That is different from whether it can always compute the backedge taken count.
Maybe not and it’ll tell you.

For the second question: There’s no special distinction about induction variables. The same logic is for them or other variables and the basic question is “Does your value change because of the loop?”

Now, if you want more resources on understanding the general concepts of scalar evolution, then I recommend these two:

  1. http://users.uoa.gr/~sdi1600105/compilers/introduction-to-scalar-evolution.html
  2. https://www.youtube.com/watch?v=AmjliNp0_00

Disclaimer: The first resource is an article of mine. The reason I recommend it is because I created it because I think it is a very quick and very intuitive introduction, that uses a different intuition than I usually see in tutorials (although this intuition is the one it seems that almost all experienced devs use in their minds).
That said, I would definitely recommend watching 2) either you read 1) or not.

Best,
Stefanos

Στις Τετ, 3 Φεβ 2021 στις 10:16 μ.μ., ο/η congzhe cao via llvm-dev <llvm-dev@lists.llvm.org> έγραψε:

Hi Stefanos,

Thanks a lot for providing the resources - they are definitely helpful.

Best regards,
Congzhe

Hi Congzhe,

Glad they helped! Feel free to ask any other questions either here or on Discord.

Best,
Stefanos

Στις Κυρ, 7 Φεβ 2021 στις 10:30 μ.μ., ο/η congzhe cao <congzhecao@gmail.com> έγραψε:

Hi there,

Can I have one further question please? Currently SCEV uses computeLoopDisposition(SCEV *S, Loop *L) to check if a SCEV expression is a loop invariant. The concern here is that it does not handle cmp instructions very well, because cmp instructions are not recognized as a SCEV type - it is merely an “scUnknown” type. For scUnknown, computeLoopDisposition() only checks if L contains the underlying instruction from S and if that is the case, it determines S is not a loop invariant, which may not be very accurate.

However for a cmp instruction, I suspect it is okay to say if both operands are loop-invariants, then this cmp instruction can be determined to be a loop invariant. I added some code in computeLoopDisposition() under the case of scUnknown that executes that logic - if both operands are loop-invariants then this cmp instruction is a loop invariant. This way we can determine some cmp instructions to be loop invariants (whereas previously they were not determined to be loop invariants), and may expose opportunities for optimization. It unfortunately causes a number of regression test failures and benchmark compilation failures. The failures are from induction variable simplification, ScalarEvolution and ScalarEvolutionExpander. I am able to add fixes here and there to avoid those failures, but my concern is that it seems messy (different fixes in a number of places that address problems caused by loop-invariance of cmp instructions).

Would it be preferred to add full support for cmp instructions in SCEV, like adding and supporting an “scCMP” type in SCEVTypes?

Thanks,
Congzhe

An cmp instruction where both arguments are loop-invariant would
normally be hoisted before the loop by the LICM or GVN pass.

Support for scCMP might be useful, although I don't expect many
compare boolean exit values (that are not already hoisted by LICM)
from loops to be constant. However, support for a conditional SCEV
expression (if /select) would be nice.However, extending
ScalarEvolution can be a lot of work.

FileCheck tests are messy to update, no way around that unfortunately.

Michael