Hi,
It looks like ScalarEvolution bails out of loop backedge computation if it cannot prove the IV stride as either positive or negative (based on loop control condition). I think this logic can be refined for signed IVs.
Consider this simple loop-
  void foo(int *A, int n, int s) {
    int i;
    for(i=0; i<n; i += s) {
      A[i]++;
    }
  }
The IV of this loop has this SCEV form-
{0,+,%s}<nsw><%for.body>
Can someone please clarify why it is not OK to deduce that the stride is positive? The IV carries the NSW flag, so it cannot wrap in the signed sense; if it did, the program would have undefined behavior.
Thanks,
Pankaj
Hi Pankaj,
> It looks like ScalarEvolution bails out of loop backedge computation if
> it cannot prove the IV stride as either positive or negative (based on
> loop control condition). I think this logic can be refined for signed IVs.
>
> Consider this simple loop-
>
> void foo(int *A, int n, int s) {
>
> int i;
>
> for(i=0; i<n; i += s) {
>
> A[i]++;
>
> }
>
> }
>
> The IV of this loop has this SCEV form-
>
> {0,+,%s}<nsw><%for.body>
This looks valid -- we already do things like this for
for (i = A; i != B; i += 5)
...
and compute the backedge taken count as "(B - A) / 5" (roughly)
since if (B - A) is not divisible by 5 then we have UB due to
overflow. We just have to be careful around cases like:
  for(i = 0; i < 60; i += s) {
    may_exit();
  }
"s" can be (say) -3 and the loop can take 160 backedges and then
"exit(0)", avoiding the undefined behavior due to underflow. "s" can
also be zero, in which case the loop can potentially take an infinite
number of backedges.
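The negative and zero stride cases above can be sketched as follows. This is only an illustration: should_exit(), count_backedges() and the limit parameter are made-up names, and the early return stands in for the "exit(0)" that may_exit() would perform. The loop is written in rotated form so that the counter matches the number of backedges actually taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for may_exit(): asks to leave the loop once
   `limit` backedges have been taken. The original would call exit(0). */
static bool should_exit(int backedges_taken, int limit) {
    return backedges_taken >= limit;
}

/* Rotated form of "for (i = 0; i < 60; i += s) { may_exit(); }".
   Returns the number of backedges taken before the loop is left. */
int count_backedges(int s, int limit) {
    assert(limit >= 0);
    int backedges = 0;
    int i = 0;                      /* 0 < 60, so the body runs at least once */
    for (;;) {
        if (should_exit(backedges, limit))
            return backedges;       /* left through the call, no overflow yet */
        i += s;                     /* the NSW add */
        if (!(i < 60))
            return backedges;       /* left through the loop condition */
        backedges++;                /* branch back to the loop header */
    }
}
```

With s = -3 and a limit of 160 the loop leaves through the call long before "i" gets anywhere near INT_MIN, so no undefined behavior is reached and nothing can be concluded about the sign of "s".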
However, in the example you gave (written in LLVM's canonical rotated
form):
  if (0 < N) {
    i = 0;
    do {
      a[i]++;
      i += s; // NSW add
    } while (i < N);
  }
For any s <= 0 we have undefined behavior, so it is sound to assume
s > 0.
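As a sanity check of the s > 0 case, the rotated loop can be simulated and compared against a closed form. This is only an arithmetic illustration with made-up helper names, not the expression ScalarEvolution builds. It assumes n > 0 (the guard held) and s > 0, and it uses a wider type for "i" so that the check itself never overflows.

```c
#include <assert.h>

/* Simulates the rotated loop and counts how often the backedge is taken.
   Assumes n > 0 (the guard "0 < N" held) and s > 0. */
long long backedges_simulated(int n, int s) {
    assert(n > 0 && s > 0);
    long long i = 0, backedges = 0;   /* wide type: no overflow in the check */
    for (;;) {
        i += s;
        if (!(i < n))
            return backedges;         /* loop condition failed, loop exits */
        backedges++;                  /* branch back to the loop header */
    }
}

/* Closed form under the same assumptions: floor((n - 1) / s). */
long long backedges_closed_form(int n, int s) {
    assert(n > 0 && s > 0);
    return ((long long)n - 1) / s;
}
```

For example, n = 10 and s = 3 visits i = 0, 3, 6, 9 in the body, which is three backedges, and (10 - 1) / 3 is also 3.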
Do you want to take a crack at fixing this? I'm traveling till the
10th of July, but I can review your change once I'm back.
-- Sanjoy
Hi Sanjoy,
Thank you for the clarification!
I will give it a try and put up the changes for review.
-Pankaj
Hi Sanjoy,
The following trivial change in howManyLessThans() seems to resolve the problem with the original loop-
   // Avoid negative or zero stride values
-  if (!isKnownPositive(Stride))
+  if (!NoWrap && !isKnownPositive(Stride))
     return getCouldNotCompute();
However, I was experimenting with a few variants of the loop I posted and they seem to have different issues which may require more involved fixes. I am listing them here-
1) Changing the loop control condition from '<' to '<='.
The canonical form of this loop is something like this-
  if (0 < N) {
    i = 0;
    do {
      a[i]++;
      i += s; // NSW add
    } while (! (i > N)); // sgt compare
  }
The 'sgt' compare is inverted to 'sle' for analysis. ScalarEvolution isn't really expecting 'sle' in canonicalized loops, so it falls back to brute-force exit count computation using computeExitCountExhaustively(), which does not succeed either. This looks like a canonicalization issue.
2) Variants with '>' and '>='. For example-
  for(i=n; i>=0; i-=s) {
    A[i]++;
  }
In this case the SCEV form of IV does not have 'nsw' flag-
{%n,+,(-1 * %s)}<%for.body>
For now, I can submit a patch which fixes the issue with the original loop.
Please let me know how to proceed.
Thanks,
Pankaj
Hi Mehdi,
I actually wanted to know how to proceed with fixing all the issues I found.
I realize now that it wasn’t clear from the wording :)
Thanks for the links though!
-Pankaj