Hi,
I’m investigating the strange behaviour of unrolling loops in the form of for (i = x - Left; i <= x + Right; ++i)
. In particular, this example: Compiler Explorer can’t be fully unrolled on 3 iterations, but this one: Compiler Explorer (the only difference is a latch condition i <= x + 2) is unrolled on 4 iterations. This templated test: Compiler Explorer covers some possbile combinations of Left and Right constants (Left <= Right), and the results are in the table:
-2 | -1 | 0 | 1 | 2 | |
---|---|---|---|---|---|
-2 | Unrolled | Unrolled | Unrolled | Not unrolled | Unrolled |
-1 | - | Unrolled | Unrolled | Not unrolled | Unrolled |
0 | - | - | Unrolled | Unrolled | Unrolled* |
1 | - | - | - | Unrolled | Unrolled |
2 | - | - | - | - | Unrolled |
Unrolled* – for Left=0, Right>=2 the loop is unrolled, but the code contains excess run-time overflow check (icmp slt i32 %0, 2147483646); this is related to IndVarSimplify pass and can be fixed by -mllvm -disable-lftr=true
(I will not discuss such case further in this topic, but maybe it should be fixed at some point, too)
The problem with unrolling in case of Left<=-1, Right=1 is that before LoopFullUnrollPass InstCombine changes the latch condition: Compiler Explorer. Let’s consider 2 examples: condition (i + 1 <= x + 1) and (i + 1 <= x + 2):
Case 1:
icmp sle (add nsw i, 1), (add nsw x, 1)
==>
icmp slt i, (add nsw x, 1)
==>
icmp sle i, x
Case 2:
icmp sle (add nsw i, 1), (add nsw x, 2)
==>
icmp slt i, (add nsw x, 2)
==> (not done in practice, because not profitable)
icmp sle i, (add nsw x, 1)
When we have a latch condition in the form of icmp slt i, (add nsw x, 2)
, the loop can be prooven to be finite (due to strict comparision predicate), so the loop can be unrolled (trip count is known). But in the first case, we actually can’t make such assumption after InstCombine transform, because loop with condition (i <= x) can be infinite when x=INT_MAX. This shows the irreversibility of this InstCombine transform: in the original loop, add nsw x, 1
will produce poison for x=INT_MAX, and all comparision will be poisonous, too, so we will have undefined behaviour in this case. But after the transformation, the loop behaviour is more defined: it will be infinite in case of x=INT_MAX, and it blocks full unrolling here. So I have a couple of questions here:
- Since this transformation is irreversible (alive proof: Compiler Explorer), what’s the correct way to deal with it? I believe it can’t be removed because in general it looks reasonable, and InstCombineCompares have multiple of such irreversible transforms.
- Would it be more correct to preserve information about loop finity beforehand? (e.g. as some loop metadata).