SCEV cannot compute the trip count of Simple loop

Hi All,

I am trying to unroll the below loop, but couldn’t as SCEV returns TripCount as 0.

void foo(int x) {

int p, i = 1;

int mat[6][6][6];

for (p = x+3 ; p<= x+6 ;p++)

mat[p][i] = mat[p][i] + 5;

}

For a quick reference I have added the generated IR compiled with clang using –O3.

Please let me know if this is an known issue in SCEV or I am missing something here ?

; Function Attrs: nounwind readnone uwtable

define void @_Z3fooi(i32 %x) local_unnamed_addr #0 {

entry:

%mat = alloca [6 x [6 x [6 x i32]]], align 16

%0 = bitcast [6 x [6 x [6 x i32]]]* %mat to i8*

call void @llvm.lifetime.start(i64 864, i8* %0) #2

%add = add nsw i32 %x, 3

%add1 = add nsw i32 %x, 6

%idxprom3 = sext i32 %x to i64

%1 = sext i32 %add to i64

%2 = sext i32 %add1 to i64

br label %for.body

for.body: ; preds = %for.body, %entry

%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ %1, %entry ]

%arrayidx5 = getelementptr inbounds [6 x [6 x [6 x i32]]], [6 x [6 x [6 x i32]]]* %mat, i64 0, i64 %idxprom3, i64 %indvars.iv, i64 1

%3 = load i32, i32* %arrayidx5, align 4, !tbaa !1

%add6 = add nsw i32 %3, 5

store i32 %add6, i32* %arrayidx5, align 4, !tbaa !1

%indvars.iv.next = add nsw i64 %indvars.iv, 1

%cmp = icmp slt i64 %indvars.iv, %2

br i1 %cmp, label %for.body, label %for.end

for.end: ; preds = %for.body

call void @llvm.lifetime.end(i64 864, i8* nonnull %0) #2

ret void

}

Thanks,

Deepali

void foo(int x) {

int p, i = 1;

int mat[6][6][6];

for (p = x+3 ; p<= x+6 ;p++)

mat[p][i] = mat[p][i] + 5;

}

When x=0, max(p)=6, which is outside of allocated 3d array, which is UB.

I have modified the example test case for UB error, still it didn’t unroll

void foo(int x) {

int p, i = 1;

int mat[9][9][9];

for (p = (x+1) ; p < (x+3) ;p++)

mat[p-1][i] = mat[p-1][i] + 5;

}

Regard,

Deepali

There is no way to express the trip count in a closed expression.

E.g., if I call foo(MAX_INT - 2), the loop condition will evaluate to false immediately.

Best,
Philip

Are you implying that because of wrapping?
Signed integer wrapping is UB.

Hi Deepali,

SCEV reports the backedge taken count as "((-1 * (sext i32 (3 + %x) to
i64))<nsw> + ((sext i32 (3 + %x) to i64) smax (sext i32 (6 + %x) to
i64)))", so symbolically it does have an answer.

Ideally SCEV should be able to exploit <nsw> on (3 + %x) and (6 + %x)
to fold the expression above to "3", but due to some systemic issues
SCEV can't exploit <nsw> as aggressively as we should.

Without exploiting <nsw> the trip count is 2^32, which does not fit in
an 32 bit unsigned integer. This is why getSmallConstantTripCount
returns 0.

Does this answer your question?

-- Sanjoy

int mat[9][9][9];
for (p = (x+1) ; p < (x+3) ;p++)

mat[p-1][i] = mat[p-1][i] + 5;

}

The trip count of 2 should be valid for x in [0,6]. If SCEV doesn’t catch it with, say GVN and appropriate matching conditions, it could be improved.
CMIIW, I don’t think LLVM goes out of the way to opportunistically version the loop by inserting if check(s) and then unroll it.

-Kevin

int mat[9][9][9];
for (p = (x+1) ; p < (x+3) ;p++)

mat[p-1][i] = mat[p-1][i] + 5;
}
The trip count of 2 should be valid for x in [0,6].

It is not clear to me why the trip count of 2 isn’t always valid.

If SCEV doesn’t catch it with, say GVN and appropriate matching conditions, it could be improved.
CMIIW, I don’t think LLVM goes out of the way to opportunistically version the loop by inserting if check(s) and then unroll it.

We shouldn’t need any runtime check to derive the trip count here.

Hi Mehdi,

Mehdi Amini wrote:
>
>>
>> > int mat[9][9][9];
>> > for (p = (x+1) ; p < (x+3) ;p++)
>> > mat[p-1][i] = mat[p-1][i] + 5;____
>> > }
>> The trip count of 2 should be valid for x in [0,6].
>
> It is not clear to me why the trip count of 2 isn’t *always* valid.

It is always valid to return 2 as the trip count. But SCEV today is
not always able to exploit nsw/nuw due to some systemic issues.

I can go into details about the systemic issues here if you want, but
given that I've already repeated this several times on the mailing
list, I'd rather put that into a blog post to have a link to share
easily. :slight_smile:

-- Sanjoy

I understood that SCEV can’t deduce it.
The point was more “abstracting the current implementation limitation”, there is no fundamental issue with this code.

Assuming compiler assumes program is free of UB, I suppose trip count of 2 is always valid. I misplaced valid to trip count when it should’ve been “program is valid only for x in [0.6]”.

@Sanjoy, what’s the issue with ? Moreover, why can’t SCEV detect that x is loop invariant and just take the diff/incr?

Hi Kevin,

Kevin Choi wrote:
> @Sanjoy, what's the issue with <nsw>?

I'll write up something in a central location and share it.

> Moreover, why can't SCEV detect
> that x is loop invariant and just take the diff/incr?

Detecting the loop invariance of x is not the problem. SCEV currently
stumbles on proving that "x + 1" and "x + 3" don't overflow.

If "x + 1" and "x + 3" cannot be proven to overflow then the loop no
longer has a constant trip count of "2". For instance if "x" is
INT_MAX-1 then the loop will run 0 times.

-- Sanjoy

Hi Kevin,

If "x + 1" and "x + 3" cannot be proven to overflow then the loop no
longer has a constant trip count of "2".

If we apply "Assuming compiler assumes program is free of UB", can above
requirement be relaxed?

-Kevin

Thanks Sanjoy for the clarification. Can you please share the link to blog for systemic issues in SCEV that you were mentioning.

Thanks,
Deepali

Hi Kevin,

Kevin Choi <code.kchoi@gmail.com> writes:

Hi Kevin,

If "x + 1" and "x + 3" cannot be proven to overflow then the loop no
longer has a constant trip count of "2".

Firstly: I made a mistake above. What I meant to say is If "x + 1" and
"x + 3" cannot be proven to not overflow (that is, they may overflow)
then the loop no longer has a constant trip count of "2".

If we apply "Assuming compiler assumes program is free of UB", can
above requirement be relaxed?

Yes, the problem is coming up with a sound way to exploit the assumption
that the program does not have UB.

-- Sanjoy

Hi Deepali,

Rai, Deepali wrote:

> Thanks Sanjoy for the clarification. Can you please share the link
> to blog for systemic issues in SCEV that you were mentioning.

I didn't write a blog post, but I found a previous instance of me
answering the same question on the mailing list:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160613/366023.html

Thanks,
-- Sanjoy