SCEV and <nsw>

Alexandre_Isoard · October 26, 2018, 5:41pm

Hello,

I’m running into an issue where SCEV misses explicit nsw flags on the following expressions:

%mm.07 = phi i32 [ 0, %for.body.lr.ph ], [ %add, %for.inc30 ]
→ {0,+,1}<%for.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + %m) LoopDispositions: { %for.body: Computable, %for.body5: Invariant, %for.inc: Invariant }
%add = add nsw i32 %mm.07, 1
→ {1,+,1}<%for.body> U: [1,-2147483648) S: [1,-2147483648) Exits: %m LoopDispositions: { %for.body: Computable, %for.body5: Invariant, %for.inc: Invariant }
%add2 = add nsw i32 %mm.07, 96
→ {96,+,1}<%for.body> U: [96,-2147483553) S: [96,-2147483553) Exits: (95 + %m) LoopDispositions: { %for.body: Computable, %for.body5: Invariant, %for.inc: Invariant }

My problem is with the later one, where the is missing (which cause me problems down the line with gep computation on 64 bit address space).

Any clue as to what could be the source of that disappearance? I tried to reproduce the issue on simple cases but to no avail. I get the following expected result:

%i = phi i32 [ 0, %loop.prehead ], [ %i.next, %loop.body ]
→ {0,+,1}<%loop.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + %x) LoopDispositions: { %loop.body: Computable }
%i.next = add nsw i32 %i, 1
→ {1,+,1}<%loop.body> U: [1,-2147483648) S: [1,-2147483648) Exits: %x LoopDispositions: { %loop.body: Computable }
%i.96 = add nsw i32 %i, 96
→ {96,+,1}<%loop.body> U: [96,-2147483648) S: [96,-2147483648) Exits: (95 + %x) LoopDispositions: { %loop.body: Computable }

Help?

Alexandre_Isoard · October 26, 2018, 6:12pm

I finally managed to reproduce the issue:

define void @bad(i32* %A, i32 %x) {

entry:
%do = icmp slt i32 0, %x
br i1 %do, label %loop.prehead, label %exit

loop.prehead: ; preds = %entry
br label %loop.body

loop.body: ; preds = %loop.body, %loop.prehead
%i = phi i32 [ 0, %loop.prehead ], [ %i.next, %loop.body ]
%i.next = add nsw i32 %i, 1
%i.96 = add nsw i32 %i, 96
%store.addr = getelementptr i32, i32* %A, i32 %i.next
store i32 %i, i32* %store.addr, align 4
%cmp = icmp slt i32 %i.next, %x
br i1 %cmp, label %loop.body, label %loop.exit

loop.exit: ; preds = %loop.body
br label %exit

exit: ; preds = %loop.exit, %entry
ret void
}

; Printing analysis ‘Scalar Evolution Analysis’ for function ‘bad’:
; Classifying expressions for: @bad
; %i = phi i32 [ 0, %loop.prehead ], [ %i.next, %loop.body ]
; → {0,+,1}<%loop.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + %x) LoopDispositions: { %loop.body: Computable }
; %i.next = add nsw i32 %i, 1
; → {1,+,1}<%loop.body> U: [1,-2147483648) S: [1,-2147483648) Exits: %x LoopDispositions: { %loop.body: Computable }
; %i.96 = add nsw i32 %i, 96
; → {96,+,1}<%loop.body> U: [96,-2147483553) S: [96,-2147483553) Exits: (95 + %x) LoopDispositions: { %loop.body: Computable }
; %store.addr = getelementptr i32, i32* %A, i32 %i.next
; → {(8 + %A),+,8}<%loop.body> U: full-set S: full-set Exits: (8 + (8 * (zext i32 (-1 + %x) to i64)) + %A) LoopDispositions: { %loop.body: Computable }
; Determining loop execution counts for: @bad
; Loop %loop.body: backedge-taken count is (-1 + %x)
; Loop %loop.body: max backedge-taken count is 2147483646
; Loop %loop.body: Predicated backedge-taken count is (-1 + %x)
; Predicates:
;
; Loop %loop.body: Trip multiple is 1

define void @good(i32* %A, i32 %x) {
entry:
%do = icmp slt i32 0, %x
br i1 %do, label %loop.prehead, label %exit

loop.prehead: ; preds = %entry
br label %loop.body

loop.body: ; preds = %loop.body, %loop.prehead
%i = phi i32 [ 0, %loop.prehead ], [ %i.next, %loop.body ]
%i.next = add nsw i32 %i, 1
%i.96 = add nsw i32 %i, 96
%store.addr = getelementptr i32, i32* %A, i32 %i.96
store i32 %i, i32* %store.addr, align 4
%cmp = icmp slt i32 %i.next, %x
br i1 %cmp, label %loop.body, label %loop.exit

loop.exit: ; preds = %loop.body
br label %exit

exit: ; preds = %loop.exit, %entry
ret void
}

; Printing analysis ‘Scalar Evolution Analysis’ for function ‘good’:
; Classifying expressions for: @good
; %i = phi i32 [ 0, %loop.prehead ], [ %i.next, %loop.body ]
; → {0,+,1}<%loop.body> U: [0,2147483647) S: [0,2147483647) Exits: (-1 + %x) LoopDispositions: { %loop.body: Computable }
; %i.next = add nsw i32 %i, 1
; → {1,+,1}<%loop.body> U: [1,-2147483648) S: [1,-2147483648) Exits: %x LoopDispositions: { %loop.body: Computable }
; %i.96 = add nsw i32 %i, 96
; → {96,+,1}<%loop.body> U: [96,-2147483648) S: [96,-2147483648) Exits: (95 + %x) LoopDispositions: { %loop.body: Computable }
; %store.addr = getelementptr i32, i32* %A, i32 %i.96
; → {(768 + %A),+,8}<%loop.body> U: full-set S: full-set Exits: (768 + (8 * (zext i32 (-1 + %x) to i64)) + %A) LoopDispositions: { %loop.body: Computable }
; Determining loop execution counts for: @good
; Loop %loop.body: backedge-taken count is (-1 + %x)
; Loop %loop.body: max backedge-taken count is 2147483646
; Loop %loop.body: Predicated backedge-taken count is (-1 + %x)
; Predicates:
;
; Loop %loop.body: Trip multiple is 1

Notice the only difference is on the gep (in bold). Can somebody shed some light?

I attached the test file.

test.ll (3.87 KB)

Sanjoy_Das1 · October 30, 2018, 5:52am

The problem here is that SCEV expressions are keyed on their operands
but not on the no-wrap flags. So if we have

  %i = phi i32 [ 0, %loop.prehead ], [ %i.next, %loop.body ]
  %i.96 = add nsw i32 %i, 96
  %i.96.wrapping = add i32 %i, 96

then both %i.96 and %i.96.wrapping will be mapped to the SCEV*. Ergo
that common SCEV* cannot have the NSW bit set.

Now if the IR is actually

  %i = phi i32 [ 0, %loop.prehead ], [ %i.next, %loop.body ]
  %i.96 = add nsw i32 %i, 96
  %i.96.wrapping = add i32 %i, 96
  %store.addr = getelementptr i32, i32* %A, i32 %i.96
  store i32 %i, i32* %store.addr, align 4

then if %i.96 overflows the program has undefined behavior since we'll
try to store *to* poison. So in this case we can map both %i.96 and
%i.96.wrapping to a NSW SCEV*

Does that help?
-- Sanjoy

Alexandre_Isoard · October 31, 2018, 10:25pm

Thanks.

So if I understand correctly, this works because we deduce a property that is guaranteed by the fact that else a poison value (thanks to ) would become observable (thanks to the store)? In the code base, where would that info taken into consideration?

My issue is that I have a custom intrinsic that take an address as an argument (and I consider doing so guarantee the address is valid, as I actually preload it). How would I add semantic so that SCEV do the same kind of reasoning as for a store?

I don’t see where that dark magic is performed.

Sanjoy_Das1 · November 1, 2018, 2:27am

The magic happens in ScalarEvolution::isSCEVExprNeverPoison. You
should be able to add your special intrinsic to
llvm::getGuaranteedNonFullPoisonOp for SCEV to pick it up.

-- Sanjoy

Alexandre_Isoard · November 1, 2018, 2:36am

Thanks a lot!

Topic		Replies	Views
SCEV and GEP NSW flag LLVM Dev List Archives	4	90	November 9, 2013
Improving SCEV's behavior around IR level no-wrap flags LLVM Dev List Archives	13	106	September 24, 2016
Correctness of nsw/nuw and sext/zext in SCEV LLVM Dev List Archives	0	65	January 24, 2020
sign extensions, SCEVs, and wrap flags LLVM Dev List Archives	7	121	September 20, 2012
SCEV getMulExpr() not propagating Wrap flags LLVM Dev List Archives	4	87	November 16, 2013

SCEV and <nsw>

Related Topics