confusion w.r.t. scalar evolution and nuw

I've been doing some digging in this area (scev, wrapping arithmetic),
learning as much as I can, and have reached a point where I'm fairly
confused about the semantics of nuw in scalar evolution expressions.
Consider the following program:

define void @foo(i32 %begin) {
entry:
  br label %loop

loop:
  %idx = phi i32 [ %begin, %entry ], [ %idx.dec, %loop ]
  %idx.dec = sub nuw i32 %idx, 1
  %exit.condition = icmp eq i32 %idx.dec, 0
  br i1 %exit.condition, label %loop.exit, label %loop

loop.exit:
  ret void
}

As long as %begin is positive, %idx.dec is not poison and the function
has defined behavior.

When I run opt -analyze -scalar-evolution on the IR, I get

Printing analysis 'Scalar Evolution Analysis' for function 'foo':
Classifying expressions for: @foo
  %idx = phi i32 [ %begin, %entry ], [ %idx.dec, %loop ]
  --> {%begin,+,-1}<nuw><%loop> Exits: 1
  %idx.dec = sub nuw i32 %idx, 1
  --> {(-1 + %begin),+,-1}<nuw><%loop> Exits: 0
Determining loop execution counts for: @foo
Loop %loop: backedge-taken count is (-1 + %begin)
Loop %loop: max backedge-taken count is -1

SCEV has mapped `%idx.dec' to `{(-1 + %begin),+,-1}<nuw>'! This means
SCEV thinks `%idx.dec' will be poison in the very first iteration if
%begin is ult 1.

As an example on how this could blow up, if %begin was constant
(or LLVM was got smarter about ranges someday and learned to handle
non-constant ranges) in ScalarEvolution::getUnsignedRange, the
following code could fire:

  if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(S)) {
    // If there's no unsigned wrap, the value will never be less than its
    // initial value.
    if (AddRec->getNoWrapFlags(SCEV::FlagNUW))
      if (const SCEVConstant *C = dyn_cast<SCEVConstant>(AddRec->getStart()))
        if (!C->getValue()->isZero())
          ConservativeResult =
            ConservativeResult.intersectWith(
              ConstantRange(C->getValue()->getValue(), APInt(BitWidth, 0)));

and conclude that %idx.dec is always UGE %begin. This is a real bug
waiting to happen.

As far as I can tell, in the presence of no-wrap flags, 'sub X Y' is
not the same as 'add X -Y', and subtractions can be expressed as
additions only if they don't have any no-wrap flags on them. While
I have not thought about nsw in depth, but it is possible there are
similar issues with nsw as well.

Is my reasoning broken at some point? What gives?

-- Sanjoy

Seems like it’s a bug.

We are permitted to turn ‘sub nsw i32 %x, 1’ into ‘add nsw i32 %x, -1’

We are not permitted to turn ‘sub nuw i32 %x, 1’ into ‘add nuw i32 %x, -1’

I don’t think we are permitted to keep the nuw in your example. We would need something like SCEVSubRecExpr to be able to notionally represent that.

We are permitted to turn 'sub nsw i32 %x, 1' into 'add nsw i32 %x, -1'

nsw also has the same problem:

sub nsw int_min, int_min is 0 but
add nsw int_min, (-int_min) is poison

-- Sanjoy

> We are permitted to turn 'sub nsw i32 %x, 1' into 'add nsw i32 %x, -1'

nsw also has the same problem:

sub nsw int_min, int_min is 0 but
add nsw int_min, (-int_min) is poison

Right, it doesn't work with all values. I was specifically referring to
the %x and -1 case.

Here is a small program that illustrates the bug:

define void @f(i32* %loc) {
entry:
  br label %loop

loop:
  %idx = phi i32 [ 6, %entry ], [ %idx.dec, %loop ]
  store i32 %idx, i32* %loc
  %idx.dec = sub nuw i32 %idx, 1
  %cond = icmp uge i32 %idx.dec, 5
  br i1 %cond, label %loop, label %exit

exit:
  ret void
}

The store is supposed to happen exactly twice. But opt -indvars turns
it into an infinite loop:

define void @f(i32* %loc) {
entry:
  br label %loop

loop: ; preds = %loop, %entry
  %idx = phi i32 [ 6, %entry ], [ %idx.dec, %loop ]
  store i32 %idx, i32* %loc
  %idx.dec = sub nuw nsw i32 %idx, 1
  br i1 true, label %loop, label %exit

exit: ; preds = %loop
  ret void
}

because %idx.dec, which is {5,+,-1}<nuw><%loop> can never be ult 5.

It looks like this change was incorrect:

@@ -3102,6 +3102,12 @@ const SCEV *ScalarEvolution::createNodeForPHI(PHINode *PN) {
                   if (isKnownPositive(getMinusSCEV(getSCEV(GEP), Ptr)))
                     Flags = setFlags(Flags, SCEV::FlagNUW);
                 }
+ } else if (const SubOperator *OBO =
+ dyn_cast<SubOperator>(BEValueV)) {
+ if (OBO->hasNoUnsignedWrap())
+ Flags = setFlags(Flags, SCEV::FlagNUW);
+ if (OBO->hasNoSignedWrap())
+ Flags = setFlags(Flags, SCEV::FlagNSW);

I missed this during review, possibly because it was a small part of a larger patch. I should have caught it because I'm well aware of exactly the same problem that occurs with IR canonicalization in Reassociate.

To answer your question about general semantics, nsw/nuw flags are fundamentally unsound in SCEV. They should only be used in very specific IR patterns where we can prove safety (phi-add-phi). If those patterns weren't so important, we could just eliminate them and have a clean design. The hardest part about this is developers repeatedly want to improve SCEV by handling more nsw/nuw awareness. Naturally, there's a tendancy to build on the existing logic. Those improvements are likely wrong in subtle ways, and the burden is on a reviewer to spot a problem with it.

To improve nsw/nuw awareness, we need a different approach. I think we should have some IR-level analysis of nsw/nuw to augment trip count computation. At the moment, you're the best person to propose an improved design. I'm looking forward to seeing it :wink:

-Andy

Hi,

I missed this during review, possibly because it was a small part of a
larger patch. I should have caught it because I'm well aware of
exactly the same problem that occurs with IR canonicalization in
Reassociate.

Thanks for the tip. Now that I have a test case, I'll put up a fix
for review soon. Do you think it makes sense to also change
ScalarEvolution::getMinuSCEV to throw away the no-wrap flags? That is
another place where we make the (X - Y) == (X + (-Y)) judgment.

To improve nsw/nuw awareness, we need a different approach. I think we
should have some IR-level analysis of nsw/nuw to augment trip count
computation.

That sounds interesting -- I don't have any well-formed ideas right
now, but I will try to come up with something. One way to go about
this (that I've been thinking about in other contexts) is to annotate
scev expressions with arbitrary (not necessarily constant) ranges.
Since poison is UB only if it is used, we will have to have something
like getSCEVAtScope that gives us an "range-annotate SCEV" that is
valid at a given scope (i.e. given that we are about to execute an
instruction, what is the allowable range for a scev). In this scheme,
a getSCEVAtScope(value = x, scope = (icmp (add nuw x, y) foo)) gives
us an scev for x annotated with the range [0, uint_max -
getSCEVAtScope(y, (icmp (add nuw x, y) foo))). The nice thing is that
once we can represent general range information in the SCEV framework,
we can use it for things other than proving away overflow.

-- Sanjoy