inconsistent wording in the LangRef regarding "shl nsw"

The LangRef says this for left shifts:

"If the nsw keyword is present, then the shift produces a poison value
if it shifts out any bits that disagree with the resultant sign bit."
... (1)

followed by

"As such, NUW/NSW have the same semantics as they would if the shift
were expressed as a mul instruction with the same nsw/nuw bits in (mul
%op1, (shl 1, %op2))." ... (2)

But by (1) "shl i8 1, i8 7" sign overflows (since it shifts out only
zeros, but the result has the sign bit set) but "mul i8 1, i8 -128"
does not sign overflow (by the usual definition of sign-overflow), so
this violates (2).

InstCombine already has a check for this edge-case when folding
multiplies into left shifts:

...
        if (I.hasNoUnsignedWrap())
          Shl->setHasNoUnsignedWrap();
        if (I.hasNoSignedWrap() && *** NewCst->isNotMinSignedValue() ***)
          Shl->setHasNoSignedWrap();
...

(though the check is a bit weird -- NewCst is Log2(IVal) so I think it
cannot ever be INT_MIN, and I suspect that the check is supposed to be
"IVal->isNotMinSignedValue()" or equivalent.)

Should one of the two clauses be removed from the LangRef? I'd prefer
keeping (2) and removing (1) unless it inhibits some important
optimization, solely because (2) is easier to reason in.

Note1: neither clause is stronger than the other -- "shl i8 -1, 7"
sign-overflows by (2) but not by (1).

Note2: there may be similar issues with nuw; I have not investigated that yet.

-- Sanjoy

The LangRef says this for left shifts:

"If the nsw keyword is present, then the shift produces a poison value
if it shifts out any bits that disagree with the resultant sign bit."
... (1)

followed by

"As such, NUW/NSW have the same semantics as they would if the shift
were expressed as a mul instruction with the same nsw/nuw bits in (mul
%op1, (shl 1, %op2))." ... (2)

But by (1) "shl i8 1, i8 7" sign overflows (since it shifts out only
zeros, but the result has the sign bit set) but "mul i8 1, i8 -128"
does not sign overflow (by the usual definition of sign-overflow), so
this violates (2).

InstCombine already has a check for this edge-case when folding
multiplies into left shifts:

...
        if (I.hasNoUnsignedWrap())
          Shl->setHasNoUnsignedWrap();
        if (I.hasNoSignedWrap() && *** NewCst->isNotMinSignedValue() ***)
          Shl->setHasNoSignedWrap();
...

(though the check is a bit weird -- NewCst is Log2(IVal) so I think it
cannot ever be INT_MIN, and I suspect that the check is supposed to be
"IVal->isNotMinSignedValue()" or equivalent.)

Yes, it looks like that check is a bit off. I think this does the trick
though:
diff --git a/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
b/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
index 35513f1..a554e9f 100644
--- a/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -217,12 +217,16 @@ Instruction *InstCombiner::visitMul(BinaryOperator
&I) {
         NewCst = getLogBase2Vector(CV);

       if (NewCst) {
+ unsigned Width = NewCst->getType()->getPrimitiveSizeInBits();
         BinaryOperator *Shl = BinaryOperator::CreateShl(NewOp, NewCst);

         if (I.hasNoUnsignedWrap())
           Shl->setHasNoUnsignedWrap();
- if (I.hasNoSignedWrap() && NewCst->isNotMinSignedValue())
- Shl->setHasNoSignedWrap();
+ if (I.hasNoSignedWrap()) {
+ uint64_t V;
+ if (match(NewCst, m_ConstantInt(V)) && V != Width - 1)
+ Shl->setHasNoSignedWrap();
+ }

         return Shl;
       }

Should one of the two clauses be removed from the LangRef? I'd prefer
keeping (2) and removing (1) unless it inhibits some important
optimization, solely because (2) is easier to reason in.

C11 seems to say that "1 << 31" results in overflow but C++11 disagrees
because it only requires the result to be representable in the
corresponding *unsigned* type of the result type. Annoying.

If I had to guess, the intent of (1) was to make "1 << 31" UB.

I'm of the opinion that the C++11 rules are more sane, "1 << 31" should not
result in overflow just like "1 * INT_MIN" doesn't. We should try to
switch to (2).

I wouldn't be surprised if InstSimplify were relying on (1) to implement
some of its optimizations:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?revision=233938&view=markup#l2298

Note1: neither clause is stronger than the other -- "shl i8 -1, 7"
sign-overflows by (2) but not by (1).

Both C and C++ would say that this is UB. Again, (2) sounds better.

Note2: there may be similar issues with nuw; I have not investigated that
yet.

FWIW, I don't see any issues with the nuw side.

I wouldn't be surprised if InstSimplify were relying on (1) to implement
some of its optimizations:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?revision=233938&view=markup#l2298

Yup! It looks like LLVM really implements (1) and (2) is just a
misleading anecdote. So I'll change my vote from "remove (1)" to
"remove (2)". :slight_smile:

-- Sanjoy

> I wouldn't be surprised if InstSimplify were relying on (1) to implement
> some of its optimizations:
>
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/InstructionSimplify.cpp?revision=233938&view=markup#l2298

Yup! It looks like LLVM really implements (1) and (2) is just a
misleading anecdote. So I'll change my vote from "remove (1)" to
"remove (2)". :slight_smile:

I wrote those optimizations with an understanding that (1) was the one true
way. Would you be apposed to switching to (2)? I have examples where
implementing (1) creates real regressions where (2) does not.

I wrote those optimizations with an understanding that (1) was the one true
way. Would you be apposed to switching to (2)? I have examples where
implementing (1) creates real regressions where (2) does not.

I don't have fundamental reasons to favor (1) over (2). I assumed
finding and fixing all the places within LLVM that implement and
assume (1) will be difficult and bug-prone, and therefore sticking
with (1) is the more pragmatic option. I do think (2) is easier to
reason in, so if you think switching LLVM to consistently use (2) is
doable then I too will prefer (2).

In any case we should let this thread linger for a while before
changing anything, in case there are others with differing opinions.

-- Sanjoy