RFC: Proposal for Poison Semantics

Hello,

What follows is my attempt to describe how poison works. Let me know what you think.

The last time we discussed this, we were considering poison a property of
an individual bit much like undef is a property of an indivdual bit. Are
you no longer proposing those semantics? That is, while %or's second bit is
clearly not poison, is the first bit (or LSB) poison?

Hi David,

I spent some time thinking about poison semantics this way, but here
is where I always get stuck:

Consider the IR fragment

  %x = zext i32 %maybe_poison to i64
  %y = lshr i64 %x 32
  %ptr = gep %global, %y
  store 42 to %ptr

If %maybe_poison is poison, then is %y poison? For all i32 values of
%maybe_poison, %y is i64 0, so in some sense you can determine the
value %y without looking at %x. Given that, the store in the above
fragment is not undefined behavior even if %maybe_poison is poison.
However, this means if "%maybe_poison" is "add nuw %m, %n" it cannot
be optimized to "add nuw (zext %m) (zext %n)" since that will change
program behavior in an otherwise well-defined program.

One way out of this is to say that zext and sext of a value that is
dependent on poison is poison. We'll have to do something similar for
load shearing.

The above "problem" can also be seen in

### Does comparing a poison value result in poison?

It depends. If the comparison couldn't solely be determined by
looking at the other operand, the result is poison.

This means we cannot do the C-style optimization "int a = ...; a <
(a + 1)" ==> "true". In the case "a + 1" overflows, a is
INT_SIGNED_MAX. But if a is INT_SIGNED_MAX, "a < X" is always false,
for all values of X. So "a < a + 1" is defined; and it is true for "a
!= INT_SIGNED_MAX" but false for "a == INT_SIGNED_MAX". Hence the
expression cannot be folded to true.

I think the reason why poison is hard to formalize is that it
fundamentally tries to give an N bit value behavior that cannot be
justified by _any_ N bit value. It "breaks" llvm's type system.

-- Sanjoy

Sorry this wasn't clear. My intent was to say that bit 1 was set but that
the other bits were poison. With this in mind, it is legal to throw the
poison away if you don't want to keep the or instruction if you replace it
with a value which has bit 1 set (these semantics are just like undef).

Hi David,

I spent some time thinking about poison semantics this way, but here
is where I always get stuck:

Consider the IR fragment

  %x = zext i32 %maybe_poison to i64
  %y = lshr i64 %x 32
  %ptr = gep %global, %y
  store 42 to %ptr

If %maybe_poison is poison, then is %y poison? For all i32 values of
%maybe_poison, %y is i64 0, so in some sense you can determine the
value %y without looking at %x.

I agree, this makes sense.

Given that, the store in the above
fragment is not undefined behavior even if %maybe_poison is poison.
However, this means if "%maybe_poison" is "add nuw %m, %n" it cannot
be optimized to "add nuw (zext %m) (zext %n)" since that will change
program behavior in an otherwise well-defined program.

Hmm, I'm not so sure this is right.

Are we talking about transforming:
%add = add nuw i32 %x, %y
%res = zext i32 %add to i64

to:
%z1 = zext i32 %x to i64
%z2 = zext i32 %y to i64
%res = add nuw i64 %z1, %z2

?

This is OK because performing a zext by that many bits cannot result in a
NUW violation.

One way out of this is to say that zext and sext of a value that is
dependent on poison is poison. We'll have to do something similar for
load shearing.

sext must be dependent on the underlying value because it splats the sign
bit.

The above "problem" can also be seen in

> ### Does comparing a poison value result in poison?
>
> It depends. If the comparison couldn't solely be determined by
> looking at the other operand, the result is poison.

This means we cannot do the C-style optimization "int a = ...; a <
(a + 1)" ==> "true". In the case "a + 1" overflows, a is
INT_SIGNED_MAX. But if a is INT_SIGNED_MAX, "a < X" is always false,
for all values of X. So "a < a + 1" is defined; and it is true for "a
!= INT_SIGNED_MAX" but false for "a == INT_SIGNED_MAX". Hence the
expression cannot be folded to true.

This part sounds tricky, I'm not immediately sure what to do.

However, this means if "%maybe_poison" is "add nuw %m, %n" it cannot
be optimized to "add nuw (zext %m) (zext %n)" since that will change

I should have been clearer here: "it" == "%x" -- IOW you cannot the
usual zext of sum == sum of zexts transform.

This means we cannot do the C-style optimization "int a = ...; a <
(a + 1)" ==> "true". In the case "a + 1" overflows, a is
INT_SIGNED_MAX. But if a is INT_SIGNED_MAX, "a < X" is always false,

And also a slight clarification here -- "a + 1" sign-overflows if and
only if a is INT_SIGNED_MAX. < is signed less than. And if a is not
poison, "a < a + 1" is not poison because precisely in the case where
"a + 1" is poison, we can determine the result of the compare without
looking at "a + 1".

Hi David,

I spent some time thinking about poison semantics this way, but here
is where I always get stuck:

Consider the IR fragment

  %x = zext i32 %maybe_poison to i64
  %y = lshr i64 %x 32
  %ptr = gep %global, %y
  store 42 to %ptr

If %maybe_poison is poison, then is %y poison? For all i32 values of
%maybe_poison, %y is i64 0, so in some sense you can determine the
value %y without looking at %x.

I agree, this makes sense.

Given that, the store in the above
fragment is not undefined behavior even if %maybe_poison is poison.
However, this means if "%maybe_poison" is "add nuw %m, %n" it cannot
be optimized to "add nuw (zext %m) (zext %n)" since that will change
program behavior in an otherwise well-defined program.

Hmm, I'm not so sure this is right.

Are we talking about transforming:
%add = add nuw i32 %x, %y
%res = zext i32 %add to i64

to:
%z1 = zext i32 %x to i64
%z2 = zext i32 %y to i64
%res = add nuw i64 %z1, %z2

?

This is OK because performing a zext by that many bits cannot result in a
NUW violation.

No, I'm talking about "zext(add nuw X Y)" ==> "add nuw (zext X) (zext
Y)". In the example, replacing %x, a zext of an nuw add with an nuw
add of zexts changes the behavior of a well-defined program.

sext must be dependent on the underlying value because it splats the sign
bit.

Right, which is why I initially chose zext. :slight_smile:

But with sexts with you a similar problem:

  %t = add nsw <known positive>, <known positive>
  %t1 = and %t, 0x7fff...
  %t2 = sext %t1
  %t3 = lshr %t2, <width of typeof %t>
  store_to (gep %global, %t2)

Now %t3 is always zero and this program is well-defined if even if %t
is poison. However, I cannot reason "%t never overflows, so it is
always >0 and hence %t1 == %t" since substituting %t1 with %t will
change the meaning of a well-defined program.

-- Sanjoy

>>
>> Hi David,
>>
>> I spent some time thinking about poison semantics this way, but here
>> is where I always get stuck:
>>
>> Consider the IR fragment
>>
>> %x = zext i32 %maybe_poison to i64
>> %y = lshr i64 %x 32
>> %ptr = gep %global, %y
>> store 42 to %ptr
>>
>> If %maybe_poison is poison, then is %y poison? For all i32 values of
>> %maybe_poison, %y is i64 0, so in some sense you can determine the
>> value %y without looking at %x.
>
>
> I agree, this makes sense.
>
>>
>> Given that, the store in the above
>> fragment is not undefined behavior even if %maybe_poison is poison.
>> However, this means if "%maybe_poison" is "add nuw %m, %n" it cannot
>> be optimized to "add nuw (zext %m) (zext %n)" since that will change
>> program behavior in an otherwise well-defined program.
>
>
> Hmm, I'm not so sure this is right.
>
> Are we talking about transforming:
> %add = add nuw i32 %x, %y
> %res = zext i32 %add to i64
>
> to:
> %z1 = zext i32 %x to i64
> %z2 = zext i32 %y to i64
> %res = add nuw i64 %z1, %z2
>
> ?
>
> This is OK because performing a zext by that many bits cannot result in a
> NUW violation.

No, I'm talking about "zext(add nuw X Y)" ==> "add nuw (zext X) (zext
Y)". In the example, replacing %x, a zext of an nuw add with an nuw
add of zexts changes the behavior of a well-defined program.

Correct me if I am wrong but we are talking about transforming:
   %maybe_poison = add nuw i32 %a, %b
   %x = zext i32 %maybe_poison to i64
   %y = lshr i64 %x 32

To:
   %za = zext i32 %a to i64
   %zb = zext i32 %b to i64
   %x = add nuw i64 %za, %zb
   %y = lshr i64 %x 32

?

If so, this seems fine in the model given by the RFC.

In the before case:
%maybe_poison doesn't infect the upper bits of %x with its poison which
allows %y to have the well defined result 0.

In the after case:
%za and %zb will have their top 32-bits filled with zeros making it
impossible for nuw to be violated for %x, %y would have the well defined
result 0.

Correct me if I am wrong but we are talking about transforming:
   %maybe_poison = add nuw i32 %a, %b
   %x = zext i32 %maybe_poison to i64
   %y = lshr i64 %x 32

To:
   %za = zext i32 %a to i64
   %zb = zext i32 %b to i64
   %x = add nuw i64 %za, %zb
   %y = lshr i64 %x 32

?

If so, this seems fine in the model given by the RFC.

In the before case:
%maybe_poison doesn't infect the upper bits of %x with its poison which
allows %y to have the well defined result 0.

In the after case:
%za and %zb will have their top 32-bits filled with zeros making it
impossible for nuw to be violated for %x, %y would have the well defined
result 0.

If both %a and %b are 2^32-1, won't %y be 0 in the first case and 1 in
the second? The reason why we can get away with this _now_ is because
the first program has UB and hence anything can happen, including
storing to &a[1] instead &a[0]. But with the semantics in the RFC,
the first program is well-defined and is storing to &a[0] so we cannot
change it to store to &a[1] instead.

-- Sanjoy

Ah, yes. You are right, we cannot always assume that %y would be zero in
the second case.
This wouldn't be the first time we've lost information that we could use to
optimize a program by transforming it.

Do you think this result would be problematic? It seems consistent with
the RFC and LLVM's current behavior.

Ah, yes. You are right, we cannot always assume that %y would be zero in
the second case.
This wouldn't be the first time we've lost information that we could use to
optimize a program by transforming it.

Do you think this result would be problematic? It seems consistent with the
RFC and LLVM's current behavior.

The problem is not that we're losing information, the problem is that
we're changing the behavior of a well-defined program.

I'll try to put the whole argument in one place:

We start with

  %x = add nuw i32 %m, %n
  %y = zext i32 %x to i64
  %s = lshr i64 %y, 32
  %addr = gep %some_global, %s
  store i32 42, i32* %addr

In the above program, for all values of %x, %s is 0. This means the
program is well-defined when %x is poison (since you don't need to
look at %x to determine the value of %addr, in the same sense as you
don't need to look at X to determine the value of "and X, 0"); and it
stores 42 to &(%some_global)[0]. Specifically, the above program is
well defined for "%m = %n = 2^32-1".

Now if we do the usual transform of "zext (add nuw X Y)" => "add nuw
(zext X) (zext Y)" then we get

  %m.wide = zext i32 %m to i64
  %n.wide = zext i32 %n to i64
  %z = add nuw i64 %m.wide, %n.wide
  %s = lshr i64 %y, 32
  %addr = gep %some_global, %s
  store i32 42, i32* %addr

The new program does *not* have the same behavior as the old program
for "%m = %n = 2^32-1". We have changed the behavior of a
well-defined program by doing the "zext (add nuw X Y)" => "add nuw
(zext X) (zext Y)" transform.

-- Sanjoy

Now if we do the usual transform of "zext (add nuw X Y)" => "add nuw
(zext X) (zext Y)" then we get

  %m.wide = zext i32 %m to i64
  %n.wide = zext i32 %n to i64
  %z = add nuw i64 %m.wide, %n.wide
  %s = lshr i64 %y, 32
  %addr = gep %some_global, %s
  store i32 42, i32* %addr

I made a typo here: the lshr should be "%s = lshr i64 %z, 32". Its
value will be 1 for %m = %n = 2^32-1, and the transformed program will
store 42 to &(%some_global)[1].

-- Sanjoy

I don't follow this last sentence.

if the definition of NUW is that zext-ing the result is equivalent to
zext-ing the inputs and doing the operation at a higher bitwidth (my
understanding), then %m and %n cannot hold those values, that would violate
the NUW flag.

Either my understanding of th edefinition holds, or the transformation you
are proposing isn't valid IMO.

if the definition of NUW is that zext-ing the result is equivalent to
zext-ing the inputs and doing the operation at a higher bitwidth (my
understanding), then %m and %n cannot hold those values, that would violate
the NUW flag.

The problem to solve is adequately defining "cannot hold". In the
strictest sense, you could say if %m = %n = 2^31 - 1 then the program
has UB; in effect defining "cannot hold" in the same way a location
you're loading from "cannot be" non-deferenceable. But, as David points
out, that would mean you cannot hoist arithmetic with the nuw/nsw tags
intact:

  if (foo)
   %t = add nuw X Y

since it could be that (X != 2^32-1 && Y != 2^32-1) only if foo ==
true. Arithmetic with no-wrap flags effectively are side-effecting
operations in this scheme.

The RFC tries to formalize a weaker notion of "cannot hold" that
justifies treating arithmetic like arithmetic. I'm trying to show
that the notion of poison value in this RFC is too weak; and allows
for certain programs to be well-defined (like the example I just
showed) which change meaning in the face of transforms we'd like to be
able to do.

-- Sanjoy

Typo:
   to `b > 0`
should be
  to `a + b > 0`
IIUC.

I mention in case you intend to publish this in the doc.

>
> Hello,
>
> What follows is my attempt to describe how poison works. Let me know
what you think.
>
> --
> David
>
>
> # LLVM Poison Semantics
>
> Poison is an LLVM concept which exists solely to enable further
optimization of LLVM IR. The exact behavior of poison has been, to say the
least, confusing for users, researchers and engineers working with LLVM.
>
> This document hopes to clear up some of the confusion of poison and
hopefully explain *why* it has its semantics.
>
> ## A Quick Introduction to Poison
>
> Let's start with a concrete motivating example in C:
> ```
> int isSumGreater(int a, int b) {
> return a + b > a;
> }
> ```
>
> The C specification permits us to optimize the comparison in
`isSumGreater` to `b > 0` because signed overflow results in undefined
behavior. A reasonable translation of `isSumGreater` to LLVM IR could be:

Typo:
   to `b > 0`
should be
  to `a + b > 0`
IIUC.

I mention in case you intend to publish this in the doc.

I intended what I wrote. We perform this optimization in InstCombine
today:
http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/InstCombine/InstCombineCompares.cpp?revision=226783&view=markup#l3193

Yeah sorry I was completely wrong.

Mehdi

> if the definition of NUW is that zext-ing the result is equivalent to
> zext-ing the inputs and doing the operation at a higher bitwidth (my
> understanding), then %m and %n cannot hold those values, that would
violate
> the NUW flag.

The problem to solve is adequately defining "cannot hold". In the
strictest sense, you could say if %m = %n = 2^31 - 1 then the program
has UB; in effect defining "cannot hold" in the same way a location
you're loading from "cannot be" non-deferenceable. But, as David points
out, that would mean you cannot hoist arithmetic with the nuw/nsw tags
intact:

  if (foo)
   %t = add nuw X Y

since it could be that (X != 2^32-1 && Y != 2^32-1) only if foo ==
true. Arithmetic with no-wrap flags effectively are side-effecting
operations in this scheme.

The RFC tries to formalize a weaker notion of "cannot hold" that
justifies treating arithmetic like arithmetic.

Yes, quite familiar.

  I'm trying to show
that the notion of poison value in this RFC is too weak; and allows
for certain programs to be well-defined (like the example I just
showed) which change meaning in the face of transforms we'd like to be
able to do.

I see, this is a case where the RFC as it stands strips poison too soon.

Going to ponder this, I feel like this should be something we can
incorporate into the model.

Let a = -10 and b = 1

a + b > a === -10 + 1 > -10 === -9 > -10 === true

b > 0 === 1 > 0 === true

All good.

Your version:

a + b > 0 === -10 + 1 > 0 === -9 > 0 === false

Small problem.

Could you maybe provide an example where replacing %always_poison with undef will change the meaning? At least for me, the thing that I’m most unclear about is how poison differs from undef.

– Sean Silva