RFC: Killing undef and spreading poison

Hi,

Over the past few years we've been trying to kill poison somehow. There have
been a few proposals, but they've all failed to pass the bar and/or to
gather significant support (my own proposals included).
We (David, Gil, John, Juneyoung, Sanjoy, Youngju, Yoonseung, and myself)
have a new proposal to kill undef instead and replace it with poison + a new
'freeze' instruction. We believe this proposal simplifies things at the IR
level, allows us to fix long-standing bugs in LLVM, and is roughly
performance-neutral for now (and can enable further optimizations in the
future).

Sorry for the longish email. We will give a talk about this proposal at the
LLVM dev meeting in November, so hopefully we'll be able to discuss this
further in person.

We would appreciate any comments, feedback, suggestions, etc. If you have
questions let me know as well (not everything below is super detailed, so
please ask where things are not explicit enough).
(I've CC'ed a few people that have been involved in discussions re semantics
of LLVM IR in the past year or so. Apologies in advance if I forgot someone
-- which probably I did. I've also CC'ed some CodeGen folks, from which we
would appreciate input on how this proposal fits within the current and the
future pipeline)

Thanks,
Nuno

From: "Nuno Lopes" <nuno.lopes@ist.utl.pt>
To: llvm-dev@lists.llvm.org
Cc: "Sanjoy Das" <sanjoy@playingwithpointers.com>, "David Majnemer" <david.majnemer@gmail.com>, "John Regehr"
<regehr@cs.utah.edu>, "Chung-Kil Hur" <gil.hur@sf.snu.ac.kr>, "Juneyoung Lee" <juneyoung.lee@sf.snu.ac.kr>,
"Yoonseung Kim" <yoonseung.kim@sf.snu.ac.kr>, "Youngju Song" <youngju.song@sf.snu.ac.kr>, hfinkel@anl.gov, "Dan
Gohman" <dan433584@gmail.com>, nlewycky@google.com, mbraun@apple.com, qcolombet@apple.com, "t p northover"
<t.p.northover@gmail.com>
Sent: Tuesday, October 18, 2016 7:06:31 AM
Subject: RFC: Killing undef and spreading poison

Hi,

Over the past few years we've been trying to kill poison somehow.
There have
been a few proposals, but they've all failed to pass the bar and/or
to
gather significant support (my own proposals included).
We (David, Gil, John, Juneyoung, Sanjoy, Youngju, Yoonseung, and
myself)
have a new proposal to kill undef instead and replace it with poison
+ a new
'freeze' instruction. We believe this proposal simplifies things at
the IR
level, allows us to fix long-standing bugs in LLVM, and is roughly
performance-neutral for now (and can enable further optimizations in
the
future).

Sorry for the longish email. We will give a talk about this proposal
at the
LLVM dev meeting in November, so hopefully we'll be able to discuss
this
further in person.

We would appreciate any comments, feedback, suggestions, etc. If you
have
questions let me know as well (not everything below is super
detailed, so
please ask where things are not explicit enough).
(I've CC'ed a few people that have been involved in discussions re
semantics
of LLVM IR in the past year or so. Apologies in advance if I forgot
someone
-- which probably I did. I've also CC'ed some CodeGen folks, from
which we
would appreciate input on how this proposal fits within the current
and the
future pipeline)

Thanks for working on this! A couple of questions/comments below:

...

Purpose of Freeze

Poison is propagated aggressively throughout. However, there are
cases where
this breaks certain optimizations, and therefore freeze is used to
selectively stop poison from being propagated.

A use of freeze is to enable speculative execution. For example,
loop
switching does the following transformation:
while (C) {
  if (C2) {
   A
  } else {
   B
  }
}
=>
if (C2) {
   while (C)
      A
} else {
    while (C)
       B
}

Here we are evaluating C2 before C. If the original loop never
executed
then we had never evaluated C2, while now we do. So we need to make
sure
there's no UB for branching on C2. Freeze ensures that so we would
actually
have 'if (freeze(C2))' instead.
Note that having branch on poison not trigger UB has its own
problems.

Can you please elaborate on these problems?

We
believe this is a good tradeoff.

...

Discussion on select

Select is a tricky instruction to define semantics for. There are
multiple
views of select's purpose that are contradictory:
1) Select should be equivalent to arithmetic (e.g., allow 'select
%c,
true, %x' -> 'or %c, %x' and arithmetic -> select)
2) br + phi -> select should be allowed (simplifyCFG)
3) Select -> br + phi should be allowed

To have 1), we need to make select poison if either of its arguments
is
poison. This disallows 2), since there we need to have select being
poison
only if the dynamically-chosen value is poison.
2) and 3) are orthogonal and do not conflict, though.

3) is hard because it requires select to be stronger than branching
(UB-wise), meaning that we would need select to be UB if the
condition was
poison. This blocks a bunch of instcombine patterns, like:
Pre: C < 0
%r = udiv %a, C
  =>
%c = icmp ult %a, C
%r = select %c, 0, 1

If %a is poison, then we would be introducing UB. Adding freeze(%c)
would
fix the problem, though.

Since we really want to have 2) since that's simplifyCFG, we propose
to make
'select %c, %a, %b' poison if any of the following holds:
- %c is poison
- %c = true and %a is poison
- %c = false and %b is poison

This disallows 3) and some transformations of 1). Since 3) is only
performed
in CodeGenPrepare, we can safely ignore it (no aggressive
optimizations are
run afterwards).

This strikes me as a dangerous game to play. CGP's transformations need to be semantically sound (even if they are anti-canonical). In part, this is because our IR-level analyses are used by CGP. Moreover, targets after CGP run IR-level transformations, and having different semantic rules there is going to make code reuse hard in subtle ways. Which brings up another issue: We currently have undef at the MI level; do you propose changing that, and if so, how?

-Hal

For 1), we will have to restrict InstCombine
patterns for
the cases when we are sure a given variable is non-poisonous (which
is only
when it came from a 'freeze' instruction or it's the result of a
non-poison-producing instruction).
This semantics also allows arithmetic -> select, but not select ->
arithmetic (without use of freeze).

...

A use of freeze is to enable speculative execution. For example, loop
switching does the following transformation:
while (C) {
  if (C2) {
   A
  } else {
   B
  }
}
=>
if (C2) {
   while (C)
      A
} else {
    while (C)
       B
}

Here we are evaluating C2 before C. If the original loop never
executed then we had never evaluated C2, while now we do. So we need
to make sure there's no UB for branching on C2. Freeze ensures that
so we would actually have 'if (freeze(C2))' instead.
Note that having branch on poison not trigger UB has its own problems.

Can you please elaborate on these problems?

Sure! For example, the following transformation would be wrong if branch on poison was a non-deterministic branch instead of UB:

    %a = add i32 %x, 1
    %c = icmp eq i32 %a, %poison
    br i1 %c, label %bb1, label %bb2
bb1:
    %b = add i32 %x, 1
    %d = call i32 @bar(i32 %b)
->
bb1:
    %d = call i32 @bar(i32 % poison)

GVN will perform this kind of transformation: it concludes that %a, %b, and %poison are all equal and picks one representative value. However, these values are equal only when they are not poison. If %poison is indeed poison, then the transformation is wrong because before it was passing an ok value to bar() and after the transformation it is passing poison.
On the other hand, if branching on poison is UB, the original program was executing UB already because %poison (and therefore %c) were poison. So the transformation is ok.

Discussion on select

Select is a tricky instruction to define semantics for. There are
multiple views of select's purpose that are contradictory:
1) Select should be equivalent to arithmetic (e.g., allow 'select
%c, true, %x' -> 'or %c, %x' and arithmetic -> select)
2) br + phi -> select should be allowed (simplifyCFG)
3) Select -> br + phi should be allowed

Since we really want to have 2) since that's simplifyCFG, we propose
to make 'select %c, %a, %b' poison if any of the following holds:
- %c is poison
- %c = true and %a is poison
- %c = false and %b is poison

This disallows 3) and some transformations of 1). Since 3) is only
performed in CodeGenPrepare, we can safely ignore it (no aggressive
optimizations are run afterwards).

This strikes me as a dangerous game to play. CGP's transformations need to be semantically sound (even if they are anti-canonical). In part, this is because our IR-level analyses are used by CGP. Moreover, targets after CGP run IR-level transformations, and having different semantic rules there is going to make code reuse hard in subtle ways. Which brings up another issue: We currently have undef at the MI level; do you propose changing that, and if so, how?

It is sound to do the transformation at IR level if you freeze the condition.
My suggestion was to avoid introducing overhead in CGP. This is the current state of affairs and it seems to work right now. I'm not shocked to see the IR changing the semantics throughout the pipeline, since while the compiler proceeds, the type of transformations done also change.
But I have to say that my understanding of LLVM after CGP is very limited. I was not aware of the code reuse from IR-level analyses. I agree this needs to be scrutinized carefully. Alternatively, we can just introduce freeze and pay the price (likely low anyway).

Regarding undef at MI level, we are not proposing any change. I don't see any reason right now to change it since it seems that optimizations at MI level are fine with just having undef. There's no nsw/nuw stuff there and undef seems very useful.
Poison is lowered into undef. And freeze is lowered into this pair of copy nodes that seems to stop propagation of undef (to ensure all uses see the same value). I don't know enough about MI to know if there's a better way to lower freeze, though.

Thanks,
Nuno

Ok, freeze() produces a fixed but undefined value, so that xor'ing the same
frozen value produces 0, ANDing a frozen value with 1 produces known zero
bits everywhere except the LSB, which is undefined.

Does every instance of freeze() produce a *different* fixed but undefined
value? i.e. the value produced by freeze() is labelled with a sequence
number or something like that?

Ok, freeze() produces a fixed but undefined value, so that xor'ing the same frozen value produces 0, ANDing a frozen value with 1 produces known zero bits everywhere except the LSB, which is undefined.

Does every instance of freeze() produce a *different* fixed but undefined value? i.e. the value produced by freeze() is labelled with a sequence number or something like that?

Freeze produces an arbitrary value. The compiler is free to pick a "good" one (one that enables some optimization) if it wishes so. There's no determination of these values.
It's similar to current undef, except that all uses see the same value.

Nuno

From: "Nuno Lopes" <nuno.lopes@ist.utl.pt>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Sanjoy Das" <sanjoy@playingwithpointers.com>, "David Majnemer" <david.majnemer@gmail.com>, "John Regehr"
<regehr@cs.utah.edu>, "Chung-Kil Hur" <gil.hur@sf.snu.ac.kr>, "Juneyoung Lee" <juneyoung.lee@sf.snu.ac.kr>,
"Yoonseung Kim" <yoonseung.kim@sf.snu.ac.kr>, "Youngju Song" <youngju.song@sf.snu.ac.kr>, "Dan Gohman"
<dan433584@gmail.com>, nlewycky@google.com, mbraun@apple.com, qcolombet@apple.com, "t p northover"
<t.p.northover@gmail.com>, llvm-dev@lists.llvm.org
Sent: Tuesday, October 18, 2016 10:10:41 AM
Subject: RE: RFC: Killing undef and spreading poison

>> A use of freeze is to enable speculative execution. For example,
>> loop
>> switching does the following transformation:
>> while (C) {
>> if (C2) {
>> A
>> } else {
>> B
>> }
>> }
>> =>
>> if (C2) {
>> while (C)
>> A
>> } else {
>> while (C)
>> B
>> }
>>
>> Here we are evaluating C2 before C. If the original loop never
>> executed then we had never evaluated C2, while now we do. So we
>> need
>> to make sure there's no UB for branching on C2. Freeze ensures
>> that
>> so we would actually have 'if (freeze(C2))' instead.
>> Note that having branch on poison not trigger UB has its own
>> problems.
>
> Can you please elaborate on these problems?

Sure! For example, the following transformation would be wrong if
branch on poison was a non-deterministic branch instead of UB:

    %a = add i32 %x, 1
    %c = icmp eq i32 %a, %poison
    br i1 %c, label %bb1, label %bb2
bb1:
    %b = add i32 %x, 1
    %d = call i32 @bar(i32 %b)
->
bb1:
    %d = call i32 @bar(i32 % poison)

GVN will perform this kind of transformation: it concludes that %a,
%b, and %poison are all equal and picks one representative value.
However, these values are equal only when they are not poison.

Okay; so the problem is that an instruction that is value-equivalent to a poison value is not itself necessarily poison?

If
%poison is indeed poison, then the transformation is wrong because
before it was passing an ok value to bar() and after the
transformation it is passing poison.
On the other hand, if branching on poison is UB, the original program
was executing UB already because %poison (and therefore %c) were
poison. So the transformation is ok.

>> Discussion on select
>> ====================
>> Select is a tricky instruction to define semantics for. There are
>> multiple views of select's purpose that are contradictory:
>> 1) Select should be equivalent to arithmetic (e.g., allow
>> 'select
>> %c, true, %x' -> 'or %c, %x' and arithmetic -> select)
>> 2) br + phi -> select should be allowed (simplifyCFG)
>> 3) Select -> br + phi should be allowed
>>
>> Since we really want to have 2) since that's simplifyCFG, we
>> propose
>> to make 'select %c, %a, %b' poison if any of the following holds:
>> - %c is poison
>> - %c = true and %a is poison
>> - %c = false and %b is poison
>>
>> This disallows 3) and some transformations of 1). Since 3) is only
>> performed in CodeGenPrepare, we can safely ignore it (no
>> aggressive
>> optimizations are run afterwards).
>
> This strikes me as a dangerous game to play. CGP's transformations
> need to be semantically sound (even if they are anti-canonical).
> In part, this is because our IR-level analyses are used by CGP.
> Moreover, targets after CGP run IR-level transformations, and
> having different semantic rules there is going to make code reuse
> hard in subtle ways. Which brings up another issue: We currently
> have undef at the MI level; do you propose changing that, and if
> so, how?

It is sound to do the transformation at IR level if you freeze the
condition.
My suggestion was to avoid introducing overhead in CGP. This is the
current state of affairs and it seems to work right now. I'm not
shocked to see the IR changing the semantics throughout the
pipeline, since while the compiler proceeds, the type of
transformations done also change.
But I have to say that my understanding of LLVM after CGP is very
limited. I was not aware of the code reuse from IR-level analyses.
I agree this needs to be scrutinized carefully. Alternatively, we
can just introduce freeze and pay the price (likely low anyway).

I think we'd need to do this to ensure consistency; I don't think that the price would be high, especially because we soon drop into different representations and the IR is used only for analysis (for pointer aliasing, etc.).

Regarding undef at MI level, we are not proposing any change. I
don't see any reason right now to change it since it seems that
optimizations at MI level are fine with just having undef. There's
no nsw/nuw stuff there and undef seems very useful.

We've already started propagating nsw/nuw into the SDAG, and as we move toward GlobalISel, we'll have them at the MI layer as well. I think we'll need to consider how to propagate this information to the MI layer directly.

-Hal

It seems like you're saying that an integer load which touches any uninitialized byte of memory results in poison. Therefore, load %a simplifies to "poison", and your proposed lowering of a bitfield write throws away the value of every other adjacent bitfield. Am I missing something?

-Eli

Right, a load touching a single uninitialized bit results in poison.
The trick is that on the first bitfield write, all the remaining untouched fields become initialized (with an arbitrary value, though). Essentially we are making the adjacent bitfields undef.

So if you have:
struct foo a;
a.x = 2;
a.y = 3;

IR becomes:
%a = alloca foo
%x = load %a
%x2 = freeze %v ; %x2 not poison
%x3 = bitmasking ; %x3 not poison
store %a, %x3

%y = load %a
%y2 = freeze %y ; not needed; %y is not poison
etc..

Nuno

>> Here we are evaluating C2 before C. If the original loop never
>> executed then we had never evaluated C2, while now we do. So we
>> need
>> to make sure there's no UB for branching on C2. Freeze ensures
>> that
>> so we would actually have 'if (freeze(C2))' instead.
>> Note that having branch on poison not trigger UB has its own
>> problems.
>
> Can you please elaborate on these problems?

Sure! For example, the following transformation would be wrong if
branch on poison was a non-deterministic branch instead of UB:

    %a = add i32 %x, 1
    %c = icmp eq i32 %a, %poison
    br i1 %c, label %bb1, label %bb2
bb1:
    %b = add i32 %x, 1
    %d = call i32 @bar(i32 %b)
->
bb1:
    %d = call i32 @bar(i32 % poison)

GVN will perform this kind of transformation: it concludes that %a,
%b, and %poison are all equal and picks one representative value.
However, these values are equal only when they are not poison.

Okay; so the problem is that an instruction that is value-equivalent to a poison value is not itself necessarily poison?

Right.
I think there were other examples, but I don't have them here handy. But at least this one is very problematic for GVN.

>> Discussion on select
>> ====================
>> Select is a tricky instruction to define semantics for. There are
>> multiple views of select's purpose that are contradictory:
>> 1) Select should be equivalent to arithmetic (e.g., allow
>> 'select
>> %c, true, %x' -> 'or %c, %x' and arithmetic -> select)
>> 2) br + phi -> select should be allowed (simplifyCFG)
>> 3) Select -> br + phi should be allowed
>>
>> Since we really want to have 2) since that's simplifyCFG, we
>> propose
>> to make 'select %c, %a, %b' poison if any of the following holds:
>> - %c is poison
>> - %c = true and %a is poison
>> - %c = false and %b is poison
>>
>> This disallows 3) and some transformations of 1). Since 3) is only
>> performed in CodeGenPrepare, we can safely ignore it (no
>> aggressive
>> optimizations are run afterwards).
>
> This strikes me as a dangerous game to play. CGP's transformations
> need to be semantically sound (even if they are anti-canonical).
> In part, this is because our IR-level analyses are used by CGP.
> Moreover, targets after CGP run IR-level transformations, and
> having different semantic rules there is going to make code reuse
> hard in subtle ways. Which brings up another issue: We currently
> have undef at the MI level; do you propose changing that, and if
> so, how?

It is sound to do the transformation at IR level if you freeze the
condition.
My suggestion was to avoid introducing overhead in CGP. This is the
current state of affairs and it seems to work right now. I'm not
shocked to see the IR changing the semantics throughout the
pipeline, since while the compiler proceeds, the type of
transformations done also change.
But I have to say that my understanding of LLVM after CGP is very
limited. I was not aware of the code reuse from IR-level analyses.
I agree this needs to be scrutinized carefully. Alternatively, we
can just introduce freeze and pay the price (likely low anyway).

I think we'd need to do this to ensure consistency; I don't think that the price would be high, especially because we soon drop into different representations and the IR is used only for analysis (for pointer aliasing, etc.).

Ok, makes sense; thanks!

Regarding undef at MI level, we are not proposing any change. I
don't see any reason right now to change it since it seems that
optimizations at MI level are fine with just having undef. There's
no nsw/nuw stuff there and undef seems very useful.

We've already started propagating nsw/nuw into the SDAG, and as we move toward GlobalISel, we'll have them at the MI layer as well. I think we'll need to consider how to propagate this information to the MI layer directly.

I see. Then it might make sense to introduce poison in MI then. But only if there's a plan to start doing optimizations that leverage nsw/nuw at MI level as well. Otherwise undef is just fine.
I guess we need to do this in phases, though, to avoid breaking everything at the same time :slight_smile:

Nuno

Hi,

Nuno Lopes wrote:

Okay; so the problem is that an instruction that is value-equivalent
to a poison value is not itself necessarily poison?

Right.
I think there were other examples, but I don't have them here handy. But
at least this one is very problematic for GVN.

Another example is this:

void f(int k) {
   if (k != 0) {
     for (< finite loop >) {
       if (always_false_at_runtime) {
         print(1 / k);
       }
     }
   }
}

We'd like to hoist the `1 / k` computation to the preheader. However, we can't do that today if `k` is undef, and we've defined branching on undef to be a non-deterministic choice.

I have some more details at: Redirecting…

-- Sanjoy

So "undef" still exists, except now it's obtainable via "freeze(poison)"?

-Krzysztof

Hi Krzysztof,

freeze(poison) is different from undef today, in the sense that it is an instruction that produces some random, but fixed bit pattern.

E.g. today in

   %x = undef
   %y = xor %x, %x

we can fold %y to undef since each use of %x can independently see some arbitrary (up to the compiler / environment) bit pattern.

But in the new proposal, in:

   %x = freeze(poison)
   %y = xor %x, %x

that is no longer allowed (%y _has_ to be 0) -- all uses of %x will see some garbage, but fixed bit pattern.

-- Sanjoy

What about this:
   %x = phi poison, poison (I'm simplifying the syntax here)
Can this be simplified to "%x = poison", i.e. can we rauw(%x, poison)?

Or
   %x = load %uninitialized_var
   %y = load %uninitialized_var
   // are %x and %y equal (i.e. is "cmp eq %x, %y" == true)?
   // is freeze(%x) equal to freeze(%y)?

I'm wary about such rules. I have a feeling that this is going to create its own set of problems.

-Krzysztof

Oh, okay, that works. I should have thought that through a bit more carefully.

Sort of related testcase:

struct S { int x : 24; };
S s = { 2 };

Gives:

@s = local_unnamed_addr global { i8, i8, i8, i8 } { i8 2, i8 0, i8 0, i8 undef }, align 4

When undef goes away, clang will need to be patched to generate zero here? More generally, will struct padding in globals be poison, or some arbitrary value?

A couple of related topics:

Instcombine currently transforms "memcpy(dst, src, 4)" to "load i32 src; store i32 dst". I guess that's illegal under this model? How about the related transform "memcpy(dst, src, 4)" to "load <4 x i8> src; store <4 x i8> dst"? (SROA also performs similar transforms.)

Have you given any thought to how auto-upgrade for bitcode files would work? I guess in general, you can upgrade an old "load i32" to "bitcast (freeze (load <4 x i8>)) to i32", but that's really awkward.

-Eli

I guess, it may help if we don't have a literal for poison (as we do for undef).

Then freeze(%x) would always be equal to freeze(%x) and there would be no question about freeze(poison).

-Krzysztof

Right, a load touching a single uninitialized bit results in poison.
The trick is that on the first bitfield write, all the remaining untouched fields become initialized (with an arbitrary value, though). Essentially we are making the adjacent bitfields undef.

So if you have:
struct foo a;
a.x = 2;
a.y = 3;

IR becomes:
%a = alloca foo
%x = load %a
%x2 = freeze %v ; %x2 not poison
%x3 = bitmasking ; %x3 not poison
store %a, %x3

%y = load %a
%y2 = freeze %y ; not needed; %y is not poison
etc..

Oh, okay, that works. I should have thought that through a bit more carefully.

I must say I got scared about bitfields a few weeks ago as well :slight_smile:

Sort of related testcase:

struct S { int x : 24; };
S s = { 2 };

Gives:

@s = local_unnamed_addr global { i8, i8, i8, i8 } { i8 2, i8 0, i8 0, i8 undef }, align 4

When undef goes away, clang will need to be patched to generate zero here? More generally, will struct padding in globals be poison, or some arbitrary value?

Padding can safely become poison.

Instcombine currently transforms "memcpy(dst, src, 4)" to "load i32 src; store i32 dst". I guess that's illegal under this model?How about the related transform "memcpy(dst, src, 4)" to "load <4 x i8> src; store <4 x i8> dst"? (SROA also performs similar transforms.)

The first is illegal, you're right (unless you know that they aren't poison, of course). The second version with vectors is correct.
The problem is that codegen for vector loads/stores doesn't seem optimal ATM. We've seen really bad cases (e.g., http://llvm.org/PR30615)

Have you given any thought to how auto-upgrade for bitcode files would work? I guess in general, you can upgrade an old "load i32" to "bitcast (freeze (load <4 x i8>)) to i32", but that's really awkward.

Ah, good point, I forgot about that. I was assuming auto-upgrade only needed to change undef into "freeze poison", but you're right that load semantics are also changing and would need patching as well.
It actually needs to be "load <32 x i1>" because undef is per bit and one bit being poison cannot taint the whole byte.
Of course we could have a few optimizations for common patterns such as (load (alloca)) -> (freeze (load (alloca)), but ATM I don't see any way around the pattern you suggest.

Thanks,
Nuno

But in the new proposal, in:

  %x = freeze(poison)
  %y = xor %x, %x

that is no longer allowed (%y _has_ to be 0) -- all uses of %x will see
some garbage, but fixed bit pattern.

What about this:
   %x = phi poison, poison (I'm simplifying the syntax here)
Can this be simplified to "%x = poison", i.e. can we rauw(%x, poison)?

Yes, that's ok.

   %x = load %uninitialized_var
   %y = load %uninitialized_var
   // are %x and %y equal (i.e. is "cmp eq %x, %y" == true)?
   // is freeze(%x) equal to freeze(%y)?

"icmp %x, %y" would be poison (so you can chose true or false if you wish). There's no change here with respect to current poison semantics.
freeze(%x) is not necessarily the same as freeze(%y). Even %a and %b might not be the same in "%a = freeze(%x), %b = freeze(%x)" (each freeze returns an arbitrary, but fixed, value).

Nuno

This one isn’t clear to me: you can fold 1/k -> 1 when k is undef. And then it is a constant which makes no point to hoist?

We need a literal like undef to e.g. initialize padding, use in SROA for phi node entries from branches where a variable is not initialized, constant folding, etc. We are just proposing to get rid of undef altogether and call it poison instead. It makes the compiler a bit more aggressive re undefined behavior, though.

If freeze(%x) is duplicated it can return a different value. Not having a poison literal doesn't change that.

Nuno

Okay; so the problem is that an instruction that is value-equivalent
to a poison value is not itself necessarily poison?

Right.
I think there were other examples, but I don't have them here handy. But
at least this one is very problematic for GVN.

Another example is this:

void f(int k) {
if (k != 0) {
   for (< finite loop >) {
     if (always_false_at_runtime) {
       print(1 / k);
     }
   }
}
}

We'd like to hoist the `1 / k` computation to the preheader. However, we can't do that today if `k` is undef, and we've defined branching on undef to be a non-deterministic choice.

This one isn’t clear to me: you can fold 1/k -> 1 when k is undef. And then it is a constant which makes no point to hoist?

Imagine that k is not a constant. Then you would like to hoist it to outside of the loop.
It seems safe to hoist it since we know k != 0 by the enclosing 'if'. And so we transform the function into:

if (k != 0) {
   int t = 1 / k;
   for (< finite loop >) {
      if (always_false_at_runtime) {
         print(t);
      }
   }
}

Now you realize that k is undef and you get:
if (<non-deterministic branch>) {
   int t = 1 / undef;
   for (< finite loop >) {
      if (always_false_at_runtime) {
         print(t);
      }
   }
}

We can know chose to change the if condition to true and the undef in the division to zero (remember each use of undef can yield a different result).
Now we have undefined behavior (UB) because we've introduced a division by zero (which would not happen in the original program).
If branching on poison was UB instead then the original program was already executing UB so the transformation was fine.

This example can be fixed by freezing k instead. That way we ensure that k is actually non-zero within the 'if' statement.

Nuno