[poison] is select-of-select to logic+select allowed?

Some InstCombine transforms for select-of-select were added here:
https://reviews.llvm.org/rL228409

But Alive says this is more poisonous:

Name: selsel
%s1 = select i1 %cond1, i8 C1, i8 C2
%s2 = select i1 %cond2, i8 %s1, i8 C2
=>
%andcond = and i1 %cond1, %cond2
%s2 = select i1 %andcond, i8 C1, i8 C2

http://rise4fun.com/Alive/JT6

Are those transforms legal?

Nuno and I have been looking through the results of experiments like the one reported here:

   Translation Validation of Bounded Exhaustive Test Cases – Embedded in Academia

Indeed there are some select-related transformations in LLVM that are illegal in terms of Alive's semantics. As far as we know, they cannot be made legal (by adjusting the semantics of instructions) without breaking a bunch of other optimizations.

The optimizations currently in LLVM are not only inconsistent but they can lead to end-to-end miscompilations. Sorting this out is going to require (1) figuring out what select means WRT poison/undef and (2) backing out some optimizations. This is going to be slightly painful but it is the only way forward.

John

Are we even at a point in the poison semantics discussion where we can reason about it? Is there a canonical description of poison at the moment which is in sync with LLVM (and alive)? The LLVM language reference doesn’t even seem to mention the “poison filtering” effect of select; i.e. when the non-poison input of a select is taken no poison comes out of it. While such a semantic surely makes sense IMO, it only states that about Phi instructions at the moment…

Sorry for not being too helpful here and just answering with more questions.

  • I assume matching for select (i1, i8 freeze, i8 freeze) like patterns would make this correct and as usefull again. But freeze is not part of official LLVM AFAIK.

  • If this rule is considered a problem today, then we could probably move it to the SelectionDAG level.

  • Matthias

I should remind that what we really want to do (at the same time as items 1 and 2 below) is to merge poison and undef into a single concept, along with adding a new freeze instruction that gives the benefits of both without (we hope) the current level of confusion.

We have patches to this effect and we need help moving towards a consensus that this approach (or some different approach) is the way forward.

https://reviews.llvm.org/D29011
https://reviews.llvm.org/D29121
https://reviews.llvm.org/D29013

John

For background, the reason I was looking at this is because we miss the transform when the select operands are swapped:

Name: selsel
%s1 = select i1 %cond1, i8 %a, i8 %b
%s2 = select i1 %cond2, i8 %s1, i8 %b
=>
%notcond1 = xor i1 %cond1, -1
%s1 = select i1 %notcond1, i8 %b, i8 %a
%s2 = select i1 %cond2, i8 %s1, i8 %b

If the transform to and/or is allowed, I’d enhance the matching to account for this case (if we can invert a condition for free).

I was also wondering about the DAG rules. We don’t (need to) consider poison at that level?

Hi,

Let me try to give a bit more context on why select is so tricky.

First thing to consider is which transformations we would like to support:

  1. Control-flow → select (SimplifyCFG)

if (c)

a = x

else

a = y

=>

%a = select %c, %x, %y

  1. select → control-flow; reverse of 1)

Not sure if this is done at IR level, or only later at SDAG.

  1. select → arithmetic

%a = select %c, true, %y

=>

%a = or %c, %y

  1. select removal

%c = icmp eq %x, C

%r = select %c, C, %x

=>

%r = %x

  1. select hoisting past binops

%a = udiv %x, %y

%b = udiv %x, %z

%r = select %c, %a, %b

=>

%t = select %c, %y, %z

%r = udiv %x, %t

  1. Bonus: easy to move select instructions around. Should be possible to hoist selects out of loops easily.

Ideally we want semantics for select that allow all transformations 1-6. It’s hard because:

  1. %a can only be poison conditionally on %c, i.e., %a is poison if %c=true and %x is poison, or %c=false and %y is poison

  2. since we introduce a branch on %c, select on poison has to be UB like branch on poison

  3. with arithmetic all operands are always evaluated, so conditional poison like in 1) doesn’t work

  4. the example provided replaces C with %x in case %x=C. C is never poison, but %x might be.

  5. We cannot introduce a division by poison if %y and %z are not poison

  6. Making select trigger UB for some cases makes movement harder because then we need to prove that it won’t trigger UB before e.g. hoisting it out of loops.

Summary table of what each transformation allows for %z = select %c, %x, %y. Each column is a different alternative of semantics for select:





|

UB if %c poison



+ conditional poison

|

UB if %c poison + poison if either
%x/%y poison

|

Conditional poison



+ non-det choice if %c poison

|

Conditional poison + poison if %c poison**

|

Poison if any of
%c/%x/%y are poison

|

  • | - | - | - | - | - |


    SimplifyCFG

    |



    |



    |



    |



    |



    |


    Select->control-flow

    |



    |



    |



    |



    |



    |


    Select->arithmetic

    |



    |



    |



    |



    |



    |


    Select removal

    |



    |



    |



    |



    |



    |


    Select hoist

    |



    |



    |



    |



    |



    |


    Easy movement

    |



    |



    |



    |



    |



    |

Modulo bugs in the table, you can see there’s no single column with all rows with a ✓. That means there’s no way (that I’m aware of) to make all transformations that we are interested in doing to be correct. A solution is to introduce something like the freeze instruction that can land a ✓ on any cell you want.

So unless someone has a clever idea and proposes a new column that has ticks in all rows, we are left with picking a trade-off: either we disable some optimizations, or we introduce something like freeze to continue doing them.

BTW, this table assumes that branch on poison is UB, otherwise optimizations like GVN are wrong (for more details see our paper: http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf). The column marked with ** is the one that Alive currently implements.

Regarding SDAG: it has undef, and I believe there was some discussion regarding introducing poison there as well. I don’t recall if it was introduce already, but I believe there’s already nsw/nuw there. If that’s the case, the (il)legality of transformations should be exactly the same as in LLVM IR. Otherwise some transformations may be easier.

Sorry for the longish email; just wanted to give some background about the problem so that we can reach some consensus at some point.

Nuno

The examples and table are great! That at least makes it clear what we need to think about and fix (and just how complex the problem is).

Are there known limitations/objections to the freeze instruction patches that John listed? Or are we just being cautious and/or waiting for more people to review those?

I can answer some of the DAG questions:
(a) Yes, we have nsw/nuw/exact down there. They’ve been around since: https://reviews.llvm.org/rL210467 .

(b) Yes, we use those bits to enable transforms, but it’s not nearly as common as in IR. It’s also likely that we’ll drop those bits during transforms.

(c) Yes, we do all kinds of select transforms without regard for poison.

(d) AFAIK, there is no accounting for poison in the DAG for any transforms, but it’s possible that I just didn’t find it in the source.

Regarding the patches, there are two concerns AFAICT:

  1. It’s a new instruction and as usual when introducing a new instruction it will require work for some time until most optimizations know about it, and to get rid of any potential perf regression. No big deal; we just need to do the work (and we have already done some of it).
  2. The patch was written by a student, which may not have time to support it properly going forward. That means that someone needs to step in and take ownership. Of course we will be here to help, but none of us at the moment can commit to own this. The blog post that John mentioned below describes our machinery to find bugs in optimizations and in new semantics, and we will put it to use to test LLVM periodically.

On the plus side, freeze can helps us get rid of a bunch of miscompilations, which no one enjoys.

The usage of freeze can be incremental; we can try out with a few things to see how it goes. We have more radical ideas down the road, like replace undef with poison, but one thing at a time :slight_smile:

Regarding SDAG, and given that poison is already there, we would need to adopt a similar solution to the IR. Maybe right now we can get away with it because nsw is not exploited significantly (as you say). Just because there’s no explicit poison in SDAG, just having nsw is sufficient to cause miscompilations when combined with other transformations.

See, for example, this bug report for InstCombine regarding select+nsw: https://bugs.llvm.org/show_bug.cgi?id=31633

Nuno

Would it make sense to first commit to a semantics for select, document this, and attempt to comment out all now-invalid optimizations? We don't want to remove them entirely since, as Nuno points out, we can bring some of them back later using appropriate freezes.

What I don't know is who would be affected by the performance regressions in the meantime, but I suspect that they wouldn't be too big of a deal for most of us.

John

As somebody who has been bitten by `undef` funny behaviour several
times (most recently in NewGVN, see, e.g.
https://bugs.llvm.org/show_bug.cgi?id=33129), I'm in favour of this.
My anecdotal evidence/past experience is that it causes too much pain
for relatively little benefit.
In fact, we're (likely) going to disable aggressive `undef`
optimizations, at least for PHIs, in NewGVN, at least until `undef` is
fixed.
For `select` in particular, it doesn't seem to be a big problem, but I
think some benchmarking is needed to make sure things don't regress
badly.

Hi,

OK then: let's add a -conservative-select option, initially off by default.

We still have to pick a semantics. Nuno's table is by far the best way to visualize the tradeoffs that I've seen.

John

4) select removal
%c = icmp eq %x, C
%r = select %c, C, %x
  =>
%r = %x
<snip>
4) the example provided replaces C with %x in case %x=C. C is never poison, but %x might be.
<snip>
Summary table of what each transformation allows for %z = select %c, %x, %y. Each column is a different alternative of semantics for select:

UB if %c poison
+ conditional poison

UB if %c poison + poison if either
%x/%y poison

Conditional poison
+ non-det choice if %c poison

Conditional poison + poison if %c poison**

Poison if any of
%c/%x/%y are poison

SimplifyCFG

Select->control-flow

Select->arithmetic

Select removal

Select hoist

Easy movement

Why doesn’t select removal work with the "Conditional poison + non-det choice if %c poison" case?

If %x is not poison, then %r is always equal to %x. If %x is poison, then %c is poison and %r is a non-deterministic choice between C (not poison) and %x (poison). So, I would think we can still replace %r with %x because poison is one of its possible behaviors.

Thanks David for pointing out a mistake in the table! I implemented most of the semantics shown in the table in Alive to test these things, but the bug still slipped through, sorry…

I’ve updated the table:

Summary table of what each transformation allows for %z = select %c, %x, %y. Each column is a different alternative of semantics for select:







|


UB if %c poison




+ conditional poison


|


UB if %c poison + poison if either
%x/%y poison


|


Conditional poison




+ non-det choice if %c poison


|


Conditional poison + poison if %c poison**


|


Poison if any of
%c/%x/%y are poison


|

  • | - | - | - | - | - |



    SimplifyCFG


    |



    |



    |



    |



    |



    |



    Select->control-flow


    |



    |



    |



    |



    |



    |



    Select->arithmetic


    |



    |



    |



    |

    partially

    |



    |



    Select removal


    |



    |



    |



    |



    |



    |



    Select hoist


    |



    |



    |



    |



    |



    |



    Easy movement


    |



    |



    |



    |



    |



    |

IMHO, the 3rd and 4th options are the ones that work best. Instructions with UB are usually a pain (for e.g. hoisting out of loops).

An advantage of the 4th option is that can partially do select->arithmetic, while the 3rd can’t. For example, this is valid with the 4th option:

%13 = mul nuw i2 %0, -1

%14 = srem i2 %13, -1

%15 = select i1 %1, i2 1, i2 %14

=>

%15 = zext i1 %1 to i2

The 4th option, however, can’t do the select->and/or transformations (neither the 3rd can).

That said, I’m inclined to choose the 4th option (marked with ** in the table). That’s the one that the online version of Alive implements BTW.

Nuno

Regarding SDAG, and given that poison is already there, we would need to adopt a similar solution to the IR. Maybe right now we can get away with it because nsw is not exploited significantly (as you say). Just because there’s no explicit poison in SDAG, just having nsw is sufficient to cause miscompilations when combined with other transformations.
See, for example, this bug report for InstCombine regarding select+nsw: 31633 – InstCombine can't fold "select %c, undef, %foo" to %foo (miscompile)

Poison/Undef at the CodeGen level is a very interesting discussion! I don't think there is any documentation about posion/undef at the CodeGen level and I haven't discussed this much with others, so I'd like to hear further feedback:

- I think we should not introduce a notion of poison (which I would call "delayed UB") at the SelectionDAG/CodeGen level[1]
- Instructions either produce UB immediately, so while there are nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can only assume that they produce a target specific value on overflow, but not arbitrary behavior. Instructions that can produce UB should marked "hasSideEffect" and code motion around it be limited.
- Typical optimization scenarios like establishing loop trip count bounds for which poison/UB is helpful should not matter for CodeGen.
- I don't have any evidence that optimizations in CodeGen require a model of poison to work. If someone can given me a counter example then I'd be hard pressed to disable the optimization in MI and push it towards the IR level.

- Matthias

You mean like every integer division? Having arbitrary side effects seems too strong. I’m not sure it is always possible to push these optimizations to the IR level, at least not without adding more Value* ties to instructions (e.g. what we do with MMOs). I have some experience, for example, taking optimizations taking advantage of hardware-counter-based loops and moving into the IR level (so that we can take advantage of ScalarEvolution), and it works to some extent, but is also fairly hacky (the legality-checking part of lib/Target/PowerPC/PPCCTRLoops.cpp, for example, needs to understand what legalization will later do - it’s the best we can do right now, but it’s a mess). If we had better loop analysis at the MI level, this would be much better. We already have some interesting MI-level passes that do interesting things with loops (lib/CodeGen/MachinePipeliner.cpp, for example). I think that we should have the same semantics here on both the IR level and the MI level (whatever they are). Having different semantics in this regard is going to be confusing (and lead to subtle bugs because optimizations valid in one part of the pipeline will be invalid in other parts). These issues often come up around speculative execution, for example, and I definitely think we need to be able to deal with speculative execution at the MI level (because speculation opportunities often come from legalization/isel and pseudo-instruction expansion and because the modeling is better at the MI level). -Hal

You mean like every integer division? Having arbitrary side effects seems too strong.

If we can model it more precise sure. But in principle if you divide by zero many CPUs will trap immediately.

  • Typical optimization scenarios like establishing loop trip count bounds for which poison/UB is helpful should not matter for CodeGen.
  • I don’t have any evidence that optimizations in CodeGen require a model of poison to work. If someone can given me a counter example then I’d be hard pressed to disable the optimization in MI and push it towards the IR level.

I’m not sure it is always possible to push these optimizations to the IR level, at least not without adding more Value* ties to instructions (e.g. what we do with MMOs). I have some experience, for example, taking optimizations taking advantage of hardware-counter-based loops and moving into the IR level (so that we can take advantage of ScalarEvolution), and it works to some extent, but is also fairly hacky (the legality-checking part of lib/Target/PowerPC/PPCCTRLoops.cpp, for example, needs to understand what legalization will later do - it’s the best we can do right now, but it’s a mess). If we had better loop analysis at the MI level, this would be much better. We already have some interesting MI-level passes that do interesting things with loops (lib/CodeGen/MachinePipeliner.cpp, for example).

I think that we should have the same semantics here on both the IR level and the MI level (whatever they are). Having different semantics in this regard is going to be confusing (and lead to subtle bugs because optimizations valid in one part of the pipeline will be invalid in other parts). These issues often come up around speculative execution, for example, and I definitely think we need to be able to deal with speculative execution at the MI level (because speculation opportunities often come from legalization/isel and pseudo-instruction expansion and because the modeling is better at the MI level).

Does that mean a vreg (or even a physreg) can conceptually hold a poison value? Given that failed to capture the semantics at the IR level I have bad feelings about getting this right at the MI level with all the additional machine specific semantics. Reasoning about optimisations definitely gets easier without poison and I’d really like to see some good example first to motivate the maintenance burden.

A notion of unknown/unspecified value should be enough to enable hoisting, we shouldn’t need poison for that.

  • Matthias

Hi,

Does that mean a vreg (or even a physreg) can conceptually hold a poison
value? Given that failed to capture the semantics at the IR level I have bad
feelings about getting this right at the MI level with all the additional
machine specific semantics. Reasoning about optimisations definitely gets
easier without poison and I'd really like to see some good example first to
motivate the maintenance burden.

A notion of unknown/unspecified value should be enough to enable hoisting,
we shouldn't need poison for that.

The definition of poison we used in "Taming Undefined Behavior in
LLVM" has a natural expression of this -- we can say that in MI all
poison is "frozen on arrival". However, this disallows sinking:

  %RAX = gen_frozen_poison
L:
  use(%RAX)
  jmp L

will not be the same as:

L:
  %RAX = gen_frozen_poison
  use(%RAX)
  jmp L

Thanks,
-- Sanjoy

We don’t want to limit our ability to move memory references around integer division (for scheduling, etc.). I think that we should get this right once, and when we do, we should use these self-consistent semantics everywhere. I don’t see why machine-specific semantics make this harder or easier in general (there are certainly instructions that won’t have UB even when their corresponding IR-level counterparts do, which might make things easier, and instructions can have more dependencies to track, which might make things harder). -Hal

Regarding SDAG, and given that poison is already there, we would need to adopt a similar solution to the IR. Maybe right now we can get away with it because nsw is not exploited significantly (as you say). Just because there’s no explicit poison in SDAG, just having nsw is sufficient to cause miscompilations when combined with other transformations.

See, for example, this bug report for InstCombine regarding select+nsw: 31633 – InstCombine can't fold "select %c, undef, %foo" to %foo (miscompile)

Poison/Undef at the CodeGen level is a very interesting discussion! I don't think there is any documentation about posion/undef at the CodeGen level and I haven't discussed this much with others, so I'd like to hear further feedback:

- I think we should not introduce a notion of poison (which I would call "delayed UB") at the SelectionDAG/CodeGen level[1]
- Instructions either produce UB immediately, so while there are nsw/nuw flags on SelectionDAG arithmetic operatiosn I think we can only assume that they produce a target specific value on overflow, but not arbitrary behavior. Instructions that can produce UB should marked "hasSideEffect" and code motion around it be limited.
- Typical optimization scenarios like establishing loop trip count bounds for which poison/UB is helpful should not matter for CodeGen.
- I don't have any evidence that optimizations in CodeGen require a model of poison to work. If someone can given me a counter example then I'd be hard pressed to disable the optimization in MI and push it towards the IR level.

I'm not convinced that you need poison at SDAG/MI level either. However, you already have it :slight_smile:
For NSW/NUW to work, undef is not enough: you need poison. So right now there's already poison in SDAG; it's just that it is probably not exploited much more beyond folding "a+1 > a" into true or something like that. But poison is there for sure.
Then there's undef. It seems very useful in CodeGen; I just don't know if it could be replaced with poison altogether or not.

I haven't studied CodeGen as much as the IR, so I can't make any statement on what's the best way to go. To do so, we need to have a list of transformations/optimizations that are currently done at SDAG+MI levels to figure out what's the best semantics. I'm happy to help with defining the semantics.

Nuno