Hi,
Let me try to give a bit more context on why select is so tricky.
First thing to consider is which transformations we would like to support:
1) Control-flow → select (SimplifyCFG)
   if (c)
     a = x
   else
     a = y
   =>
   %a = select %c, %x, %y
2) select → control-flow; reverse of 1)
   Not sure if this is done at IR level, or only later at SDAG.
3) select → arithmetic
   %a = select %c, true, %y
   =>
   %a = or %c, %y
4) select removal
   %c = icmp eq %x, C
   %r = select %c, C, %x
   =>
   %r = %x
5) select hoisting past binops
   %a = udiv %x, %y
   %b = udiv %x, %z
   %r = select %c, %a, %b
   =>
   %t = select %c, %y, %z
   %r = udiv %x, %t
6) Bonus: easy to move select instructions around. Should be possible to hoist selects out of loops easily.
Ideally we want semantics for select that allow all transformations 1-6. It’s hard because:
1) %a can only be poison conditionally on %c, i.e., %a is poison if %c=true and %x is poison, or %c=false and %y is poison.
2) Since we introduce a branch on %c, select on poison has to be UB like branch on poison.
3) With arithmetic all operands are always evaluated, so conditional poison like in 1) doesn’t work.
4) The example provided replaces C with %x in case %x=C. C is never poison, but %x might be.
5) We cannot introduce a division by poison if %y and %z are not poison.
6) Making select trigger UB for some cases makes movement harder because then we need to prove that it won’t trigger UB before e.g. hoisting it out of loops.
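To make the select → arithmetic problem concrete, here is a minimal Python sketch. It is a toy model, not how LLVM or Alive represent things: POISON, select_cond, or_i1, refines and counterexamples are names made up for illustration, and select is assumed to follow the "conditional poison + poison if %c poison" semantics.

```python
from itertools import product

# Sentinel standing in for poison (illustrative only).
POISON = object()
VALUES = (True, False, POISON)


def select_cond(c, x, y):
    """select with conditional poison: only the chosen operand matters,
    and a poison condition yields poison (assumed semantics)."""
    if c is POISON:
        return POISON
    return x if c else y


def or_i1(a, b):
    """Plain arithmetic: poison in either operand poisons the result."""
    if a is POISON or b is POISON:
        return POISON
    return a or b


def refines(src, tgt):
    """tgt may replace src if src is poison, or both are the same value."""
    return src is POISON or src == tgt


def counterexamples():
    """Inputs (c, y) where 'select c, true, y => or c, y' is not a refinement."""
    return [
        (c, y)
        for c, y in product(VALUES, repeat=2)
        if not refines(select_cond(c, True, y), or_i1(c, y))
    ]
```

In this model the only failing input is %c = true with %y poison: the select yields true, while the or yields poison.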
The table below summarizes which transformations each candidate semantics allows, for %z = select %c, %x, %y. Each column is a different alternative semantics for select:
| | UB if %c poison + conditional poison | UB if %c poison + poison if either %x/%y poison | Conditional poison + non-det choice if %c poison | Conditional poison + poison if %c poison** | Poison if any of %c/%x/%y are poison |
| --- | --- | --- | --- | --- | --- |
| SimplifyCFG | ✓ | | ✓ | ✓ | |
| Select->control-flow | ✓ | ✓ | | | |
| Select->arithmetic | | ✓ | | | ✓ |
| Select removal | ✓ | ✓ | | ✓ | ✓ |
| Select hoist | ✓ | ✓ | ✓ | | |
| Easy movement | | | ✓ | ✓ | ✓ |
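The table can also be checked mechanically. The sketch below just transcribes the ticks above (1 = ✓, 0 = blank) and lists, for each column, the rows that lack a ✓. The names COLUMNS, TABLE, missing_rows and columns_with_all_ticks are made up for this illustration, and any transcription bug in the table carries over.

```python
# Column headers, in the same left-to-right order as the table.
COLUMNS = (
    "UB if %c poison + conditional poison",
    "UB if %c poison + poison if either %x/%y poison",
    "Conditional poison + non-det choice if %c poison",
    "Conditional poison + poison if %c poison",
    "Poison if any of %c/%x/%y are poison",
)

# One entry per table row; 1 means the cell has a tick.
TABLE = {
    "SimplifyCFG":          (1, 0, 1, 1, 0),
    "Select->control-flow": (1, 1, 0, 0, 0),
    "Select->arithmetic":   (0, 1, 0, 0, 1),
    "Select removal":       (1, 1, 0, 1, 1),
    "Select hoist":         (1, 1, 1, 0, 0),
    "Easy movement":        (0, 0, 1, 1, 1),
}


def missing_rows(col):
    """Transformations that column `col` (0-based) does not allow."""
    return [row for row, ticks in TABLE.items() if not ticks[col]]


def columns_with_all_ticks():
    """Semantics that allow every transformation (empty if none does)."""
    return [COLUMNS[i] for i in range(len(COLUMNS)) if not missing_rows(i)]


if __name__ == "__main__":
    for i, name in enumerate(COLUMNS):
        print(f"{name}: missing {missing_rows(i)}")
```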
Modulo bugs in the table, no column has a ✓ in every row. That means there’s no way (that I’m aware of) to make all the transformations we are interested in correct at the same time. One solution is to introduce something like the freeze instruction, which can land a ✓ on any cell you want.
So unless someone has a clever idea and proposes a new column that has ticks in all rows, we are left with picking a trade-off: either we disable some optimizations, or we introduce something like freeze to continue doing them.
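As a rough illustration of how freeze buys a ✓ back, here is a toy Python model of select → arithmetic with the second operand frozen, i.e. %a = or %c, (freeze %y). It assumes freeze returns its operand unchanged when it is not poison and an arbitrary non-poison value otherwise, and that select follows the "conditional poison + poison if %c poison" semantics. All names (POISON, freeze, pick, counterexamples_with_freeze, ...) are made up for the sketch.

```python
from itertools import product

# Sentinel standing in for poison (illustrative only).
POISON = object()
VALUES = (True, False, POISON)


def select_cond(c, x, y):
    """select with conditional poison; a poison condition yields poison."""
    if c is POISON:
        return POISON
    return x if c else y


def or_i1(a, b):
    """Plain arithmetic: poison in either operand poisons the result."""
    if a is POISON or b is POISON:
        return POISON
    return a or b


def freeze(v, pick):
    """Return v unless it is poison; then return the arbitrary value `pick`
    (assumed freeze semantics; `pick` models the non-deterministic choice)."""
    return pick if v is POISON else v


def refines(src, tgt):
    """tgt may replace src if src is poison, or both are the same value."""
    return src is POISON or src == tgt


def counterexamples_with_freeze():
    """Inputs where 'select c, true, y => or c, freeze(y)' is not a refinement,
    checked for every value freeze might pick."""
    bad = []
    for c, y, pick in product(VALUES, VALUES, (True, False)):
        src = select_cond(c, True, y)
        tgt = or_i1(c, freeze(y, pick))
        if not refines(src, tgt):
            bad.append((c, y, pick))
    return bad
```

In this model the list of counterexamples is empty: when %c is true the or is true whatever freeze picks, and in the remaining poison cases the original select is already poison.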
BTW, this table assumes that branch on poison is UB; otherwise optimizations like GVN are wrong (for more details see our paper: http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf). The column marked with ** is the one that Alive currently implements.
Regarding SDAG: it has undef, and I believe there was some discussion about introducing poison there as well. I don’t recall whether it has been introduced already, but I believe nsw/nuw are already there. If that’s the case, the (il)legality of transformations should be exactly the same as in LLVM IR. Otherwise some transformations may be easier.
Sorry for the longish email; just wanted to give some background about the problem so that we can reach some consensus at some point.
Nuno