Hi,
Thanks everybody that showed up in our talk at the LLVM dev meeting and to
those that provided feedback so far.
The slides are already online:
http://llvm.org/devmtg/2016-11/Slides/Lopes-LongLivePoison.pdf
The main question that some people raised was whether we could have bitwise
poison instead of value-wise poison, since that semantics seems to be more
natural as values continue to be just a bag of bits.
During the talk I didn't have a good answer to this question. We've now
studied this option more thoroughly. Apologies for the delay, but I think
now we have a good handle on the subject.
Ok, so bitwise poison is not a well-defined concept in itself; we still to
define the semantics of the individual operations themselves. We studied two
different options:
1) a bit in the output is poison if flipping any set of poison bits in the
input may yield different values for the output bit.
For example (bitwise notation):
ppp * 000 == 000 (since flipping any of the poison bits cannot yield a
result other than zero)
00p + 00p == 0pp
ppp << 2 == p00
icmp ugt ppp, uint_max == p (special case)
2) (thanks to Sanjoy): for an operation R = A op B, bit R[i] is poison if
bit i of 'A op undef' may yield 0 and 1 (same for 'undef op B').
This one unfortunately breaks some algebraic rewrites like: x * -1 -> 0
- x
I've implemented both semantics in Alive (branches: newsema2 and
bitwise-poison2, respectively) and then we run Alive over a large set of
optimizations to see which were ok and which became wrong.
Many optimizations became wrong with either of the bitwise semantics. But
the important ones are:
- We keep the problem of not being able to increase number of register uses
in an expression. For example, these are wrong: x << 1 -> x + x and x *
(1 + y) -> x + x*y (FMA-like transformation).
- It becomes hard to make SROA correct and keep all the shift optimizations
we have; one of these would be wrong. For example, InstCombine does the
following optimization:
(%x ashr exact C1) shl C2 -> %x shl (C2-C1)
If we assume that shift outputs zero for the bits it introduces when
shifting, then these kind of shift optimizations are all wrong (we would
need those bits to be poison). On the other hand, SROA combines multiple
values with bitwise arithmetic and therefore it requires shift to leave
shifted bits as zero. This is a contradiction, so we can't have both
(unless we use a ton of freeze instructions).
I think these two problems with bitwise poison semantics are a deal breaker.
This kind of semantics opens more problems than it solves.
BTW, besides shift transformations, 'div exact' suffer from similar
problems. E.g., this is wrong: (%X udiv exact %Y) * %Y -> %X.
So, my suggestion is to go for value-wise poison, but maybe with a few
tweaks from what I've presented (based on the feedback we received):
- Have 2 load instructions: one that does a value-wise access (if any bit
is poison in memory, the whole value becomes poison), and another version
that freezes each bit individually (which is just syntactic sugar for a
freeze(load <n x i1>) followed by a bitcast). The latter load is very close
to the load instruction we have today.
Adding this extra instruction has some benefits: it simplifies IR
auto-upgrade, and frontends can move to the new instruction incrementally.
Also, it seems that C++ codegen could use the freezing load in a few places.
On the negative side is that the freezing load is harder to move around and
therefore we would probably need a pass to convert these freezing-loads into
valuewise-poison-loads whenever possible.
It's true that this proposal is also increasing the abstraction level of the
LLVM IR a little bit by explicitly saying that values are *not* a bag of
bits. Therefore, low-level bit manipulation has to be done carefully.
IMHO this slight change is ok, and lower-level optimization can be left to
SDAG/MI, where it's safe to do them and it's probably where they belong.
LLVM IR is typed, and values need to be treated as having a type in its own
right. This is probably more of a change in mindset rather than in the LLVM
code itself. At least I couldn't find anything concrete in LLVM that would
break.
I hope this convinces people that bitwise poison is not a good way to go. I
propose we go forward with the value-wise poison with the modification I've
described above.
Please let us know your thoughts and/or if you have questions. It would be
great to finally close all the issues related with undef and poison that we
have today
Thanks,
Nuno