Consider the following LLVM IR snippet (also attached).
The value of %unify.phi is undefined if the loop body (while.body9) is
executed; otherwise it has a specific value (%c.1 is a defined value).
However, when I execute this IR (using lli), the value of %c.2 is always the
value of %c.1, even if the loop is executed multiple times.
What is the reason for this behavior? How are undefined values handled in
LLVM, and is this behaviour architecture dependent (depending on how undef
is handled in codegen)?
undef means the value can be any value (except poison). LLVM therefore replaces phi(undef, %x) with %x, since it is free to pick %x as the value of undef. This optimization is common and it’s not architecture dependent.
This optimization is actually wrong when %x is poison, because then we are replacing undef with poison for one of the predecessors, which is not correct. But LLVM is still inconsistent in its treatment of undef, and we can’t fix this straight away.
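
To make the phi(undef, %x) folding concrete, here is a minimal sketch. It is not the IR from the original post: the function @pick, its arguments, and the block names are hypothetical, chosen only to reproduce the shape of the pattern.

```llvm
; Hypothetical reduction of the pattern discussed above.
; %merged plays the role of %unify.phi, %x the role of %c.1.
define i32 @pick(i1 %enter.loop, i32 %x) {
entry:
  br i1 %enter.loop, label %body, label %join

body:                     ; stands in for the loop body (while.body9)
  br label %join

join:
  ; On the edge from %body the incoming value is undef, so the optimizer
  ; may choose %x for it. Both edges then carry %x, and the phi can be
  ; folded so that the function simply returns %x.
  %merged = phi i32 [ undef, %body ], [ %x, %entry ]
  ret i32 %merged
}
```

Running a simplification pass over this with opt should be enough to see the phi replaced by %x, although the exact output can differ between LLVM versions. The caveat about poison still applies: if %x were poison, the folded function would return poison on the %body path, where the original only returned undef.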
I don’t know how lli picks the value for undef, but I think optimizations eliminate the undef value in the phi before execution,
because an optimization is allowed to change undef to an arbitrary value [0], so undef can be replaced by %c.1.
For example, early-cse rewrites %c.2 to %c.2 = phi i32 [ %c.0, %while.end ], [ %c.1, %while.end12 ].
Can I depend on this behaviour? In the example, this %unify.phi is added by
one of my custom passes. Can I assume that if a loop is present, the real
value of %c.2 will still be %c.1 regardless of the loop's execution behavior?