LangRef semantics for shufflevector with undef mask is incorrect

Hi,

This is a follow up on a discussion around shufflevector with undef mask in https://reviews.llvm.org/D70641 and https://bugs.llvm.org/show_bug.cgi?id=43958.

The current semantics of shufflevector in http://llvm.org/docs/LangRef.html#shufflevector-instruction states:
"If the shuffle mask is undef, the result vector is undef. If any element of the mask operand is undef, that element of the result is undef."

We found this semantics to be problematic. TL;DR: instructions can't detect if an operand is undef.
Long story:
Undef can be replaced with any value at any time. It's valid to replace undef with 0 or 1 or anything else.

A possible sequence of optimizations with sufflevector could be as follows:
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 undef, i32 0>
->
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 2, i32 0>
->
%v = <undef, %x[0]>

So this respects the semantics in LangRef: the mask is undef, so the resulting element is undef.

However, there's an alternative sequence of optimizations:
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 undef, i32 0>
->
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 1, i32 0>
->
%v = <%x[1], %x[0]>

So now it depends on what the value of %x[1] is. If it's poison, we obtain:
%v = <poison, %x[0]>

This result contradicts the semantics in LangRef, even though no individual transformation we did above is wrong.
In summary, an instruction cannot detect undef and have special semantics for it.

AFAICT, the only way to fix the semantics of shufflevector is to say that if the mask is out-of-bounds, we yield poison. That's correct because there's nothing worse than poison.
Since we can replace undef with an OOB index, an undef mask can safely yield poison. Or it would yield one of the input elements, which is poison in the worst case. So we get poison in both cases.

I guess the issue to make this semantics a reality is that we would need to introduce a poison value (which is a good thing IMHO). Otherwise we can't continue doing some of the folds we have today since we don't have a poison constant to replace undef when folding.

Any comments/suggestions appreciated!

Thanks,
Nuno

The shuffle mask of a shufflevector is special: it's required to be a constant in a specific form. From LangRef: "The shuffle mask operand is required to be a constant vector with either constant integer or undef values." So really, we can resolve this any way we want; "undef" in this context doesn't have to mean the same thing as "undef" in other contexts. Formally, at the LangRef level, we can state that the shuffle mask is not an operand of a shufflevector; instead, it's not a value at all. It's just a description of the shuffle, defined with a grammar similar to a vector constant. Then we can talk about shuffle masks where an element is the string "undef", unrelated to the general notion of an undef value.

In that context, the existing LangRef description of the result of shufflevector can be interpreted to mean exactly what it says. This would imply it's forbidden to transform the shuffle mask "<2 x i32> <i32 undef, i32 0>" to "<2 x i32> <i32 1, i32 0>" if the input might contain poison. (And this is the same logic that led to https://reviews.llvm.org/D70246 .)

That said, long-term, we probably want to switch shufflevector to produce poison.

-Eli

The shuffle mask of a shufflevector is special: it's required to be a constant in a specific form. From LangRef: "The shuffle mask operand is required to be a constant vector with either constant integer or undef values." So really, we can resolve this any way we want; "undef" in this context doesn't have to mean the same thing as "undef" in other contexts. Formally, at the LangRef level, we can state that the shuffle mask is not an operand of a shufflevector; instead, it's not a value at all. It's just a description of the shuffle, defined with a grammar similar to a vector constant. Then we can talk about shuffle masks where an element is the string "undef", unrelated to the general notion of an undef value.

That is something that has been on my mind for a while now. You can ask the same why we use 'undef' for phi nodes. Eg it is legal to turn this:

   %x = phi i32 [ 0, A ], [ undef, B ]

into

   %x = phi i32 [ 0, A ], [ 1, B ]

which arguing by the intended semantics of phi nodes should be an illegal transformation but isn't in LLVM.

I think that we abuse the 'undef' (symbol) to mute instruction parameters whenever that parameter doesn't matter but we are shy of 'some' value handle to feed the operand slot.

IMHO for those cases, we need a proper '\bot' constant that denotes the absence of a concrete value as opposed to 'undef' (conceptually '\top'), which could be any value you'd like it to be.

- Simon

In that context, the existing LangRef description of the result of shufflevector can be interpreted to mean exactly what it says. This would imply it's forbidden to transform the shuffle mask "<2 x i32> <i32 undef, i32 0>" to "<2 x i32> <i32 1, i32 0>" if the input might contain poison. (And this is the same logic that led to https://reviews.llvm.org/D70246 .)

That said, long-term, we probably want to switch shufflevector to produce poison.

-Eli

Quoting Simon Moll via llvm-dev <llvm-dev@lists.llvm.org>:

The shuffle mask of a shufflevector is special: it's required to be a constant in a specific form. From LangRef: "The shuffle mask operand is required to be a constant vector with either constant integer or undef values." So really, we can resolve this any way we want; "undef" in this context doesn't have to mean the same thing as "undef" in other contexts. Formally, at the LangRef level, we can state that the shuffle mask is not an operand of a shufflevector; instead, it's not a value at all. It's just a description of the shuffle, defined with a grammar similar to a vector constant. Then we can talk about shuffle masks where an element is the string "undef", unrelated to the general notion of an undef value.

That is something that has been on my mind for a while now. You can ask the same why we use 'undef' for phi nodes. Eg it is legal to turn this:

   %x = phi i32 [ 0, A ], [ undef, B ]

into

   %x = phi i32 [ 0, A ], [ 1, B ]

which arguing by the intended semantics of phi nodes should be an illegal transformation but isn't in LLVM.

I think that we abuse the 'undef' (symbol) to mute instruction parameters whenever that parameter doesn't matter but we are shy of 'some' value handle to feed the operand slot.

IMHO for those cases, we need a proper '\bot' constant that denotes the absence of a concrete value as opposed to 'undef' (conceptually '\top'), which could be any value you'd like it to be.

From a correctness perspective, it's fine to use undef in phi nodes. But I agree that for e.g. static analysis is not ideal, since as undef can take any value, any static analysis you do must return \top for a phi node with at least one undef input.
Switching to a poison value doesn't fix the issue either, since poison can still be refined by any value.

This discussion is a bit orthogonal to shufflevector, but consider this example:
%x = phi [\bottom, A], [%v, B]

If we know that %v \in [0, 3], I assume you want to conclude that %x \in [2, 3].
Now assume that %v is only computed in basic block B and not in A. When you generate assembly, which value do you use when jumping from A? It has to be a value that respects any static analysis you've done for %v, so it has to be 2 or 3, and you can't use %v.
It's non-trivial. I'm not seeing an easy solution to this precision problem. It's an interesting problem nevertheless.

Nuno

Ok, makes sense.
My suggestion is that we patch the IR Verifier to ensure that the mask is
indeed a vector of constants and/or undefs. Right now it only runs the
standard checks for instructions.
We will also run Alive2 on the test suite to make sure undef is never
replaced in practice.

Thanks,
Nuno

I believe ShuffleVectorInst::isValidOperands checks for the mask already. It’s called by Verifier::visitShuffleVectorInst. It’s also called by the LL Parser and the an assert in the shuffle vector constructor.

Ah, yes, thanks. I missed that function somehow.
Ok, so I think all is good then :slight_smile:

Thanks,
Nuno

Similar problems appears in other contexts as well:
  https://reviews.llvm.org/D70749#1761120

Might also be interesting to take a look

Quoting Simon Moll via llvm-dev <llvm-dev@lists.llvm.org>:

The shuffle mask of a shufflevector is special: it's required to be
a constant in a specific form. From LangRef: "The shuffle mask
operand is required to be a constant vector with either constant
integer or undef values." So really, we can resolve this any way we
want; "undef" in this context doesn't have to mean the same thing as
"undef" in other contexts. Formally, at the LangRef level, we can
state that the shuffle mask is not an operand of a shufflevector;
instead, it's not a value at all. It's just a description of the
shuffle, defined with a grammar similar to a vector constant. Then
we can talk about shuffle masks where an element is the string
"undef", unrelated to the general notion of an undef value.

That is something that has been on my mind for a while now. You can
ask the same why we use 'undef' for phi nodes. Eg it is legal to
turn this:

   %x = phi i32 [ 0, A ], [ undef, B ]

into

   %x = phi i32 [ 0, A ], [ 1, B ]

which arguing by the intended semantics of phi nodes should be an
illegal transformation but isn't in LLVM.

I think that we abuse the 'undef' (symbol) to mute instruction
parameters whenever that parameter doesn't matter but we are shy of
'some' value handle to feed the operand slot.

IMHO for those cases, we need a proper '\bot' constant that denotes
the absence of a concrete value as opposed to 'undef' (conceptually
'\top'), which could be any value you'd like it to be.

From a correctness perspective, it's fine to use undef in phi nodes.
But I agree that for e.g. static analysis is not ideal, since as undef
can take any value, any static analysis you do must return \top for a
phi node with at least one undef input.
Switching to a poison value doesn't fix the issue either, since poison
can still be refined by any value.

This discussion is a bit orthogonal to shufflevector, but consider
this example:
%x = phi [\bottom, A], [%v, B]

If we know that %v \in [0, 3], I assume you want to conclude that %x
\in [2, 3].
Now assume that %v is only computed in basic block B and not in A.
When you generate assembly, which value do you use when jumping from
A? It has to be a value that respects any static analysis you've done
for %v, so it has to be 2 or 3, and you can't use %v.

The difference is very subtle:

- '\bottom' you may select any value. It is guaranteed that the behavior
of the program is invariant in the value you chose (that's a contract
you sign into). If the program behavior changes due to different
realizations of '\bottom' the program is broken. There is not 'bottom'
value at runtime. Operations on bottom can be removed (replaced by
'bottom') because they do not happen/cannot have any semantic effect.

- 'undef' you may select any value. The choice of value may change
program behavior (whatever that means, btw..) but *all* resulting
behaviors are ok. 'undef' is also a runtime value. Operations on 'undef'
can yield any result you like because you can pick a value of your
choosing for 'undef' operands and "pretend-execute" the instruction.

Simon

Hi,

This is a follow up on a discussion around shufflevector with undef mask in ⚙ D70641 [LangRef] make per-element poison behavior explicit and 43958 – shuffle undef mask on vectors with poison elements.

The current semantics of shufflevector in LLVM Language Reference Manual — LLVM 18.0.0git documentation states:
"If the shuffle mask is undef, the result vector is undef. If any element of the mask operand is undef, that element of the result is undef."

We found this semantics to be problematic. TL;DR: instructions can't detect if an operand is undef.
Long story:
Undef can be replaced with any value at any time. It's valid to replace undef with 0 or 1 or anything else.

A possible sequence of optimizations with sufflevector could be as follows:
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 undef, i32 0>
->
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 2, i32 0>
->
%v = <undef, %x[0]>

So this respects the semantics in LangRef: the mask is undef, so the resulting element is undef.

However, there's an alternative sequence of optimizations:
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 undef, i32 0>
->
%v = shufflevector <2 x float> %x, <2 x float> undef, <2 x i32> <i32 1, i32 0>
->
%v = <%x[1], %x[0]>

So now it depends on what the value of %x[1] is. If it's poison, we obtain:
%v = <poison, %x[0]>

This result contradicts the semantics in LangRef, even though no individual transformation we did above is wrong.
In summary, an instruction cannot detect undef and have special semantics for it.

AFAICT, the only way to fix the semantics of shufflevector is to say that if the mask is out-of-bounds, we yield poison. That's correct because there's nothing worse than poison.
Since we can replace undef with an OOB index, an undef mask can safely yield poison. Or it would yield one of the input elements, which is poison in the worst case. So we get poison in both cases.

I guess the issue to make this semantics a reality is that we would need to introduce a poison value (which is a good thing IMHO). Otherwise we can't continue doing some of the folds we have today since we don't have a poison constant to replace undef when folding.

Completely separately from this discussion, I think we really need a poison value. We end up writing needless complicated tests to generate poison and potentially poison values. Being able to simply spell poison directly in IR would help.