I have a few questions about the new vector shuffle matching code in the
x86 .td files. It's a big improvement over the old system and provides
the context that code generation for AVX needs. This is great!
I'm asking because I'm having some trouble converting some AVX patterns
over to the new system. I'm getting this error from tblgen:
VyPERM2F128PDirrmi: (set:isVoid VR256:v4i64:$dst, (vector_shuffle:v4i64 VR256:v4i64:$src1, (ld:v4i64 addr:iPTR:$src2)<<P:Predicate_unindexedload>><<P:Predicate_load>><<P:Predicate_memop>>)<<P:Predicate_vperm2f128>><<X:SHUFFLE_get_vperm2f128_imm>>)
llvm/lib/Target/X86/X86InstrSIMD.td:1705:6: error: In VyPERM2F128PDirrmi: Cannot specify a transform function for a non-input value!
Here the tblgen pattern looks like this:
[(set VR256:$dst,
(v4i64 (vperm2f128:$src3 VR256:$src1,
(v4i64 (memop addr:$src2)))))],
and vperm2f128 is defined as:
def vperm2f128 : PatFrag<(ops node:$src1, node:$src2),
(vector_shuffle node:$src1, node:$src2), [{
return X86::isVPERM2F128Mask(cast<ShuffleVectorSDNode>(N));
}], SHUFFLE_get_vperm2f128_imm>;
I don't completely understand how the new system works. Take a
simple SHUFPS match:
def SHUFPSrri : PSIi8<0xC6, MRMSrcReg,
(outs VR128:$dst), (ins VR128:$src1,
VR128:$src2, i8imm:$src3),
"shufps\t{$src3, $src2, $dst|$dst, $src2, $src3}",
[(set VR128:$dst,
(v4f32 (shufp:$src3 VR128:$src1, VR128:$src2)))]>;
"shufp" is the magic bit here. Its definition looks like this:
def shufp : PatFrag<(ops node:$lhs, node:$rhs),
(vector_shuffle node:$lhs, node:$rhs), [{
return X86::isSHUFPMask(cast<ShuffleVectorSDNode>(N));
}], SHUFFLE_get_shuf_imm>;
First off, why does the vector_shuffle pattern take only two operands?
I understand that the VECTOR_SHUFFLE node has three operands but
vector_shuffle is defined as:
def SDTVecShuffle : SDTypeProfile<1, 2, [
SDTCisSameAs<0, 1>, SDTCisSameAs<1, 2>
]>;
def vector_shuffle : SDNode<"ISD::VECTOR_SHUFFLE", SDTVecShuffle>;
So the pattern match is against the two input vectors, excluding the
shuffle mask. Why is this?
In the SHUFPS above the shuffle mask is annotated into the shufp
operation as shufp:$src3. Is this done simply to "use up" the third
input so the matcher combines all three into the SHUFPS instruction,
appropriately transforming the mask with SHUFFLE_get_shuf_imm?
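To make concrete what I assume the mask transform has to produce, here is a standalone sketch. It is not the LLVM code: shufpsImmediate is a made-up name, and it assumes the usual VECTOR_SHUFFLE numbering for two v4f32 inputs (0..3 select from the first vector, 4..7 from the second) with no undef elements.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the kind of value SHUFFLE_get_shuf_imm would
// have to compute; this is an illustration, not the LLVM implementation.
// Assumes the mask already satisfies the SHUFPS shape: elements 0 and 1
// come from the first input, elements 2 and 3 from the second.
std::uint8_t shufpsImmediate(const std::array<int, 4> &mask) {
  std::uint8_t imm = 0;
  for (int i = 0; i < 4; ++i) {
    assert(mask[i] >= 0 && mask[i] < 8 && "undef elements are not handled");
    // SHUFPS takes one 2-bit selector per result element. The low two bits
    // of the mask entry are the index within whichever input it refers to.
    imm |= static_cast<std::uint8_t>((mask[i] & 3) << (2 * i));
  }
  return imm;
}
```

Under those assumptions the mask <0, 1, 6, 7> would become the immediate 0xE4, and <3, 2, 5, 4> would become 0x1B.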
All of the shufp stuff looks very much like the vperm2f128 stuff, yet
shufp works and vperm2f128 doesn't.
Some vector_shuffle fragments seem to "ignore" the mask:
def UNPCKHPSrr : PSI<0x15, MRMSrcReg,
(outs VR128:$dst), (ins VR128:$src1, VR128:$src2),
"unpckhps\t{$src2, $dst|$dst, $src2}",
[(set VR128:$dst,
(v4f32 (unpckh VR128:$src1, VR128:$src2)))]>;
Here unpckh is defined as:
def unpckh : PatFrag<(ops node:$lhs, node:$rhs),
(vector_shuffle node:$lhs, node:$rhs), [{
return X86::isUNPCKHMask(cast<ShuffleVectorSDNode>(N));
}]>;
I assume we can ignore the vector shuffle mask because the UNPCKH
instruction doesn't take a control field - its shuffle operation is
fixed. So how does the DAG matcher "know" to match a VECTOR_SHUFFLE
node with three operands?
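For reference, this is my mental model of what a predicate like the one in unpckh checks, written as a standalone sketch. isUnpckhMaskV4 is a made-up name rather than the real X86::isUNPCKHMask, it only covers the v4f32 case, and treating negative entries as undef is my assumption.

```cpp
#include <array>

// Hypothetical stand-in for a predicate in the spirit of X86::isUNPCKHMask,
// restricted to four elements. Not the LLVM implementation.
bool isUnpckhMaskV4(const std::array<int, 4> &mask) {
  // UNPCKHPS interleaves the high halves of its inputs:
  //   <src1[2], src2[2], src1[3], src2[3]>
  // which in VECTOR_SHUFFLE numbering is <2, 6, 3, 7>.
  const std::array<int, 4> expected = {2, 6, 3, 7};
  for (int i = 0; i < 4; ++i) {
    // Assumption: a negative entry means "undef" and matches anything.
    if (mask[i] >= 0 && mask[i] != expected[i])
      return false;
  }
  return true;
}
```

Since the instruction has no control field, a check like this only has to accept or reject the mask; nothing needs to be turned into an immediate.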
Thanks!
-Dave