As Sanjay noted in D31426, InstructionSimplify is missing the following simplification.
This function:
define <4 x i32> @splat_operand(<4 x i32> %x) {
%splat = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> zeroinitializer
%shuf = shufflevector <4 x i32> %splat, <4 x i32> undef, <4 x i32> <i32 0, i32 3, i32 2, i32 1>
ret <4 x i32> %shuf
}
can be simplified to:
define <4 x i32> @splat_operand(<4 x i32> %x) {
%shuf = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> zeroinitializer
ret <4 x i32> %shuf
}
InstCombine covers this case inefficiently.
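As a sanity check on the splat case, here is a minimal Python model of shufflevector semantics. It is only an illustration, not LLVM code: the helper name shufflevector, the UNDEF stand-in, and the sample lane values are all assumptions of this sketch. It shows that any shuffle of a splat that reads only the first operand gives back the splat.

```python
from itertools import product

UNDEF = None  # stand-in for an undef lane (assumption of this sketch)

def shufflevector(v1, v2, mask):
    """Lane i of the result is element mask[i] of the concatenation v1 ++ v2.
    An undef mask element yields an undef lane."""
    both = list(v1) + list(v2)
    return [UNDEF if m is UNDEF else both[m] for m in mask]

x = [10, 11, 12, 13]          # arbitrary sample values for %x
undef_vec = [UNDEF] * 4

# %splat uses a zeroinitializer mask, i.e. <0, 0, 0, 0>
splat = shufflevector(x, undef_vec, [0, 0, 0, 0])
# %shuf permutes the lanes of the splat
shuf = shufflevector(splat, undef_vec, [0, 3, 2, 1])

# Every mask that selects only from the first operand leaves the splat unchanged.
all_same = all(
    shufflevector(splat, undef_vec, list(m)) == splat
    for m in product(range(4), repeat=4)
)
```

In this model shuf equals splat, which is why the second shuffle can be dropped and the result expressed as a single splat of %x.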
I noticed that InstructionSimplify does not do any simplifications for shufflevector instructions other than constant folding. I just wanted to be sure there is no compelling reason for this before I start posting patches. I assume that this is not related to our conservative approach of refraining from creating new shuffle masks that may hurt some targets.
Here are some more opportunities that can be added to InstructionSimplify, all of which are covered by InstCombine:
define <4 x i32> @undef_mask(<4 x i32> %x) {
%shuf = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> undef
ret <4 x i32> %shuf
}
-->
define <4 x i32> @undef_mask(<4 x i32> %x) {
ret <4 x i32> undef
}
define <4 x i32> @identity_mask_0(<4 x i32> %x) {
%shuf = shufflevector <4 x i32> %x, <4 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
ret <4 x i32> %shuf
}
-->
define <4 x i32> @identity_mask_0(<4 x i32> %x) {
ret <4 x i32> %x
}
define <4 x i32> @identity_mask_1(<4 x i32> %x) {
%shuf = shufflevector <4 x i32> undef, <4 x i32> %x, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
ret <4 x i32> %shuf
}
-->
define <4 x i32> @identity_mask_1(<4 x i32> %x) {
ret <4 x i32> %x
}
define <4 x i32> @pseudo_identity_mask(<4 x i32> %x) {
%shuf = shufflevector <4 x i32> %x, <4 x i32> %x, <4 x i32> <i32 0, i32 1, i32 2, i32 7>
ret <4 x i32> %shuf
}
-->
define <4 x i32> @pseudo_identity_mask(<4 x i32> %x) {
ret <4 x i32> %x
}
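The three identity cases above share one condition: after resolving each mask element to an (operand, lane) pair, lane i of the result must be lane i of one and the same operand. A rough Python sketch of that check follows. The function name identity_source is hypothetical, operands are represented by their names, and the sketch assumes the mask has the same length as the operands and contains no undef elements.

```python
def identity_source(op0, op1, mask):
    """Return the name of the operand that the shuffle is an identity of,
    or None if the shuffle is not an identity of either operand."""
    n = len(mask)
    # Resolve each mask element to (operand name, lane within that operand).
    lanes = [(op0, m) if m < n else (op1, m - n) for m in mask]
    for op in (op0, op1):
        if all(lane == (op, i) for i, lane in enumerate(lanes)):
            return op
    return None

id0 = identity_source("x", "undef", [0, 1, 2, 3])     # identity_mask_0
id1 = identity_source("undef", "x", [4, 5, 6, 7])     # identity_mask_1
pseudo = identity_source("x", "x", [0, 1, 2, 7])      # pseudo_identity_mask
not_id = identity_source("x", "undef", [0, 3, 2, 1])  # a real permutation
```

Because operands are compared by name, mask element 7 in the pseudo-identity case resolves to lane 3 of %x, the same as mask element 3 would.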
define <4 x i32> @const_operand(<4 x i32> %x) {
%shuf = shufflevector <4 x i32> <i32 42, i32 43, i32 44, i32 45>, <4 x i32> %x, <4 x i32> <i32 0, i32 3, i32 2, i32 1>
ret <4 x i32> %shuf
}
-->
define <4 x i32> @const_operand(<4 x i32> %x) {
ret <4 x i32> <i32 42, i32 45, i32 44, i32 43>
}
define <4 x i32> @merge(<4 x i32> %x) {
%lower = shufflevector <4 x i32> %x, <4 x i32> undef, <2 x i32> <i32 1, i32 0>
%upper = shufflevector <4 x i32> %x, <4 x i32> undef, <2 x i32> <i32 2, i32 3>
%merged = shufflevector <2 x i32> %upper, <2 x i32> %lower, <4 x i32> <i32 3, i32 2, i32 0, i32 1>
ret <4 x i32> %merged
}
-->
define <4 x i32> @merge(<4 x i32> %x) {
ret <4 x i32> %x
}
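The merge case can be checked by composing masks: when both operands of the outer shuffle are themselves shuffles of the same %x, the outer mask can be rewritten in terms of lanes of %x. A minimal Python sketch, with the hypothetical helper name compose, and assuming every inner shuffle reads only from %x:

```python
def compose(outer_mask, inner_masks):
    """inner_masks holds one mask per outer operand, each indexing lanes of
    the same source vector. Returns a mask that maps the result lanes
    directly to lanes of that source vector."""
    flat = [lane for mask in inner_masks for lane in mask]
    return [flat[m] for m in outer_mask]

lower = [1, 0]   # %lower = lanes <1, 0> of %x
upper = [2, 3]   # %upper = lanes <2, 3> of %x

# %merged = shufflevector %upper, %lower, <3, 2, 0, 1>
merged = compose([3, 2, 0, 1], [upper, lower])
is_identity = merged == list(range(len(merged)))
```

Here the composed mask is <0, 1, 2, 3>, so %merged is just %x.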
I would appreciate your comments and feedback.
Thanks, Zvi