Case where VSETCC DAGCombiner hack doesn't work

Ok, we were missing this specific case because of some instcombine xforms that were only applying to scalars, not vectors. I tweaked them to cover vectors and we're getting "perfect" code for this now (one cmpordps).

However, not all is sunshine and roses; there are some sad puppydog faces left. Specifically, things like this still get scalarized:

#include <emmintrin.h>
__m128i a(__m128 a, __m128 b, __m128 c) { return a==b & c==b; }

The problem is that the IR going into Codegen has been (nicely) simplified to:

define <2 x i64> @a(<4 x float> %a, <4 x float> %b, <4 x float> %c) nounwind readnone {
entry:
  %cmp = fcmp oeq <4 x float> %a, %b ; <<4 x i1>> [#uses=1]
  %cmp4 = fcmp oeq <4 x float> %c, %b ; <<4 x i1>> [#uses=1]
  %and6 = and <4 x i1> %cmp, %cmp4 ; <<4 x i1>> [#uses=1]
  %and = sext <4 x i1> %and6 to <4 x i32> ; <<4 x i32>> [#uses=1]
  %conv = bitcast <4 x i32> %and to <2 x i64> ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %conv
}

When legalize types sees the sext from <4 x i1> -> <4 x i32>, its only solution right now is to scalarize the whole mess feeding into it, giving us really atrocious code.

IMO, the solution to this is to have a legalize-types action for vectors that corresponds to "promote" on scalars. In this case, since X86 supports VSETCC, the 4 x i1 SETCC should "vector promote" to a VSETCC node with a 4xi32 result, the and should vector promote to 4xi32, and the sext should vector promote as a vector sext_inreg.
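
As a rough illustration of what the "vector promoted" sequence would compute, here is a scalar C model of the three steps (setcc, and, sext_inreg) on four 32-bit lanes. This is only a sketch of the semantics, not LLVM code: the function names (vsetcc_oeq, vand, vsext_inreg_i1, promoted_a) are hypothetical, and each loop stands in for what would be a single vector operation.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4

/* Hypothetical model of a "vector promoted" setcc: each lane produces a
   32-bit mask (all ones or all zeros) instead of an i1. */
static void vsetcc_oeq(const float *x, const float *y, int32_t *out) {
    for (int i = 0; i < LANES; i++)
        out[i] = (x[i] == y[i]) ? -1 : 0; /* == is false if either side is NaN */
}

/* The 'and' carried out on the promoted <4 x i32> type. */
static void vand(const int32_t *x, const int32_t *y, int32_t *out) {
    for (int i = 0; i < LANES; i++)
        out[i] = x[i] & y[i];
}

/* The sext from i1, promoted to a sext_inreg: sign-extend bit 0 of each
   32-bit lane to the full lane width. */
static void vsext_inreg_i1(const int32_t *x, int32_t *out) {
    for (int i = 0; i < LANES; i++)
        out[i] = -(x[i] & 1);
}

/* The whole function from the IR above, with every value kept 4 lanes wide. */
static void promoted_a(const float *a, const float *b, const float *c,
                       int32_t *out) {
    int32_t cmp[LANES], cmp4[LANES], and6[LANES];
    vsetcc_oeq(a, b, cmp);
    vsetcc_oeq(c, b, cmp4);
    vand(cmp, cmp4, and6);
    vsext_inreg_i1(and6, out);
}

/* Small usage example: only lanes equal in both comparisons stay set. */
static void example(void) {
    const float a[LANES] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[LANES] = {1.0f, 0.0f, 3.0f, 4.0f};
    const float c[LANES] = {1.0f, 2.0f, 3.0f, 5.0f};
    int32_t r[LANES];
    promoted_a(a, b, c, r);
    assert(r[0] == -1 && r[1] == 0 && r[2] == -1 && r[3] == 0);
}
```

Since the promoted setcc already yields all-ones or all-zeros lanes, the sext_inreg step does not change the value in this model, which is why it should be possible to fold it away.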

I don't think that implementing this is particularly hard, but I have plenty of other things I'm working on right now. Is anyone else interested in working on this?

-Chris

Hi Chris,

> When legalize types sees the sext from <4 x i1> -> <4 x i32>, its only solution right now is to scalarize the whole mess feeding into it, giving us really atrocious code.
>
> IMO, the solution to this is to have a legalize-types action for vectors that corresponds to "promote" on scalars. In this case, since X86 supports VSETCC, the 4 x i1 SETCC should "vector promote" to a VSETCC node with a 4xi32 result, the and should vector promote to 4xi32, and the sext should vector promote as a vector sext_inreg.
>
> I don't think that implementing this is particularly hard, but I have plenty of other things I'm working on right now. Is anyone else interested in working on this?

I agree that this should be straightforward: if the vector element
type is illegal (e.g. i1), then legalize the element while keeping
it a vector (e.g. <4 x i1> -> <4 x i8> or whatever, <4 x i128> ->
<8 x i64>). One question is whether type legalization should
handle the element type in the same way it would if it were a scalar:
should <4 x i1> get turned into <4 x i8>, since an i1 gets turned
into an i8, or into something else like <4 x i32>? I guess the first
option would be slightly simpler/more regular from the type legalization
viewpoint. Operation legalization could later turn <4 x i8> into
<4 x i32> if that's better for the operation. That said, I don't plan
to work on this.
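
To make the two options concrete, here is a minimal per-lane sketch in C, assuming a promoted lane keeps the i1 value in bit 0. The helper names (sext_i1_in_i8, sext_i1_in_i32) are hypothetical. It only illustrates that promoting to i8 lanes and widening later gives the same lane values as promoting straight to i32; it says nothing about which choice generates better code.

```c
#include <assert.h>
#include <stdint.h>

/* Option 1: the i1 element is promoted to an i8 lane, as a scalar i1 would be.
   Sign-extend bit 0 within the 8-bit lane. */
static int8_t sext_i1_in_i8(int8_t lane) {
    return (int8_t)-(lane & 1);
}

/* Option 2: the i1 element is promoted directly to an i32 lane. */
static int32_t sext_i1_in_i32(int32_t lane) {
    return -(lane & 1);
}

/* Widening the i8 result to i32 afterwards matches the direct i32 result. */
static void example(void) {
    for (int bit = 0; bit < 2; bit++)
        assert((int32_t)sext_i1_in_i8((int8_t)bit) == sext_i1_in_i32(bit));
}
```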

Ciao,

Duncan.