[AVX512] Inconsistent mask types for intrinsics?

Hey guys,

There seems to be an inconsistency in the mask operand types used by
the AVX512 intrinsics.

The mask instruction intrinsics expect a v16i1 for the mask operands:

def int_x86_kadd_v16i1 : GCCBuiltin<"__builtin_ia32_kaddw">,
             Intrinsic<[llvm_v16i1_ty], [llvm_v16i1_ty, llvm_v16i1_ty],
                        [IntrNoMem]>;

But other intrinsics expect an i8/i16 as the mask operand:

def int_x86_avx512_gather_dps_mask_512 : GCCBuiltin<"__builtin_ia32_mask_gatherdps512">,
         Intrinsic<[llvm_v16f32_ty], [llvm_v16f32_ty, llvm_i16_ty,
                    llvm_v16i32_ty, llvm_ptr_ty, llvm_i32_ty],
                   [IntrReadMem]>;

I expected a v8i1/v16i1 for the gather intrinsics. Is there a reason
for this type difference or is it just an oversight? This is the case
for the compare, gather, scatter, and blend intrinsics.

Also, please note that the current implementation is functionally
correct. During ISelLowering, a bitcast is inserted to convert the
i8/i16 mask to the corresponding vector type for some of these
intrinsics.
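
To make the equivalence concrete, here is a rough C++ sketch of what that
bitcast means for a 16-bit mask. The helper names maskToLanes and lanesToMask
are hypothetical and are not LLVM APIs. The sketch assumes the little-endian
x86 layout, where lane i of the v16i1 corresponds to bit i of the i16.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an LLVM API): models `bitcast i16 %m to <16 x i1>`
// under the assumption that lane i corresponds to bit i of the integer.
std::array<bool, 16> maskToLanes(std::uint16_t mask) {
    std::array<bool, 16> lanes{};
    for (std::size_t i = 0; i < lanes.size(); ++i)
        lanes[i] = ((mask >> i) & 1u) != 0;
    return lanes;
}

// Hypothetical inverse: models `bitcast <16 x i1> %v to i16`.
std::uint16_t lanesToMask(const std::array<bool, 16>& lanes) {
    std::uint16_t mask = 0;
    for (std::size_t i = 0; i < lanes.size(); ++i)
        if (lanes[i])
            mask = static_cast<std::uint16_t>(mask | (1u << i));
    return mask;
}
```

Both forms carry the same 16 bits, so a round trip through either helper
loses nothing. That is why the i16 form still works today, and why my
question is about consistency rather than correctness.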

-Cameron