Hi all,
I’m hitting a correctness issue when lowering an llvm.vp.fadd to x86 with AVX512 support. It looks like the expandvp pass drops the mask even though the intrinsic is masked:
*** IR Dump Before Expand vector predication intrinsics (expandvp) ***
define <16 x float> @test(<16 x float> %a, <16 x float> %b, <16 x i1> %mask) #0 {
  %c = call <16 x float> @llvm.vp.fadd.v16f32(<16 x float> %a, <16 x float> %b, <16 x i1> %mask, i32 16)
  ret <16 x float> %c
}
*** IR Dump After Expand vector predication intrinsics (expandvp) ***
define <16 x float> @test(<16 x float> %a, <16 x float> %b, <16 x i1> %mask) #0 {
  %c1 = fadd <16 x float> %a, %b
  ret <16 x float> %c1
}
Command: llc test.ll -mcpu=cascadelake
I know that a vectorizer may decide to unmask non-side-effecting instructions when it’s safe to do so, but in this case we are explicitly generating a masked intrinsic, and the backend doesn’t seem to honor the mask semantics. Is this the expected behavior for VP intrinsics?
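To make the question concrete, here is a minimal sketch of what I mean by keeping the mask observable. It is only an illustration: the function name @test_select and the choice of %a as the value for the disabled lanes are hypothetical, not something the pass emits. My reading of the LangRef is that lanes disabled by the mask are poison in the result of a VP intrinsic, so a value for those lanes would have to be requested explicitly, for example with a select on the same mask:

```llvm
; Hypothetical variant of @test: lanes where %mask is false take the value of %a.
; The name @test_select and the pass-through choice (%a) are assumptions for illustration.
define <16 x float> @test_select(<16 x float> %a, <16 x float> %b, <16 x i1> %mask) {
  ; Masked add over all 16 lanes (EVL = 16), same call as in @test above.
  %sum = call <16 x float> @llvm.vp.fadd.v16f32(<16 x float> %a, <16 x float> %b, <16 x i1> %mask, i32 16)
  ; Explicitly pick a defined value for the masked-off lanes.
  %res = select <16 x i1> %mask, <16 x float> %sum, <16 x float> %a
  ret <16 x float> %res
}

declare <16 x float> @llvm.vp.fadd.v16f32(<16 x float>, <16 x float>, <16 x i1>, i32)
```

With this form the mask still has a visible use after expandvp even if the fadd itself is unmasked, which is the behavior I was expecting to see reflected in some way for the plain @test case.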
Thanks!
Diego