Hi,
I am getting an abort in llc with ‘main’ and LLVM 13.0 when I run clang at -O0 for the following test case:
#include <immintrin.h>
void small(float *b, __m512 c, __mmask16 mask)
{
    __m512 a = _mm512_setzero_ps();
    asm("vfmadd231ps (%1), %2, %0 %{%3%}": "+v"(a): "r"(b), "v"(c), "k" (mask));
    _mm512_mask_storeu_ps(b, mask, a);
}
clang -S bug.c -march=skylake-avx512 -O0
bug.c:6:7: error: Register k0 can't be used as write mask
asm("vfmadd231ps (%1), %2, %0 %{%3%}": "+v"(a): "r"(b), "v"(c), "k" (mask));
^
<inline asm>:1:36: note: instantiated into assembly here
vfmadd231ps (%rax), %zmm1, %zmm0 {%k0}
The same command works at -O2, where k1 is chosen instead: clang -S bug.c -march=skylake-avx512 -O2
My first observation is that the FMA instruction never writes k0; it only reads the mask. So even if I had written my test case to use k0 explicitly, I would still get the abort at -O0. This feels like a bug.
My second question: am I required to specify the mask register myself, since k0 is "special"? Or is llc incorrectly choosing k0 in the first place?
Thanks!
Scott