I am working with a RISC-V toolchain based on LLVM 15.0.7. I want to generate loop-vectorized code, both as LLVM IR and as assembly, using the following sequence of commands:
clang -target riscv64-unknown-unknown -march=rv64idv -static -O3 -S -emit-llvm test.c -o test.ll
opt -loop-vectorize -force-vector-width=8 -S test.ll -o test.ll.bc
llc -march=riscv64 -mattr=+m,+f,+d,+a,+c,+v -O3 test.ll.bc -o test.s
The output of “opt” is vectorized intermediate code, for example:
%wide.load2 = load <16 x i8>, ptr %11, align 1, !tbaa !4, !alias.scope !10
%12 = add <16 x i8> %wide.load2, %wide.load
but the machine code is not vectorized: the add (or load) above is lowered to 8 scalar load-byte (“lb”) instructions. I tried varying the llc flags (-march, -mcpu, -mtriple, -mattr), but could not get vectorized machine code. What am I missing here? Sorry if this is a very basic question, but I could not find an answer anywhere.
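For reference, test.c itself is not reproduced above. A minimal sketch of the kind of loop that yields IR like the <16 x i8> load and add shown, assuming a plain element-wise add over byte arrays (the name add_bytes and its signature are hypothetical, not the actual source):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the loop in test.c: element-wise add of two
   byte arrays. Each iteration is independent, so the loop vectorizer can
   turn the scalar i8 loads and add into wide vector operations. */
void add_bytes(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        /* uint8_t arithmetic wraps modulo 256 after the cast. */
        dst[i] = (uint8_t)(a[i] + b[i]);
    }
}
```

Any loop of this shape should do for reproducing the problem; the element type only needs to be i8 to match the IR above.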