Hi,
I've been playing around with the SLPVectorizer trying to get it to
vectorize this simple program:
define void @vector(i32 addrspace(1)* %out, i32 %index) {
entry:
  %0 = alloca [4 x i32]
  %x = getelementptr [4 x i32]* %0, i32 0, i32 0
  %y = getelementptr [4 x i32]* %0, i32 0, i32 1
  %z = getelementptr [4 x i32]* %0, i32 0, i32 2
  %w = getelementptr [4 x i32]* %0, i32 0, i32 3
  store i32 0, i32* %x
  store i32 1, i32* %y
  store i32 2, i32* %z
  store i32 3, i32* %w
  %1 = getelementptr [4 x i32]* %0, i32 0, i32 %index
  %2 = load i32* %1
  store i32 %2, i32 addrspace(1)* %out
  ret void
}
My goal is to have this program transformed to the following:
define void @vector(i32 addrspace(1)* %out, i32 %index) {
entry:
  %0 = extractelement <4 x i32> <i32 0, i32 1, i32 2, i32 3>, i32 %index
  store i32 %0, i32 addrspace(1)* %out
  ret void
}
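To make the goal more concrete, here is a hand-written sketch of the
intermediate step I would expect, where only the four scalar stores have
been merged into a single vector store. This is not actual opt output:
the %vec name and the bitcast are my own illustration, and "align 4"
assumes the alloca only gets the natural alignment of i32.
define void @vector(i32 addrspace(1)* %out, i32 %index) {
entry:
  %0 = alloca [4 x i32]
  ; Hypothetical: view the array as a vector so one store covers x, y, z, w.
  %vec = bitcast [4 x i32]* %0 to <4 x i32>*
  store <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32>* %vec, align 4
  ; The variably-indexed load is still a scalar memory access at this point.
  %1 = getelementptr [4 x i32]* %0, i32 0, i32 %index
  %2 = load i32* %1
  store i32 %2, i32 addrspace(1)* %out
  ret void
}
Getting from this form to the extractelement version above would still
require forwarding the stored constant vector to the variably-indexed
load, which is probably a separate step from the store vectorization
itself.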
I've slightly modified the SLPVectorizer (see the attached patch) so
that it will vectorize small trees, and I've also fixed a crash in the
BoUpSLP::Gather() function when it is passed a list of store
instructions. With this patch, the command:
opt -slp-vectorizer -debug -march=r600 -mcpu=redwood -o - vector-alloca.ll -S -slp-threshold=-20
produces the following output, but the program remains unchanged:
Attachment: slp-vectorize-alloc.patch (1.09 KB)