Sorry if this is a dumb question, an FAQ, or the wrong list!
I'm currently investigating LLVM vectorization of my generated code. My codegen emits a lot of recursions that step through arrays via pointers. The recursions are nicely optimized into loops, but the loop vectorization can't seem to work on them because of phi nodes that point to gep nodes.
Some simple IR to demonstrate; it vectorizes nicely with opt -O3 -vectorize-loops -force-vector-width until I swap the indexed gep nodes for the commented-out phi/gep nodes.
define void @add_vector(float* noalias %a, float* noalias %b, float* noalias %c, i32 %num)
{
Top:
  br label %Loop
Loop:
  %i = phi i32 [0,%Top],[%i.next,%Loop]
  ; phi and gep - won't vectorize
  ; %a.ptr = phi float* [%a,%Top],[%a.next,%Loop]
  ; %b.ptr = phi float* [%b,%Top],[%b.next,%Loop]
  ; %c.ptr = phi float* [%c,%Top],[%c.next,%Loop]
  ; %a.next = getelementptr float* %a.ptr, i32 1
  ; %b.next = getelementptr float* %b.ptr, i32 1
  ; %c.next = getelementptr float* %c.ptr, i32 1
  ; induction variable as index - will vectorize
  %a.ptr = getelementptr float* %a, i32 %i
  %b.ptr = getelementptr float* %b, i32 %i
  %c.ptr = getelementptr float* %c, i32 %i
  %a.val = load float* %a.ptr
  %b.val = load float* %b.ptr
  %sum = fadd float %a.val, %b.val
  store float %sum, float* %c.ptr
  %i.next = add i32 %i, 1
  %more = icmp slt i32 %i.next, %num
  br i1 %more, label %Loop, label %End
End:
  ret void
}
So it seems that the loop vectorizer wants the pointer stepping converted to base+index. However, as expected, clang doesn't care whether the C code is written with pointer arithmetic or array indexing.
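For reference, here is roughly what I mean by the two C styles. This is only a sketch: the function names are made up for illustration, and I'm assuming restrict as the C counterpart of the noalias parameters in the IR above. Both functions compute the same result; the first corresponds to the base+index gep form and the second to the phi/gep pointer stepping.

```c
/* Index form: corresponds to the base+index getelementptr variant. */
void add_vector_indexed(const float *restrict a, const float *restrict b,
                        float *restrict c, int num) {
    for (int i = 0; i < num; ++i)
        c[i] = a[i] + b[i];
}

/* Pointer-stepping form: corresponds to the phi + getelementptr variant.
   Each pointer advances by one element per iteration. */
void add_vector_stepped(const float *restrict a, const float *restrict b,
                        float *restrict c, int num) {
    for (int i = 0; i < num; ++i)
        *c++ = *a++ + *b++;
}
```

Comparing the IR that clang emits for each after optimization should show whether something in its pipeline normalizes the second form into the first.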
Is there a pass that converts simple pointer arithmetic to base+index? If not, should I write one (shouldn't be too hard for my limited use case) or try to emit more vector-friendly code from the front end?
Thanks a bunch!