Loop vectorizer doesn't find loop bounds

I am trying to vectorize the function

void bar(float *c, float *a, float *b)
{
   const int width = 256;
   for (int i = 0 ; i < 256 ; ++i ) {
     c[ i ] = a[ i ] + b[ i ];
     c[ width + i ] = a[ width + i ] + b[ width + i ];
   }
}

using the following commands

clang -emit-llvm -S loop.c
opt loop.ll -O3 -debug-only=loop-vectorize -S -o -

LV: Checking a loop in "bar"
LV: Found a loop: for.body
LV: Found an induction variable.
LV: Found an unidentified write ptr: float* %c
LV: Found an unidentified write ptr: float* %c
LV: Found an unidentified read ptr: float* %a
LV: Found an unidentified read ptr: float* %b
LV: Found an unidentified read ptr: float* %a
LV: Found an unidentified read ptr: float* %b
LV: Found a runtime check ptr: %arrayidx4 = getelementptr inbounds float* %c, i64 %indvars.iv
LV: Found a runtime check ptr: %arrayidx14 = getelementptr inbounds float* %c, i64 %2
LV: Found a runtime check ptr: %arrayidx = getelementptr inbounds float* %a, i64 %indvars.iv
LV: Found a runtime check ptr: %arrayidx2 = getelementptr inbounds float* %b, i64 %indvars.iv
LV: Found a runtime check ptr: %arrayidx7 = getelementptr inbounds float* %a, i64 %2
LV: Found a runtime check ptr: %arrayidx10 = getelementptr inbounds float* %b, i64 %2
LV: We need to do 10 pointer comparisons.
LV: We can't vectorize because we can't find the array bounds.
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.

Is there any chance to make this work?

Frank

Try adding the restrict keyword to the function parameters:

void bar(float * restrict c, float * restrict a, float * restrict b)
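
For reference, a minimal sketch of the complete function with the qualifiers applied, assuming a C99 (or later) compiler. The loop body is the one from the original post; only the signature changes. The qualifiers are a promise made by the caller that the three arrays never overlap, so the vectorizer no longer needs runtime pointer checks.

```c
/* `restrict` tells the compiler that, inside bar, the memory reached through
   c, a and b does not overlap. Calling bar with overlapping arrays would be
   undefined behaviour, so this is only safe if the caller can guarantee it. */
void bar(float *restrict c, float *restrict a, float *restrict b)
{
    const int width = 256;
    for (int i = 0; i < width; ++i) {
        c[i] = a[i] + b[i];
        c[width + i] = a[width + i] + b[width + i];
    }
}
```

Each array must hold at least 2 * width = 512 floats, as in the original loop.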

-Hal

Bingo! That works (when coming from C source)

Now I have a serious problem: I am not coming from C; I build the function with the builder. I am also forced to change the signature and load the pointers a, b, c afterwards:

define void @bar([8 x i8]* nocapture readonly %arg_ptr) #0 {
entrypoint:
   %0 = bitcast [8 x i8]* %arg_ptr to i32*
   %1 = load i32* %0, align 4
   %2 = getelementptr [8 x i8]* %arg_ptr, i64 1
   %3 = bitcast [8 x i8]* %2 to i32*
   %4 = load i32* %3, align 4
   %5 = getelementptr [8 x i8]* %arg_ptr, i64 2
   %6 = bitcast [8 x i8]* %5 to float**
   %7 = load float** %6, align 8
   %8 = getelementptr [8 x i8]* %arg_ptr, i64 3
   %9 = bitcast [8 x i8]* %8 to float**
   %10 = load float** %9, align 8
   %11 = getelementptr [8 x i8]* %arg_ptr, i64 4
   %12 = bitcast [8 x i8]* %11 to float**
   %13 = load float** %12, align 8
   %14 = sext i32 %1 to i64
   br label %L0

Now, these pointers (%7, %10, %13) are not qualified with 'restrict' and the loop vectorizer gives me the same message:

LV: We can't vectorize because we can't find the array bounds.
LV: Can't vectorize due to memory conflicts
LV: Not vectorizing.

I asked this a few days ago; now it comes up again: Is there a way to qualify a pointer/Value to be 'restrict'?

Another possible solution would be telling the loop vectorizer that it's safe to treat all arrays as disjoint. Is this possible?

Frank

I asked this a few days ago; now it comes up again: Is there a way to qualify a pointer/Value to be 'restrict'?

Currently, no. There will be work in that direction soon. You'll need to extract a sub-function so that you can put 'noalias' on the function arguments.
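
Expressed in C, the sub-function idea might look like the sketch below. Everything in it is illustrative: the names bar_entry and bar_kernel, the 8-byte slot layout, and the reading of the two integer slots as loop bounds lo and hi are assumptions, since the thread does not say what those integers mean. The wrapper only unpacks the argument buffer; the loop lives in a separate function whose parameters are 'restrict', which Clang lowers to 'noalias' on the IR arguments.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical kernel. The restrict qualifiers become noalias on the IR
   function arguments. It is kept out of line (GCC/Clang attribute) on the
   assumption that the guarantee should stay attached to a real call
   boundary rather than be merged back into the wrapper. */
__attribute__((noinline))
static void bar_kernel(int64_t lo, int64_t hi,
                       float *restrict c, float *restrict a, float *restrict b)
{
    for (int64_t i = lo; i < hi; ++i)
        c[i] = a[i] + b[i];
}

/* Hypothetical wrapper mirroring the generated entry point: one opaque
   buffer in which every argument occupies its own 8-byte slot. */
void bar_entry(const unsigned char *arg_ptr)
{
    int64_t lo, hi;
    float *c, *a, *b;

    memcpy(&lo, arg_ptr + 0 * 8, sizeof lo);
    memcpy(&hi, arg_ptr + 1 * 8, sizeof hi);
    memcpy(&c,  arg_ptr + 2 * 8, sizeof c);
    memcpy(&a,  arg_ptr + 3 * 8, sizeof a);
    memcpy(&b,  arg_ptr + 4 * 8, sizeof b);

    bar_kernel(lo, hi, c, a, b);
}
```

When the IR is built directly, the equivalent step would be to create the inner function with the noalias attribute on its pointer parameters and have the outer function do nothing but the loads and the call.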

Another possible solution would be telling the loop vectorizer that it's safe to treat all arrays as disjoint. Is this possible?

Yes. Look for llvm.mem.parallel_loop_access in the language reference.
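
For readers starting from C rather than from the builder, a rough source-level counterpart is sketched below. It relies on Clang's '#pragma clang loop vectorize(assume_safety)', which is not mentioned in this thread and is only available in Clang releases that support loop pragmas; to my understanding it marks the loop's memory accesses as free of loop-carried dependences, in the same spirit as the llvm.mem.parallel_loop_access metadata. The function name bar_parallel is made up for the example. Other compilers ignore the pragma, possibly with a warning, and still produce correct scalar code.

```c
/* No restrict qualifiers here. Instead the pragma asserts that iterations of
   the loop do not depend on each other through memory. As with restrict,
   that is a promise by the programmer: if c overlaps a or b in a way that
   creates such a dependence, the vectorized result may be wrong. */
void bar_parallel(float *c, float *a, float *b)
{
    const int width = 256;
#pragma clang loop vectorize(assume_safety)
    for (int i = 0; i < width; ++i) {
        c[i] = a[i] + b[i];
        c[width + i] = a[width + i] + b[width + i];
    }
}
```

When generating IR directly, the language reference describes the corresponding mechanism: the loads and stores in the loop carry llvm.mem.parallel_loop_access metadata that refers to the loop's identifier metadata.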

-Hal

Thanks for the alternatives!

I am trying the 'extracting sub-function' approach. However, it seems I can't get the 'subfunction' to pass the verifier. This is my subfunction:

define void @main_extern([8 x i8]* %arg_ptr) {
entrypoint:
   %0 = getelementptr [8 x i8]* %arg_ptr, i32 0
   %1 = bitcast [8 x i8]* %0 to i64*
   %2 = load i64* %1
   %3 = getelementptr [8 x i8]* %arg_ptr, i32 1
   %4 = bitcast [8 x i8]* %3 to i64*
   %5 = load i64* %4
   %6 = getelementptr [8 x i8]* %arg_ptr, i32 2
   %7 = bitcast [8 x i8]* %6 to float**
   %8 = load float** %7
   %9 = getelementptr [8 x i8]* %arg_ptr, i32 3
   %10 = bitcast [8 x i8]* %9 to float**
   %11 = load float** %10
   %12 = getelementptr [8 x i8]* %arg_ptr, i32 4
   %13 = bitcast [8 x i8]* %12 to float**
   %14 = load float** %13
   call void @main(i64 %2, i64 %5, float* %8, float* %11, float* %14)
   ret void
}

Looks good to me. However, the verifier pass fails:

/svn/llvm/include/llvm/Support/Casting.h:97: static bool llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = llvm::GlobalVariable; From = llvm::GlobalValue]: Assertion `Val && "isa<> used on a null pointer"' failed.

I have no idea what this is trying to tell me. Any ideas?

Frank

That's a bug (you're hitting an internal assertion failure). You could try removing one instruction at a time to try and narrow it down and/or file a bug report.

-Hal

I am jumping around in memory here. Funny, this appeared after 'jumping to my subfunction':