Auto-vectorization option

Dear all,

Greetings. I want to generate the intermediate code (.ll file) produced after LLVM's auto-vectorization has been applied to a C source file. I have the following code (find_max.c) on which I want to apply auto-vectorization.

#include <stdio.h>

int main()
{
    int i, n, max, arr[50];
    for (i = 0; i < 35; i++)
    {
        printf("arr[%d] = ", i);
        scanf("%d", &arr[i]);
    }
    max = arr[0];
    for (i = 1; i < 35; i++)
    {
        if (arr[i] > max)
        {
            max = arr[i];
        }
    }
}

I have tried the following.

  1. clang -S -emit-llvm find_max.c -o find_max.ll
  2. opt -loop-vectorize -force-vector-width=8 find_max.ll -o find_max_opt.ll

and

  1. clang -O3 -c find_max.c -Rpass=vector -Rpass-analysis=vector
  2. clang -O3 -c find_max.c -Rpass=vector -Rpass-analysis=vector -o find_max.ll

All of the above commands generate binary files. Could you please help?

Thanks and regards,
Sudakshina

Hi,

Not sure what you mean by only getting binaries: “clang -S -emit-llvm” should produce a .ll and so will opt.

clang -S -emit-llvm find_max.c -o find_max.ll

Since no optimisation level is specified, this will result in very unoptimised code, probably resulting in vectorisation not triggering. I usually do “clang -O3 -S -emit-llvm -mllvm -print-before-all”, or something equivalent, and then grab the IR just before vectorisation.

and also this:

-Rpass=vector

should be:

-Rpass=loop-vectorize

Try opt -S to get text IR instead of bitcode.

The second group of commands uses clang -c and so produces a compiled object file. For text IR you need -S -emit-llvm, as in the first group.

Nigel

Dear all,

Thanks to all of you. I have executed the following commands on the code given above.

clang -O3 -S -c find_max.c -Rpass=vector -Rpass-analysis=vector -o find_max.ll

However, the generated output is assembly code (attached). Is there any way to generate a vectorized IR (.ll) file?

Thanks and regards,
Sudakshina

find_max.ll (5.79 KB)

To get LLVM IR from the frontend (Clang) use -emit-llvm -fno-discard-value-names, e.g., https://godbolt.org/z/aWz37qYdW

If you don't need debugger intrinsics (llvm.dbg.*) add -g0, e.g., https://godbolt.org/z/aWz37qYdW

As Sjoerd has mentioned, passing -mllvm -print-before-all to Clang is useful to get pre-vectorized LLVM IR (as well as to observe the effects of consecutive transformations); Example: https://godbolt.org/z/4za6h6fqo

You can then extract the unoptimized LLVM IR and play with it in "opt" (the middle-end optimizer tool) to get the LLVM IR optimized by the middle-end passes (including loop vectorizer); note that now you can just pass -print-before-all directly: https://llvm.godbolt.org/z/P7E3PGE61

In particular, the LLVM IR displayed under "*** IR Dump Before LoopVectorizePass on _Z1fPim ***" is a good baseline for comparisons.

Add "-mllvm -print-module-scope" to get the LLVM IR for the full module (translation unit): https://godbolt.org/z/Go7zK8vsW

Then, pass this LLVM IR (the dump from right before LoopVectorizePass) to "opt" using the options "-loop-vectorize -debug-only=loop-vectorize" to observe the loop vectorization pass in action:
https://llvm.godbolt.org/z/WMa1qosoq

Note that you need a binary built with assertions enabled to use -debug options.
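
For orientation, here is a rough C picture of what a width-8 vectorization of a max-reduction loop amounts to. This is only a conceptual sketch, not output from LLVM: the name max_reduce_w8 and the lane count of 8 (mirroring -force-vector-width=8 from the original post) are assumptions for illustration, and the real pass emits vector instructions in IR rather than an inner per-lane loop.

```c
#include <stddef.h>

#define LANES 8  /* assumed vector width, mirrors -force-vector-width=8 */

/* Hypothetical helper: scalar emulation of a vectorized max reduction.
   Requires n >= 1. */
int max_reduce_w8(const int *arr, size_t n)
{
    int lanes[LANES];
    size_t i = 0;

    /* Seed every lane with arr[0], a safe starting value for a max. */
    for (size_t l = 0; l < LANES; l++)
        lanes[l] = arr[0];

    /* "Vector body": each iteration handles LANES elements, one per lane. */
    for (; i + LANES <= n; i += LANES)
        for (size_t l = 0; l < LANES; l++)
            if (arr[i + l] > lanes[l])
                lanes[l] = arr[i + l];

    /* Horizontal reduction of the per-lane maxima into one value. */
    int max = lanes[0];
    for (size_t l = 1; l < LANES; l++)
        if (lanes[l] > max)
            max = lanes[l];

    /* Scalar remainder loop for the elements that did not fill a vector. */
    for (; i < n; i++)
        if (arr[i] > max)
            max = arr[i];

    return max;
}
```

With the 35-element loop from find_max.c, this shape would run four full 8-wide iterations and leave three elements for the scalar remainder. The IR dumps linked above typically show the same three parts: a vector body, a final reduction, and a scalar tail.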

Last but not least you can give the optimized LLVM IR to "llc" (the backend tool) to get the final assembly: https://llvm.godbolt.org/z/hxevcqKEG

Best,
Matt

Dear Matt P. Dziubinski,

Thanks a lot for your reply. Although the vectorization is clearly visible on godbolt, I could not reproduce it from the command line. Does it require a specific version of LLVM/Clang?

Regards
Sudakshina

No, only the -debug options require a build with assertions enabled; everything else should work with a regular release of Clang/LLVM. Chances are you have to debug it (perhaps a missing side effect is causing the entire loop to be optimized away, etc.).
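
One way to test the "missing side effect" explanation is to make the result of the loop observable. In find_max.c as posted, max is never used after the loop, so at -O3 the optimizer is free to delete the loop before the vectorizer ever sees it. A minimal sketch, assuming that is the cause: move the loop into a non-static function that returns the maximum. The name find_max and its signature are assumptions for illustration, not something from the original code.

```c
#include <stddef.h>

/* Hypothetical rewrite of the second loop from find_max.c.
   The function has external linkage and returns max, so the loop has an
   observable result and cannot be removed as dead code.
   Requires n >= 1. */
int find_max(const int *arr, size_t n)
{
    int max = arr[0];
    for (size_t i = 1; i < n; i++) {
        if (arr[i] > max)
            max = arr[i];
    }
    return max;
}
```

Compiling this with the flags already suggested in this thread, clang -O3 -S -emit-llvm find_max.c -o find_max.ll, should give text IR. Whether it actually contains vector instructions depends on the target and the LLVM version; adding -Rpass=loop-vectorize reports what the vectorizer did.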

Best,
Matt