Hello, I'm using the following Polly flags:
clang -O3 -mllvm -polly -mllvm -polly-vectorizer=stripmine
clang -O3 -mllvm -polly -mllvm -polly-vectorizer=polly
clang -O3 -mllvm -polly -mllvm -force-vector-interleave=32
I am getting different execution times for each of the three. Can you please tell me how the optimizations performed by these three flag combinations differ?
Please help
Thank You
clang -O3 -mllvm -polly -mllvm -polly-vectorizer=stripmine
This strip-mines the loop into vector-sized chunks
(-polly-prevect-width, 4 by default), but does not generate vector
instructions itself. The LoopVectorizer can then generate vector
instructions from the result.
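
To illustrate what strip-mining means here, the following is a hand-written sketch in C, not Polly's actual output. The function names and the WIDTH macro are hypothetical; WIDTH is set to 4 only to mirror the default of -polly-prevect-width mentioned above. The inner loop has a fixed trip count and still uses scalar operations, which is the shape a later vectorizer can turn into vector instructions.

```c
#include <stddef.h>

/* Original loop: one element per iteration. */
void scale_scalar(float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] = 2.0f * b[i];
}

/* Hypothetical chunk width, chosen to match the default prevect width. */
#define WIDTH 4

/* Strip-mined by hand: the outer loop steps in chunks of WIDTH, the
 * inner loop has a constant trip count of WIDTH. No vector
 * instructions are written here; everything is still scalar code. */
void scale_stripmined(float *a, const float *b, size_t n) {
    size_t i = 0;
    for (; i + WIDTH <= n; i += WIDTH)
        for (size_t j = 0; j < WIDTH; j++)
            a[i + j] = 2.0f * b[i + j];
    /* Remainder loop for the last n % WIDTH elements. */
    for (; i < n; i++)
        a[i] = 2.0f * b[i];
}
```

Both functions compute the same result; only the loop structure differs.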
clang -O3 -mllvm -polly -mllvm -polly-vectorizer=polly
Like -polly-vectorizer=stripmine, but generates vector instructions itself.
clang -O3 -mllvm -polly -mllvm -force-vector-interleave=32
This is a LoopVectorizer option. It does not vectorize, but interleaves
(a special kind of unrolling).
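
As a rough picture of interleaving, here is a hand-written C sketch with an interleave count of 2 (the flag above asks for 32). It is only an illustration under that assumption, not what the LoopVectorizer emits, and the function names are hypothetical. Two iterations are processed per step with independent accumulators, so the additions do not depend on each other.

```c
#include <stddef.h>

/* Original reduction: each addition depends on the previous one. */
long sum_plain(const int *x, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Interleaved by hand with a count of 2: two independent partial
 * sums are updated per step and combined after the loop. */
long sum_interleaved2(const int *x, size_t n) {
    long s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        s0 += x[i];
        s1 += x[i + 1];
    }
    /* Remainder iteration when n is odd. */
    for (; i < n; i++)
        s0 += x[i];
    return s0 + s1;
}
```

Integers are used so that reordering the additions cannot change the result.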
Michael