# Loop vectorizer behaviour for 2D arrays and parallel annotation

Hello,

I am trying to vectorize the following loop but the vectorizer says:
"Found a possible write-write reorder" and does not vectorize.

Why?

for (j = 0; j < 8; j++)
{
    jj = j << 3;
    m2[j][0] = diff[jj  ] + diff[jj+4];
    m2[j][1] = diff[jj+1] + diff[jj+5];
    m2[j][2] = diff[jj+2] + diff[jj+6];
    m2[j][3] = diff[jj+3] + diff[jj+7];
    m2[j][4] = diff[jj  ] - diff[jj+4];
    m2[j][5] = diff[jj+1] - diff[jj+5];
    m2[j][6] = diff[jj+2] - diff[jj+6];
    m2[j][7] = diff[jj+3] - diff[jj+7];
}

Another question is regarding the isAnnotatedParallel() check. Is
there a way to make clang (or any other frontend) generate
parallel-annotated IR?

Best,

Did you try putting '#pragma ivdep' before the loop?

Tobias

P.S.: Please attach a full C file as a test case. The way the different data structures are declared may influence the analysis.
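
As a starting point for such a test case, here is a minimal sketch of a self-contained C file. The `int` element type, the array sizes, the global declarations, and the function name `butterfly_rows` are all assumptions, since the original post does not show them. Whether '#pragma ivdep' is honoured depends on the compiler; one that does not recognise it will ignore it, possibly with a warning.

```c
/* Assumed declarations: the original post does not show how m2 and
   diff are declared, and the declarations can change the analysis. */
int diff[64];
int m2[8][8];

/* Hypothetical wrapper around the loop from the question. */
void butterfly_rows(void)
{
    int j, jj;

    /* Hint that the iterations carry no dependences. Compilers that
       do not know this pragma ignore it. */
#pragma ivdep
    for (j = 0; j < 8; j++) {
        jj = j << 3;
        m2[j][0] = diff[jj]     + diff[jj + 4];
        m2[j][1] = diff[jj + 1] + diff[jj + 5];
        m2[j][2] = diff[jj + 2] + diff[jj + 6];
        m2[j][3] = diff[jj + 3] + diff[jj + 7];
        m2[j][4] = diff[jj]     - diff[jj + 4];
        m2[j][5] = diff[jj + 1] - diff[jj + 5];
        m2[j][6] = diff[jj + 2] - diff[jj + 6];
        m2[j][7] = diff[jj + 3] - diff[jj + 7];
    }
}
```

A file like this can be compiled on its own. Changing the declarations (for example passing the arrays as pointer parameters instead of using globals) is a quick way to see whether they affect the vectorizer's decision.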

Paul Redmond was adding support for "#pragma ivdep" that would use the
parallel metadata, but I haven't been able to follow its progress lately.

That is, if your loop body was an OpenCL kernel with each work-item
executing a single iteration, it *might* get "horizontally vectorized"
using the loop vectorizer if you use pocl's 'loopvec' work group method and
if the memory access pattern is suitable. This is quite fresh code which
I'm still optimizing, but I've already managed to autovectorize some work groups using it.

BR,
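
To illustrate the idea in plain C rather than OpenCL C: the sketch below treats one iteration of the loop from the question as a kernel body indexed by a work-item id, then wraps it in a loop over the work-items of a work group. The names `row_kernel` and `run_work_group`, the `int` element type, and the work-group size of 8 are assumptions for illustration only. This is not pocl's actual generated code.

```c
#include <stddef.h>

#define WORK_GROUP_SIZE 8  /* assumed: one work-item per row of m2 */

/* Hypothetical kernel body: what a single work-item would compute.
   `gid` plays the role of the work-item id. */
void row_kernel(size_t gid, const int *diff, int (*m2)[8])
{
    size_t jj = gid << 3;
    size_t k;

    for (k = 0; k < 4; k++) {
        m2[gid][k]     = diff[jj + k] + diff[jj + k + 4];
        m2[gid][k + 4] = diff[jj + k] - diff[jj + k + 4];
    }
}

/* Hypothetical work-group loop: each iteration runs one work-item.
   Work-items do not depend on each other here, so the iterations
   could be executed in any order or side by side in vector lanes. */
void run_work_group(const int *diff, int (*m2)[8])
{
    size_t gid;

    for (gid = 0; gid < WORK_GROUP_SIZE; gid++)
        row_kernel(gid, diff, m2);
}
```

The point of the restructuring is that the outer loop's independence comes from the programming model rather than from dependence analysis, which is the kind of loop a parallel annotation is meant to describe.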

I'm still working on it, just slowly. I'm hoping to have some more patches in the next week or two.

paul

While that is true, the debug message printed by the vectorizer is

Thanks for the suggestion, it worked using the latest LLVM from SVN.
Thanks, Pekka and Paul, for your inputs.

Tobias

Please find the example attached.

-Best,