Newer OpenMP versions include a construct for loop unrolling.
However, I was wondering whether the regular clang pragma for loop
unrolling can be safely combined with an OpenMP loop, without using
the new OpenMP feature.
Unfortunately, this does not work, by design. In the compiler
pipeline, "#pragma omp parallel for" is lowered in the front-end,
while "#pragma clang loop unroll_count(N)" is only acted on later, by
the LoopUnroll pass in the mid-end. When the front-end processes
"#pragma omp parallel for", the loop has not been unrolled yet, so the
directive cannot be applied to the unrolled loop. In practice, the
"#pragma clang loop" will just be ignored.
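
For comparison, here is a minimal sketch of the OpenMP-native route that the question mentions. It assumes a compiler with OpenMP 5.1 support for the "unroll" directive (for example a recent clang with -fopenmp; some versions may also need -fopenmp-version=51). The function name "scale" and the unroll factor 4 are only illustrative and do not come from the original post.

```c
#include <stddef.h>

/* Multiply every element of a[0..n) by factor.
 *
 * Assumption: the compiler implements the OpenMP 5.1 "unroll" directive.
 * Because both pragmas are OpenMP directives, they are handled together
 * in the front-end: the partially unrolled loop is what "parallel for"
 * is applied to. Without OpenMP enabled, the pragmas are ignored and the
 * loop simply runs serially, with the same result. */
void scale(float *a, int n, float factor)
{
    #pragma omp parallel for
    #pragma omp unroll partial(4)
    for (int i = 0; i < n; ++i)
        a[i] *= factor;
}
```

Whether unrolling by hand like this actually helps depends on the loop body and the target, so it is worth measuring rather than assuming.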