Is llvm capable of doing loop interchange optimization?

Denis_Bakhvalov · November 18, 2017, 3:26pm

Hello,

I've been playing around with the really simple example of not cache
friendly loop like this:

#define N 100

void foo(int** __restrict__ a,
         int** __restrict__ b)
{
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            a[j][i] += b[j][i];
}

link to compiler explorer:
Compiler Explorer’)),version:4

I was thinking that modern compilers are optimizing such cases very
aggressively. I quickly checked gcc, there is special option
-floop-interchange, but it requires properly configured gcc compiler
with Graphite infrastructure enabled. I didn't have that at hand, so I
just stopped right there.

Loop has no side effects and arrays do not alias, so from my point of
view it's safe to do interchange here. I tried different march options
as well with no success.
I'm interested if llvm is capable of doing loop interchange in
general, at least the most simple cases like the mentioned one?

Florian_Hahn · November 20, 2017, 12:10pm

Hi Denis,

Hello,

I've been playing around with the really simple example of not cache
friendly loop like this:

#define N 100

void foo(int** __restrict__ a,
          int** __restrict__ b)
{
     for (int i = 0; i < N; ++i)
         for (int j = 0; j < N; ++j)
             a[j][i] += b[j][i];
}

link to compiler explorer:
Compiler Explorer’)),version:4

I was thinking that modern compilers are optimizing such cases very
aggressively. I quickly checked gcc, there is special option
-floop-interchange, but it requires properly configured gcc compiler
with Graphite infrastructure enabled. I didn't have that at hand, so I
just stopped right there.

Loop has no side effects and arrays do not alias, so from my point of
view it's safe to do interchange here. I tried different march options
as well with no success.
I'm interested if llvm is capable of doing loop interchange in
general, at least the most simple cases like the mentioned one?

There is a (relatively basic) loop interchange pass in LLVM. It is not enabled by default however, but you can enable it by passing `-mllvm -enable-loopinterchange` to Clang.

LoopInterchange in Clang does not trigger in your example, because a[j] and b[i] may alias. I am no expert on restrict, but it seems both GCC and Clang cannot prove that the access do not alias with the annotations you provided (also `int *restrict *restrict` does not do the trick).

In the code below, accesses to aa and bb cannot alias, and both GCC (with -O3) and Clang (recent master build, with `-O3 -mllvm -enable-loopinterchange`) interchange the 2 loops.

int aa[N][N];
int bb[N][N];

void foo1() {
     for (int i = 0; i < N; ++i)
         for (int j = 0; j < N; ++j)
             aa[j][i] += bb[j][i];
}

Cheers,
Florian

Denis_Bakhvalov · November 20, 2017, 12:35pm

Hi Florian,
Thank you very much for answering the question.
Looks like for now gcc does a better job with the example you provided:

Topic		Replies	Views
[RFC] LoopInterchange Pass for llvm LLVM Dev List Archives	0	105	February 5, 2015
Enabling loop-interchange Loop Optimizations	14	848	January 30, 2025
Issues with the Loop Interchange Pass LLVM Project clang , llvm	3	218	November 19, 2024
Particular type of loop optimization LLVM Dev List Archives	11	78	February 8, 2016
LLVM Loop Prefetch and Interchange Flags LLVM Dev List Archives	0	110	August 3, 2018

Is llvm capable of doing loop interchange optimization?

Related topics