Verify loop unrolling to force spilling in register allocator

Hello. I’m currently researching the register allocator and am trying to run a simple matrix multiplication program in C. I would like to unroll the inner loop using the clang loop pragma to force the register allocator to spill.

My loops with the clang pragma are shown below:

for( int i = 0; i < AROW; i++ ) {
    for( int j = 0; j < BCOL; j++ ) {

        result[i][j] = 0;

        #pragma clang loop unroll(32)
        for( int k = 0; k < ACOL; k++ ) {

            result[i][j] += arrA[i][k] * arrB[k][j];
        }
    }
}


My current build process uses clang to emit LLVM IR (a .ll file) and then llc, LLVM’s static compiler, to produce the .s file.
An example of these two commands is shown below.

$ clang -S -emit-llvm -O3 mat_mul.c
$ llc --regalloc greedy -debug-only=regalloc mat_mul.ll > mat_mul.out 2>&1

However, it doesn’t seem that my loop is actually getting unrolled. Is there a way to verify that the loop is being unrolled given my current workflow? If anything is unclear, please let me know.



It is always helpful to provide the full program so that the people you ask do not have to write the surrounding context to test this out. Also consider sharing Godbolt (Compiler Explorer) links.

Trying this, I get:

<source>:7:39: error: invalid argument; expected 'enable', 'full' or 'disable'
    7 |             #pragma clang loop unroll(32)
      |                                       ^

After changing it to loop unroll_count(32), the optimization remarks (-Rpass=loop-unroll) show:

<source>:8:13: remark: unrolled loop by a factor of 32 with run-time trip count [-Rpass=loop-unroll]
    8 |             for( int k = 0; k < ACOL; k++ ) {
      |             ^
<source>:8:13: remark: unrolled loop by a factor of 32 with run-time trip count [-Rpass=loop-unroll]
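
For reference, a minimal self-contained sketch of the kernel with the corrected pragma might look like the following. The dimensions AROW, ACOL and BCOL, the int element type, and the function name mat_mul are assumptions for illustration, since the original post does not show them.

```c
/* Hypothetical dimensions; the original post does not show them. */
#define AROW 4
#define ACOL 64
#define BCOL 4

/* Multiply arrA (AROW x ACOL) by arrB (ACOL x BCOL) into result (AROW x BCOL).
 * The function name and the int element type are assumptions. */
void mat_mul(int arrA[AROW][ACOL], int arrB[ACOL][BCOL], int result[AROW][BCOL])
{
    for (int i = 0; i < AROW; i++) {
        for (int j = 0; j < BCOL; j++) {
            result[i][j] = 0;

            /* unroll_count(N) takes the unroll factor; unroll(...) only
             * accepts enable, full or disable, hence the error above. */
            #pragma clang loop unroll_count(32)
            for (int k = 0; k < ACOL; k++) {
                result[i][j] += arrA[i][k] * arrB[k][j];
            }
        }
    }
}
```

Compiling this with something like clang -O3 -Rpass=loop-unroll -S -emit-llvm mat_mul.c should print a remark for each loop that the unroller transforms, which is one way to check the unrolling before handing the .ll file to llc. The exact remark wording depends on whether the trip count is known at compile time.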


Hi @jdoerfert. Thank you for the prompt response. I didn’t know about Compiler Explorer, so I will definitely put my program there and see how it goes.