Hello. I’m currently researching the register allocator and am trying to run a simple matrix multiplication program written in C. I would like to unroll the inner loop using the clang loop pragma so that the register allocator is forced to spill.
My loops with the clang pragma are shown below:
for( int i = 0; i < AROW; i++ ) {
    for( int j = 0; j < BCOL; j++ ) {
        result[i][j] = 0;
        #pragma clang loop unroll(32)
        for( int k = 0; k < ACOL; k++ ) {
            result[i][j] += arrA[i][k] * arrB[k][j];
        }
    }
}
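For reference, here is a self-contained sketch of the same kernel that can be compiled on its own. The dimensions, the int element type, and the function name mat_mul are placeholders I picked for illustration, not necessarily what my real file uses. This version uses unroll_count(32), which is the count form listed in the Clang loop-hint documentation next to unroll(enable), unroll(full) and unroll(disable). I am not sure whether unroll(32) as written in my snippet is treated the same way.

```c
/* Placeholder dimensions (assumption); the real mat_mul.c may differ. */
#define AROW 64
#define ACOL 64
#define BCOL 64

/* Hypothetical wrapper: computes result = arrA * arrB for int matrices. */
void mat_mul(int arrA[AROW][ACOL], int arrB[ACOL][BCOL],
             int result[AROW][BCOL])
{
    for (int i = 0; i < AROW; i++) {
        for (int j = 0; j < BCOL; j++) {
            result[i][j] = 0;
            /* Count form of the unroll hint, as listed in the Clang docs. */
            #pragma clang loop unroll_count(32)
            for (int k = 0; k < ACOL; k++) {
                result[i][j] += arrA[i][k] * arrB[k][j];
            }
        }
    }
}
```

Keeping the kernel in its own function should make its loop easier to locate in the emitted .ll file than when it is inlined into main.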
My current build process uses clang to emit LLVM IR (a .ll file) and then generates the .s file with LLVM’s static compiler, llc.
An example of these two commands is shown below:
$ clang -S -emit-llvm -O3 mat_mul.c
$ llc --regalloc greedy -debug-only=regalloc mat_mul.ll > mat_mul.out 2>&1
However, my loop does not appear to be getting unrolled. Is there a way to verify whether the loop is being unrolled, given my current workflow? If anything I have explained is unclear, please let me know.
Thanks.
AS