Loop unroller fails to unroll loop

Hi, I have run into a loop optimization problem(code linked at the end). I have a function(“gpu_kernel”) that calls another function(“advance”) containing a loop with '#pragma unroll’ set. The loop exit condition is a function parameter but a compile time constant (10 and 12 for both calls respectively) hence the compiler should have flattened the loop completely but we get 2 partially unrolled loops.
I found that 2 instances of FullLoopUnroll pass are being called on (once before the function is inlined and once after). Before inlining FullLoopUnroll can’t make out the TripCount variable (LoopUnrollPass.cpp:1231) and ends up partially unrolling the loop while also setting “llvm.loop.unroll.disable” metadata hence disabling it for further attempts by loop unroller.

However this is only true when “pragma unroll” is set. When it’s removed the first attempt at FullLoopUnroll bails out ( LoopUnrollPass.cpp:computeUnrollCount returns false). After inlining, the loop exit condition which was a function parameter is seen as constant and the Unroller fully unrolls the loop.

I don’t see the full picture hence I’m failing to understand why FullLoopUnroller ran before inliner? Consuming compile time as well as doing sub-optimal optimization, at least in this case. What changes can be made to fix this issue?

Reproducer: https://godbolt.org/z/16M1Ps1sK

Assuming your description is correct, this is a bug in the unroller. The entire point of the LoopFullUnrollPass is that it only performs full unrolling (and peeling), but not any kind of runtime/partial unrolling. It sounds like the presence of unroll metadata (from the pragma) forces unrolling in LoopFullUnrollPass even though no full unroll is possible. Instead, this should be delayed to the runtime unroller.

LoopFullUnrollPass is doing runtime unrolling forced by ‘pragma unroll’. As both LoopUnrollPass and FullUnrollPass rely on the same unroll function I’m not entirely sure how it can be solved. I made some changes locally that fixed it by introducing a new parameter to tryToUnrollLoop() [link]. @nikic what do you think about the changes?

Ignoring style nits, the approach looks reasonable to me.

Great! I’ll submit it for review with some tests.

Submitted a patch for review D148071. I’m not confident that I have added the right reviewers. @nikic can you add relevant reviewers if I missed them?