I have a complex example with several loops and branches nested in one function; all of the loops terminate. I want them to be fully unrolled, and I achieved this by adding pragmas to each loop in the C source.
However, I want to change C as little as possible (in fact, I want to change nothing), and the -loop-unroll parameter does nothing when I don’t add the pragma.
I tried several unroll-related passes, and the loops were still not unrolled.
I compared the IR results with and without pragma, and there is no difference in their attributes and metadata.
So I would like to ask whether Clang has a pipeline specifically for pragmas. Is a loop with a pragma optimized differently from the usual -O2 or -O3 process? If the default -O2 or -O3 is used, is the structure of the IR changed so much that the unroll pass can no longer identify the loops that could be unrolled? Is there any way to get full unrolling by calling a pass, or by some other method, without adding a pragma?
I am not sure I understand your question correctly; could you share an example? Maybe this helps: Compiler Explorer. By default, the compiler does not fully unroll the loop there (N = 1000). However, with the flag -mllvm -unroll-threshold=100000, the loop is fully unrolled, so maybe the -mllvm -unroll-threshold=... option is the answer to your question? The unroller has some more hidden options; I suggest looking at the options defined at the top of this file: LLVM: lib/Transforms/Scalar/LoopUnrollPass.cpp Source File, or searching the output of opt -help-hidden. Pragmas essentially override the unroll pass's defaults (and sometimes enable the pass in the first place).
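For readers without access to the linked example, here is a minimal sketch of the kind of loop being discussed. The file name unroll_demo.c and the function sum_fixed are made up for illustration and are not the code behind the Compiler Explorer link; whether the loop ends up fully unrolled depends on the Clang version and target, so compare the two IR outputs yourself.

```c
/* unroll_demo.c: hypothetical reproducer, not the code from the link above. */
#define N 1000 /* trip count known at compile time */

/* Sums the first N elements of a. No unroll pragma is used. */
int sum_fixed(const int *a) {
    int s = 0;
    for (int i = 0; i < N; ++i)
        s += a[i];
    return s;
}

/*
 * Emit IR with and without the raised threshold and compare the loop:
 *   clang -O2 -S -emit-llvm unroll_demo.c -o default.ll
 *   clang -O2 -S -emit-llvm -mllvm -unroll-threshold=100000 unroll_demo.c -o raised.ll
 */
```

The C source is identical in both runs; only the command line changes, which matches the goal of not touching the code.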
It works! Thanks for your time and explanation!
Sorry to bother you again; my confusion is:
What is the difference between -unroll-threshold and -unroll-count?
Besides the command line, is there any other way to change the value of threshold?
what is the difference between -unroll-threshold and -unroll-count?
The LoopUnroll pass has a cost model that helps decide whether it is profitable to unroll a loop (completely or partially); it takes into account the loop size, the costs of individual instructions, and so on. After all, not every loop benefits from unrolling, and unrolling increases code size. -unroll-threshold lets you make the pass more or less aggressive. The threshold is compared against the estimated cost, not against a number of iterations.
The -unroll-count option is meant for testing; it overrides the unroll count the pass would otherwise choose, that is, how many copies of the loop body are created.
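To make the cost-versus-iterations distinction concrete, here is a toy sketch. It is a deliberate simplification and not LLVM's actual computation: the function names and the "body cost times count" formula are assumptions for illustration, and the real pass accounts for more than this.

```c
/* Toy model only: NOT LLVM's real cost model. All names are hypothetical. */

/* Rough cost of the unrolled loop: one copy of the body per unrolled iteration. */
unsigned long toy_unrolled_cost(unsigned long body_cost, unsigned long count) {
    return body_cost * count;
}

/* Returns 1 if the estimated cost stays below the threshold, 0 otherwise. */
int toy_fits_threshold(unsigned long body_cost, unsigned long count,
                       unsigned long threshold) {
    return toy_unrolled_cost(body_cost, count) < threshold;
}
```

With a made-up body cost of 5, 1000 iterations give an estimate of 5000. That does not fit under a small threshold such as 300 (a value picked for illustration) but does fit under 100000, so raising the threshold can change the decision while the iteration count stays the same. A forced count, by contrast, sets the number of body copies directly.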
Besides the command line, is there any other way to change the value of threshold?
You can change the default value for the threshold here if you compile LLVM yourself: https://github.com/llvm/llvm-project/blob/9f276d4ddd0efa2e323d674a35317c253ab66d58/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp#L171 . Other than changing the source code, using the -unroll-threshold flag, or using pragmas, I don’t think there is a way.
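For comparison, the pragma route looks roughly like this in Clang. The functions dot_full and scale_by4 are hypothetical examples; the pragmas are Clang's loop hints, and other compilers may ignore them with a warning. They are requests to the optimizer, so it is still worth checking the emitted IR.

```c
#define N 16 /* compile-time trip count, needed for full unrolling */

/* Asks Clang to fully unroll this loop. */
int dot_full(const int *a, const int *b) {
    int s = 0;
#pragma clang loop unroll(full)
    for (int i = 0; i < N; ++i)
        s += a[i] * b[i];
    return s;
}

/* Asks for an unroll factor of 4 on this loop only. */
void scale_by4(int *a, int n, int k) {
#pragma clang loop unroll_count(4)
    for (int i = 0; i < n; ++i)
        a[i] *= k;
}
```

Unlike the command-line flag, which applies to every loop in the translation unit, a pragma affects only the loop it precedes, at the price of editing the source.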