LLVM log file

Dear all,

Good morning. I want to know whether LLVM creates any log file consisting of applied optimizations in the optimization phase. It will be really useful for the researchers who work on compilers, formal methods, etc.

Thanks,
Sudakshina

I don’t know if it’s exhaustive but there’s the “remarks” feature:

https://llvm.org/docs/Remarks.html#introduction-to-the-llvm-remark-diagnostics

I used “-print-changed”, “-print-before-all”, and “-print-after-all” the last time I wanted to see the passes together with their input/output IR modules.

In my case, I used them through “clang++”, so I had to prefix each of them with “-mllvm”.
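
As a rough sketch of that prefixing (the helper name `with_mllvm` and the file name `test.cpp` are made up for illustration), each LLVM-level flag gets its own “-mllvm” in front of it when it goes through the clang driver:

```python
def with_mllvm(llvm_flags):
    """Prefix every LLVM-level flag with -mllvm so the clang driver forwards it."""
    args = []
    for flag in llvm_flags:
        args += ["-mllvm", flag]
    return args


# Hypothetical invocation; test.cpp is a placeholder file name.
cmd = ["clang++", "-O2", "-c", "test.cpp"] + with_mllvm(
    ["-print-before-all", "-print-after-all"]
)
print(" ".join(cmd))
```

This only builds and prints the command line; it does not run the compiler.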

Hi Sudakshina,

Not really sure what you mean by “applied”, so, let me offer some more ideas other than Brian’s and Adrian’s great suggestions. First, there are some
diagnostics / remarks flags in Clang like the -R family [1] or some -f flags about printing optimization reports [2] from Clang. They can be useful or useless depending
on your case. They can also be parsed relatively easily.

If you just want to see a list of passes that were attempted in your code, you can do it with: -mllvm -opt-bisect-limit=-1
You can also use -mllvm -debug-pass=Arguments to see the arguments that were passed.
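
If you want that list of attempted passes in a machine-readable form, a small parser is enough. This is only a sketch: the sample lines below are illustrative and follow the "BISECT: running pass (N) ..." format shown in the LLVM opt-bisect documentation, and the exact wording can differ between LLVM versions and pass managers, so the regex is an assumption to adjust.

```python
import re

# Assumed line format: "BISECT: [NOT ]running pass (N) <pass name> on <target>"
BISECT_RE = re.compile(r"^BISECT: (NOT )?running pass \((\d+)\) (.+) on (.+)$")


def parse_bisect(log_text):
    """Return a (index, pass_name, target, was_run) tuple per BISECT line."""
    entries = []
    for line in log_text.splitlines():
        m = BISECT_RE.match(line.strip())
        if m:
            entries.append(
                (int(m.group(2)), m.group(3), m.group(4), m.group(1) is None)
            )
    return entries


# Illustrative sample, not captured from a real run.
SAMPLE = """\
BISECT: running pass (1) Simplify the CFG on function (g)
BISECT: running pass (2) SROA on function (g)
BISECT: running pass (3) Early CSE on function (g)
"""

for index, name, target, was_run in parse_bisect(SAMPLE):
    print(index, name, target, was_run)
```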

Moving into opt, you can use something like -print-after-all, which was already mentioned. In case you don’t know what these flags do: they show you
the IR at different stages of the pipeline (e.g., -print-after-all shows you each pass attempted and what the IR looks like after it).

Hope it helps,
Stefanos

[1] https://clang.llvm.org/docs/ClangCommandLineReference.html#diagnostic-flags
[2] https://clang.llvm.org/docs/UsersManual.html#cmdoption-f-no-save-optimization-record

On Sun, Jan 24, 2021 at 5:47 AM, Adrian Vogelsgesang via llvm-dev <llvm-dev@lists.llvm.org> wrote:

Dear all,

In the optimization phase, the compiler applies some optimization to generate an optimized program. The optimization applied in the optimization pass depends on the source program; hence, the number of optimizations applied differs from source program to source program. By mentioning “applied” transformation, I wanted to know what all transformations are applied for a specific input program when subjected to the LLVM optimizer.

Thanks,
Sudakshina

Hi Sudakshina,

The optimization applied in the optimization pass depends on the source program; hence, the number of optimizations applied differs from source program to source program.

“applied” is still ambiguous, at least to me. If by “applied” you mean “attempted”, then no, that does not depend on the source program. It depends on the optimization level (e.g., O1, O2, …) or the individual passes that you may request yourself.
That is, for -O1 for example, there is a predetermined sequence of passes that attempt to optimize the program and you can see that with the options I mentioned above (e.g., -mllvm -opt-bisect-limit=-1)

If by applied you mean “actually changed the code”, then yes, this differs from program to program. You can see that with print-changed, it’ll show you the IR after every transformation that changed your program.

Finally, if you want to see why a transformation could or could not change the code, you can use the remarks mentioned earlier in the thread.

Best,
Stefanos

On Sun, Jan 24, 2021 at 7:24 AM, Sudakshina Dutta <sudakshina@iitgoa.ac.in> wrote:

Dear Stefanos,

Thank you for your reply. It helped me to understand the optimization phase of LLVM. However, I did not find any ‘print-changed’ option for llvm. Can you kindly help me in this regard ? I want to generate the IRs after each optimization pass.

Regards,

Sudakshina

Hi Sudakshina,

Glad it helped :)

I did not find any ‘print-changed’ option for llvm.
Hmm… I can’t reproduce it myself right now either… Anyway, let’s go with what works for sure. That’s -print-after-all. This prints the IR after every (middle-end) pass, no matter whether the pass made any changes or not.

Alright, now to use that: this is an option of opt, not of Clang (or the Clang driver), i.e., the command clang test.c -print-after-all won’t work. opt, in case you’re not familiar with it, is basically the middle-end optimizer of LLVM,
i.e., it’s supposed to do target-independent optimizations, which from what I understand is what you’re interested in. If you want to also print back-end passes (i.e., register allocation etc.), that’s another story.

Anyway, when you type e.g., clang test.c -o test, 3 high-level steps happen:

  1. Clang parses, type-checks etc. the C source and generates (very trivial) LLVM IR
  2. This is then passed to opt, which issues a bunch of passes (again, depending on whether you used -O1, -O2 etc.). Each pass takes IR and outputs IR
  3. When it’s done, it passes it to the back-end, which is another story and uses another IR.

Now, what you want is to get the IR after the first step, so that you can pass it yourself to opt, with any options you want (one of them being print-after-all). To do that, you type e.g.,: clang test.c -o test.ll -S -emit-llvm
-emit-llvm tells Clang to stop at step 1) and output the IR in a text file (note that we used no -O1, -O2 options because we want the fully unoptimized, trivial IR at this step)
-S tells clang to print in textual format (let’s not go into what the other format is for now)

If your file can’t be linked to an executable (e.g., it doesn’t have a main()), you should add a -c there.

Alright, now we have our IR but there’s a problem. Our functions have the optnone attribute, which tells the optimizer to not touch them (that’s because we used no optimization options).
We don’t want that, so we add another option to clang, -Xclang -disable-O0-optnone

So, all in all, it looks something like this: clang test.c -o test.ll -c -emit-llvm -S -Xclang -disable-O0-optnone

Now, we have the LLVM IR in test.ll and we can pass it to opt. You can now say: opt test.ll -O3 -print-after-all

Let me show you the steps in Godbolt:

  • Generate (unoptimized) IR from Clang: https://godbolt.org/z/rY3Thx (note that I also added -g0 to avoid printing debug info, which are probably not helpful)
  • Copy this exact IR and pass it to opt: https://godbolt.org/z/7cjdcf (you can see on the right window that SROA changed the code a lot)

As you can understand, you can automate all that with a script.
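
A minimal sketch of such a script, using only the two commands from this message. The function names and the log file name are made up for illustration, and it assumes clang and opt are on your PATH. To my understanding the IR dumps of -print-after-all go to stderr, which is why stderr is what gets saved.

```python
import subprocess


def build_commands(source, ir_file="test.ll", opt_level="-O3"):
    """Build the clang and opt command lines described in this message."""
    clang_cmd = ["clang", source, "-o", ir_file, "-c", "-emit-llvm", "-S",
                 "-Xclang", "-disable-O0-optnone"]
    opt_cmd = ["opt", ir_file, opt_level, "-print-after-all"]
    return clang_cmd, opt_cmd


def dump_ir_after_each_pass(source, log_file="passes.log"):
    """Run both steps and save the per-pass IR dumps (hypothetical helper)."""
    clang_cmd, opt_cmd = build_commands(source)
    subprocess.run(clang_cmd, check=True)
    # Assumption: the dumps are printed on stderr; the optimized module that
    # opt writes to stdout is discarded here.
    result = subprocess.run(opt_cmd, check=True, stdout=subprocess.DEVNULL,
                            stderr=subprocess.PIPE, text=True)
    with open(log_file, "w") as f:
        f.write(result.stderr)


# Only print the commands here; call dump_ir_after_each_pass("test.c")
# on a machine that has clang and opt installed.
for command in build_commands("test.c"):
    print(" ".join(command))
```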

Final comment: After step 1), it’s useful to pass your IR through opt with the option: -metarenamer or -instnamer. This message has already become way too big so let me not explain now why it’s useful, but trust me, it is.

Best,
Stefanos

On Tue, Jan 26, 2021 at 4:18 AM, Sudakshina Dutta <sudakshina@iitgoa.ac.in> wrote:

These debug options are available from clang when prefixed with -mllvm (so -mllvm --print-after-all here).

Right, I missed that but…

In general, I think it’s quite important to understand sort of how it all comes down to opt, otherwise it seems like “magic” and if something goes wrong, well tough luck.

It’s even more important in this topic given that quite often the -mllvm variant doesn’t work. I mean not just random things I have tried, but even things that other people have tried successfully and recommend.

In any case, thanks for pointing it out.

Kind regards,
Stefanos

On Tue, Jan 26, 2021 at 6:32 AM, Mehdi AMINI <joker.eph@gmail.com> wrote:

I think this is sufficiently close to being true that it ends up being very misleading. I've seen a lot of posts on the mailing lists from people who have a mental model of LLVM like this.

The opt tool is a thin wrapper around the LLVM pass pipeline infrastructure. Most of the command-line flags for opt are not specific to opt, they are exposed by LLVM libraries. Opt passes all of its arguments to LLVM, clang passes only the ones prefixed with -mllvm, but they are both handled by the same logic.

Opt has some default pipelines with names such as -O1 and -O3 but these are *not* the same as the pipelines of the same names in clang (or other compilers that use LLVM). This is a common source of confusion from people wondering why clang and opt give different output at -O2 (for example).

The opt tool is primarily intended for unit testing. It is a convenient way of running a single pass or sequence of passes (which is also useful for producing reduced test cases when a long pass pipeline generates a miscompile). Almost none of the logic, including most of the command-line handling, is actually present in opt.

David

Hi David,

Sure I agree in part but “very misleading” is a strong statement and I think we might want to see a different perspective. People present this mental model in conferences [1]
“If you want to optimize, you use opt, which takes LLVM IR and generates LLVM IR”. We could say that this is misleading too, as it seems that opt does the optimization, but is it really?
This is an “Introduction to LLVM”; going into specifics at this point would probably confuse people a lot more than it would help them.

Same here [2]. Chandler was using clang -O2, then he used opt -O2 mentioning it as the “default” pipeline. Wow, that should be super misleading, but again is it really? Would it help
if Chandler stopped and said “Oh by the way… Let me digress here and explain a thing about libraries and opt etc.”

With the same logic, when I said “target-independent” optimizations, that was misleading too.

But my message was like, 1 page long already and I think spending another 1 page (or more) to explain such things would not help. And I think this is similar for people in conferences and llvm posts.

Anyway, it’s obvious that your message came with good intentions and I appreciate that. But I think it should be mentioned that for any beginner trying to understand the beast called LLVM,
it’s not super important to know it now, at least IMHO. But it’s good to be mentioned that it’s not the grand truth so that they come back to it when they’re more comfortable with the “approximation”.

Best,
Stefanos

[1] https://youtu.be/J5xExRGaIIY?t=429
[2] https://youtu.be/s4wnuiCwTGU?t=433

On Tue, Jan 26, 2021 at 11:47 AM, David Chisnall via llvm-dev <llvm-dev@lists.llvm.org> wrote:

Dear Stefanos,

Thanks for all your replies. I have one simple question to ask. I am new to LLVM. I know that the loop to be optimized has to be enclosed between #pragma scop and #pragma endscop. I have seen the LLVM log. It lists the attempted optimizations. My question is: does the log specify the loop that is optimized? More precisely, if there are multiple loops, does the log specify which loop has been optimized? I request you to kindly answer the above question or give me pointers to the answers.

Thanks.
Sudakshina

Hi Sudakshina,

the loop to be optimized has to be enclosed between #pragma scop and #pragma endscop

No, it doesn’t :) So, ok. There is a way to do loop optimizations called polyhedral optimization, which is based on a mathematical framework.
Why do I mention this? Because in polyhedral optimization there is the term SCoP, which basically denotes the parts of the code where polyhedral optimization shines.
The fact that you mentioned this pragma implies that you’re probably trying to do polyhedral optimization.

Now, LLVM has Polly, an infrastructure based on polyhedral optimization. Unfortunately, I’m not familiar with Polly, so if you really want to use it,
I could CC people who are more familiar with it.

However, you don’t have to use Polly to do loop optimizations. In fact, Polly is not enabled by default in LLVM (i.e., when you do -O3, Polly doesn’t run)
Classic loop optimizations implemented in LLVM, like loop-unrolling, loop-invariant code motion and all that are not based on polyhedral optimization.

So, it depends on what you’re trying to do. And what log file are you using… A couple of ways to obtain logs were mentioned, it would
be good to mention what you’re using.

Best,
Stefanos

On Fri, Jan 29, 2021 at 2:18 PM, Sudakshina Dutta <sudakshina@iitgoa.ac.in> wrote:

#pragma scop/endscop are used by polyhedral source-to-source
optimizers such as ppcg[1] and OpenScop[2]. LLVM's polyhedral
optimizer Polly does not use them.

Polly has several report options available, such as
`-Rpass-analysis=polly-scops`, `-mllvm -polly-report`, `-mllvm
-polly-show`.

[1] https://repo.or.cz/ppcg.git
[2] http://icps.u-strasbg.fr/~bastoul/development/openscop/index.html

Michael

Dear Stefanos,

I want to know whether after loop optimization, llvm indicates which loop is optimized and what all optimizations have been applied successfully to the individual loop which is optimized. Suppose, the loop L1 is optimized and the optimizations t1, t2, t3 have been attempted and are actually applied also. Again, the loop L2 is optimized and the optimizations t1, t2, t3 have been attempted and only t1 is finally applied. Does LLVM output the following ?

Loop L1 : optimizations t1, t2, t3
Loop L2 : optimization t1

Thanks,
Sudakshina

Well, I think the easiest thing to do is something like this: https://godbolt.org/z/xYWc7e
You basically write a simple regex that says: show me anything reported by passes whose names match loop or licm (loop-invariant code motion).

If you worry that you may have missed a pass you want with this regex, you can look either at the pipeline: https://godbolt.org/z/dvWchh (note that the regex should match argument names, not actual names, e.g., -licm vs Loop-Invariant Code Motion)
Or you can take a look at the list of LLVM passes: https://llvm.org/docs/Passes.html
Most loop passes have “loop” in their argument names, so you should be mostly ok.
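
To check which passes a regex like that would select, you can try it against a list of pass argument names first. The sketch below is only an illustration: the list is a small hand-picked subset, the helper name `remark_flags` is made up, and it assumes the -Rpass family matches the pattern anywhere in the pass name, the way `re.search` does. I am also assuming the three -R flags; the Godbolt link may use a different combination.

```python
import re


def remark_flags(pattern):
    """Build the -Rpass family of clang flags for one regex."""
    return [
        f"-Rpass={pattern}",
        f"-Rpass-missed={pattern}",
        f"-Rpass-analysis={pattern}",
    ]


# Hand-picked subset of pass argument names, for illustration only.
PASS_ARGS = ["licm", "loop-unroll", "loop-vectorize", "loop-rotate",
             "indvars", "sroa", "inline"]

pattern = "loop|licm"
matched = [name for name in PASS_ARGS if re.search(pattern, name)]
print(matched)
print(" ".join(["clang", "-O2", "-c", "test.c"] + remark_flags(pattern)))
```

Note that indvars works on loops but is not matched by this pattern, which is the kind of pass you could miss.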

For larger codes, or for easier parsing, you can just use -fsave-optimization-record, like: clang test.c -c -O2 -fsave-optimization-record
It will basically show you the same things but in a more structured format and it will output a .yaml file.
You can then sort out the passes you don’t want either yourself or by using: https://clang.llvm.org/docs/UsersManual.html#cmdoption-foptimization-record-passes
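
To get something close to the “Loop L1 : optimizations t1, t2, t3” listing you asked about, you could group the remarks in that .yaml file by source location. This is a rough sketch, not a full YAML parser: it assumes the record layout described in the LLVM remarks documentation (a `--- !Passed` / `!Missed` / `!Analysis` header followed by top-level `Pass:` and `DebugLoc:` fields), and the sample records and remark names are made up. Remarks without a top-level DebugLoc are skipped.

```python
import re
from collections import defaultdict

KIND_RE = re.compile(r"^--- !(\w+)")
PASS_RE = re.compile(r"^Pass:\s+(\S+)")
# Only top-level DebugLoc lines; nested ones inside Args are indented.
LOC_RE = re.compile(r"^DebugLoc:\s+\{ File: (.+?), Line: (\d+), Column: (\d+) \}")


def passes_by_location(yaml_text, kind="Passed"):
    """Map (file, line) to the sorted passes that emitted a remark of `kind`."""
    grouped = defaultdict(set)
    current_kind = None
    current_pass = None
    for line in yaml_text.splitlines():
        kind_match = KIND_RE.match(line)
        pass_match = PASS_RE.match(line)
        loc_match = LOC_RE.match(line)
        if kind_match:
            current_kind, current_pass = kind_match.group(1), None
        elif pass_match:
            current_pass = pass_match.group(1)
        elif loc_match and current_kind == kind and current_pass:
            source_file = loc_match.group(1).strip("'\"")
            grouped[(source_file, int(loc_match.group(2)))].add(current_pass)
    return {loc: sorted(names) for loc, names in grouped.items()}


# Made-up sample records in the assumed layout.
SAMPLE = """\
--- !Passed
Pass: loop-unroll
Name: FullyUnrolled
DebugLoc: { File: test.c, Line: 4, Column: 3 }
Function: foo
...
--- !Passed
Pass: licm
Name: Hoisted
DebugLoc: { File: test.c, Line: 4, Column: 3 }
Function: foo
...
--- !Missed
Pass: loop-vectorize
Name: MissedDetails
DebugLoc: { File: test.c, Line: 9, Column: 3 }
Function: foo
...
"""

for (source_file, line_no), names in passes_by_location(SAMPLE).items():
    print(f"{source_file}:{line_no} : {', '.join(names)}")
```

Calling `passes_by_location(text, kind="Missed")` gives the same grouping for the transformations that were attempted but not applied.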

If you’re not satisfied with any of that, well, the next step I think is -print-after-all (I mentioned above what it is and how to use it) because e.g., some passes may not create reports.
Ideally, you would like something that shows you only the passes that changed the IR, and there is supposed to be such a thing, -print-changed, but for some reason it
doesn’t work.

Please try those options and tell me whether they help.

Best,
Stefanos

On Fri, Jan 29, 2021 at 7:34 PM, Sudakshina Dutta <sudakshina@iitgoa.ac.in> wrote: