I was using "clang -O3 -S -emit-llvm" and got some very optimized output.
Then I ran "clang -S -emit-llvm" (without optimization) and wanted to optimize the code in a
separate pass. The LLVM program "opt" did not do anything.
How can I invoke the optimizer on an unoptimized program, ideally showing the output of each optimizer stage?
I would like to get a deeper understanding of the optimization passes.
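One way to try this, as a rough sketch rather than a definitive recipe: "opt" runs only the passes it is asked to run, so invoking it without a pass or pipeline option leaves the IR unchanged. The file names foo.c, foo.ll, foo.opt.ll and passes.log are made up for illustration. The -passes='default<O3>' spelling assumes a reasonably recent LLVM; older releases spell it "opt -O3". The -disable-O0-optnone flag is only needed with clang versions that tag -O0 functions as optnone.

```shell
# Hypothetical input file, created here so the example is self-contained.
cat > foo.c <<'EOF'
int square(int x) { return x * x; }
int sum_sq(int n) {
  int s = 0;
  for (int i = 0; i < n; ++i) s += square(i);
  return s;
}
EOF

# Emit unoptimized IR. Recent clang marks -O0 functions 'optnone', which makes
# opt skip them; -disable-O0-optnone suppresses that attribute.
clang -O0 -Xclang -disable-O0-optnone -S -emit-llvm foo.c -o foo.ll

# Run the O3 pipeline on the IR. -print-after-all dumps the IR after every
# pass to stderr, which is redirected to a log file here.
opt -passes='default<O3>' -print-after-all -S foo.ll -o foo.opt.ll 2> passes.log
```

Comparing foo.ll with foo.opt.ll shows the overall effect, and passes.log shows which pass made which change. A single pass can be tried in isolation by naming it instead of the whole pipeline, for example -passes=instcombine.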
In addition to Sam's advice, I want to point out that clang's IR generator doesn't necessarily emit the same code when compiling with optimization as it does when compiling without. Most obviously, we never emit available_externally function definitions at -O0 because we assume that would just waste compile time.
OK, I am looking for "LTO"/global optimization. So the function definition will remain "somewhere else" (externally), and the optimizer will find it in some other module and possibly inline it later on.
I am planning to concatenate all the *.ll files (i.e. "link" them) and pass them to the "global" optimizer, as size is a very important optimization criterion for me. After that, the back end will be invoked.
Is that a good approach?
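A rough sketch of that workflow, under a few assumptions: the file names a.c, b.c, whole.ll and so on are hypothetical, the tools come from a reasonably recent LLVM, and the target has an llc backend. Textually concatenating .ll files usually does not parse (duplicate declarations, clashing attribute groups and metadata numbers), so the sketch uses llvm-link to merge the modules. It also asks clang for -Oz IR with the LLVM passes disabled, so the IR carries size-related function attributes but is still unoptimized.

```shell
# Two hypothetical translation units, created here so the example is self-contained.
cat > a.c <<'EOF'
int helper(int x) { return x + 1; }
EOF
cat > b.c <<'EOF'
int helper(int x);
int main(void) { return helper(41) == 42 ? 0 : 1; }
EOF

# Generate IR as if compiling at -Oz, but skip the LLVM optimization passes.
clang -Oz -Xclang -disable-llvm-passes -S -emit-llvm a.c -o a.ll
clang -Oz -Xclang -disable-llvm-passes -S -emit-llvm b.c -o b.ll

# Merge the modules into one; llvm-link resolves declarations against definitions.
llvm-link -S a.ll b.ll -o whole.ll

# Optimize the merged module for size; helper() is now visible to the inliner.
opt -passes='default<Oz>' -S whole.ll -o whole.opt.ll

# Hand the optimized module to the back end.
llc -filetype=obj whole.opt.ll -o whole.o
```

Because every function in the merged module keeps external linkage, the optimizer cannot delete helper() even after inlining it. Running an internalize step first may help when size matters, but which symbols to preserve depends on the program.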
How about just adding -O4 to the clang (or llvm-gcc) command line to get LTO optimizations?
My backend doesn't provide LTO, so I am trying to emulate LTO like this:
The backend (or code generator) does not do LTO in this case either.
Roger, but the effect is equivalent to LTO, isn't it?
As I understand it, LTO-enabled GCC does the same (it saves both compiled code and IR), and the linker optimizes over that IR (if present) and then invokes its backend.
I was playing with GCC's LTO on Linux last year, but it didn't work very well for me at that time. So when I saw LLVM, that is what came to mind first: emulated LTO thanks to the IR!