Question about -flto behavior

snidertm · October 18, 2021, 8:56pm

Hi All,

When -flto is combined with -S on the clang command line, the output .s file contains IR content instead of target assembly language.

Is this expected/correct behavior? I was anticipating that the output .s file would contain target assembly code.

~ Todd Snider

dblaikie · October 18, 2021, 9:20pm

Yeah, sounds expected to me. -flto produces object files that aren’t really object files: they’re LLVM IR (bitcode) that the linker identifies. The linker then calls back into LLVM to link the IR, optimize that IR, and produce object code, assembly, or whatever is requested.

So the “assembly” form of an “object” (really LLVM bitcode) file is LLVM textual IR.
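To illustrate how a tool can tell that an -flto "object" is really bitcode, here is a minimal Python sketch that checks the first four bytes of a file. The function name `classify` and the returned labels are hypothetical; the magic numbers are the raw LLVM bitcode signature (`BC` followed by 0xC0DE), the bitcode wrapper header (0x0B17C0DE, little-endian), and the ELF signature. It is only a sketch of the idea, not how any particular linker implements detection.

```python
# Magic numbers at the start of each file type.
BITCODE_MAGIC = b"BC\xc0\xde"        # raw LLVM bitcode: 'B' 'C' 0xC0 0xDE
WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"  # bitcode wrapper header (0x0B17C0DE, little-endian)
ELF_MAGIC = b"\x7fELF"               # native ELF object or executable


def classify(path: str) -> str:
    """Return a rough label for a file based on its first four bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head in (BITCODE_MAGIC, WRAPPER_MAGIC):
        return "llvm-bitcode"
    if head == ELF_MAGIC:
        return "elf-object"
    # Textual IR (what -flto -S writes) and real assembly both land here,
    # since neither starts with a binary magic number.
    return "other"
```

Running `classify` on the output of an -flto -c compile would be expected to report bitcode, while the same source compiled without -flto on an ELF target would report an ELF object.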

snidertm · October 18, 2021, 10:55pm

David,

Thanks for the reply. That makes sense.

A couple of further thoughts … In the LTO implementation that I am working on, when -flto is specified to the compiler, the compiler embeds the IR in the compiler-generated object file. The linker can then read the IR out of the incoming object file if LTO is enabled at link time, or just ignore the IR if LTO is disabled at link time.

I would agree that having -S write out the IR content for -flto provides a good way to see what is being fed into the LTO link in a human-readable form.

For our LTO implementation, the linker can be told to keep the IR that it extracts from the incoming object files. You can then run llvm-dis over the extracted IR to see the .ll version.

~ Todd
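The embedded-IR scheme described in this post can be modeled with a small toy example. This is a hedged sketch only: `FatObject` and `link_input` are hypothetical names, and the model says nothing about the real object file layout or section names used by the implementation being discussed. It just shows the decision the linker makes per object.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FatObject:
    """Hypothetical object file: native code plus optionally embedded IR."""
    name: str
    machine_code: bytes
    embedded_ir: Optional[bytes] = None


def link_input(obj: FatObject, lto_enabled: bool) -> Tuple[str, bytes]:
    """Pick the payload a linker would consume for this object.

    With LTO enabled, the embedded IR is used when present; otherwise
    the IR is ignored and the pre-generated machine code is linked.
    """
    if lto_enabled and obj.embedded_ir is not None:
        return ("ir", obj.embedded_ir)
    return ("native", obj.machine_code)
```

An object compiled without -flto has no embedded IR in this model, so it is always linked as native code regardless of the link-time setting.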

dblaikie · October 19, 2021, 1:54am

Fair enough - in that case, I guess you might want your compiler to generate both? It could generate x.ll for the LLVM IR and x.s for the machine assembly.

mehdi_amini · October 19, 2021, 2:53am

It seems to me that such a compiler should generate an assembly file that contains the assembly as well as the IR blob in a single file (that can be piped into the assembler)?

dblaikie · October 19, 2021, 4:59am

Sure, possibly, if you’ve got that technology or decide to write out the IR as assembler directives in some form.

teresajohnson · October 26, 2021, 11:03pm

I guess this approach (embedding the IR in a regular object file) gives flexibility to the build, at the cost of extra compile time spent going through the whole optimization pipeline, including code generation, potentially unnecessarily. Normally in LLVM, an -flto -S or -c compile runs only part of the optimization pipeline and does not go through code generation.

Note that you can emit the final machine code when building with -flto and linking with lld via -Wl,--lto-emit-asm, in which case the a.out or specified output file will contain assembly instead of ELF.

Teresa
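The command lines discussed in this thread can be collected in one place. The sketch below only builds and prints the argument lists; it does not run clang. The helper name `lto_commands`, the default source name `x.c`, and the output file names are hypothetical, and `-fuse-ld=lld` is assumed as the way to select lld from the clang driver.

```python
import shlex


def lto_commands(source: str = "x.c", clang: str = "clang") -> dict:
    """Build the clang invocations discussed in the thread (not executed)."""
    stem = source.rsplit(".", 1)[0]
    return {
        # -flto -S: the .s file holds textual LLVM IR, not target assembly.
        "textual_ir": [clang, "-flto", "-S", source, "-o", stem + ".s"],
        # -flto -c: the .o file holds LLVM bitcode, not native object code.
        "bitcode": [clang, "-flto", "-c", source, "-o", stem + ".o"],
        # LTO link through lld, asking it to write assembly as the output.
        "final_asm": [clang, "-flto", "-fuse-ld=lld", "-Wl,--lto-emit-asm",
                      stem + ".o", "-o", stem + ".lto.s"],
    }


for name, argv in lto_commands().items():
    print(name + ":", shlex.join(argv))
```

The third command is the one that yields real target assembly, because code generation for -flto inputs only happens at link time.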

snidertm · October 26, 2021, 11:51pm

Yes, if the end user application is building with LTO, then we are going through code generation unnecessarily during -flto compiles.

However, one of the motivations for embedding the IR in object code is so that the libc and other runtime libraries can be pre-built and shipped with embedded bitcode IR. The runtime libraries can then be linked in whether the end user application chooses to build with LTO or not.

I’m speculating that this tradeoff becomes less cost-effective as the user application gets very big, but it is probably a reasonable tradeoff for embedded applications.

~ Todd

pogo59 · October 27, 2021, 1:48pm

It would be entirely possible to deliver library binaries that are IR-only. In that case, if the application build doesn’t use LTO, the linker will apply LTO only to the library, and then the link will proceed normally. Or, if the application does use LTO, the LTO process will incorporate the library code as well. You can mix and match objects and IR in a link.

–paulr
