RFC - Adding assembly output to LLD "save-temps" with LTO enabled

Hello!

I would like to add support in LLD for saving textual assembly (.s) output when “-save-temps” is passed and LTO is used. Currently, only temporaries up to the pre-codegen bitcode are saved.

As LLVM directly emits object files, we will need to use the same trick Clang uses, that is, run CodeGen twice: once for the real output and once for the .s temporary. As this is a debugging feature, the extra cost shouldn’t be an issue.

Now, I’m unsure how to integrate this properly with the existing code. I would need to add new passes to emit the file in LTOBackend.cpp’s codegen function. How should this be integrated into the API? I see that the current LTO backend doesn’t know about save-temps; instead, it is passed a set of hooks that do the job.

My initial idea was to add a new hook that would take the PM + TargetMachine + Module (something like an “EmitFileHook”), or to add those parameters to the existing pre-codegen hook, and call addPassesToEmitFile there. However, I’m not sure it’s very clean to call addPassesToEmitFile twice on the same pass manager. There are also issues with the lifetime of the output stream, so overall this doesn’t feel like a good implementation.

I think it should be more tightly integrated with the backend logic, like, say, AlwaysEmitRegularLTOObj. We could have an AlwaysEmitAssemblyOutput option and handle it in the codegen function directly. It’s a bit less modular but feels cleaner. What do you think?
(Draft/WIP: D138560 [WIP] Add assembly output to LTO save-temps)
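
To make the flag-based option concrete, here is a minimal, self-contained sketch of the control flow only. It does not use the LLVM API: Config, Module, Output, FileType and runCodegenOnce are stand-in types and helpers invented for illustration, and AlwaysEmitAssemblyOutput is the proposed option, not an existing one. The point is that each output gets its own codegen run, so nothing has to call addPassesToEmitFile twice on one pass manager.

```cpp
#include <string>
#include <vector>

// Stand-in for LLVM's codegen file type; not the real enum.
enum class FileType { Object, Assembly };

// Hypothetical, trimmed-down stand-in for the LTO configuration.
struct Config {
  // Proposed flag from this RFC; assumed to default to off.
  bool AlwaysEmitAssemblyOutput = false;
};

// Stand-in for an IR module; only the name matters in this sketch.
struct Module {
  std::string Name;
};

// One emitted file: its name, its kind, and its (fake) contents.
struct Output {
  std::string Name;
  FileType Type;
  std::string Data;
};

// Stand-in for a single codegen run. In a real backend this would build
// a fresh pass manager and emit the requested file type into a stream.
std::string runCodegenOnce(const Module &M, FileType Type) {
  return (Type == FileType::Assembly ? "asm:" : "obj:") + M.Name;
}

// Sketch of a codegen function that handles the flag directly: an extra
// assembly run when the flag is set, then the regular object run.
std::vector<Output> codegen(const Config &Conf, const Module &M) {
  std::vector<Output> Outputs;
  if (Conf.AlwaysEmitAssemblyOutput)
    Outputs.push_back({M.Name + ".s", FileType::Assembly,
                       runCodegenOnce(M, FileType::Assembly)});
  Outputs.push_back({M.Name + ".o", FileType::Object,
                     runCodegenOnce(M, FileType::Object)});
  return Outputs;
}
```

With the flag off, codegen returns only the object output; with it on, the .s temporary comes first, followed by the object. Whether a real implementation could share any work between the two runs is left open here.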

Any suggestions are very much welcome.

Thanks

Does --lto-emit-asm work for you?

It does not work in this case. For a more specific use case: LLD is invoked by Clang, and Clang is passed -save-temps. When LTO is not used, Clang emits a .s file alongside the other temporaries. When LTO is enabled, it doesn’t, and that’s what we’re looking to add.

D138560 [WIP] Add assembly output to LTO save-temps has been cleaned up, and I added a few additional reviewers. Unless there are architectural issues with the implementation, I suppose it’s better discussed in the review than here.