[ThinLTO] Dealing with injected symbols

Came across this issue when linking bitcode object files generated by rustc.

The object files carry the following function attribute: attributes #0 = { ... "probe-stack"="__rust_probestack" ...}. __rust_probestack is a stack-probing function, and the reference to it is only generated during IR lowering in the backend, see https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86FrameLowering.cpp#L1136C43-L1136C43

lld does not see the symbol during the thin link. After LTO, the symbol shows up in the native object and needs to be resolved, but the object file that defines it was not picked by LTO.

In general, any symbol that is not visible in the bitcode but shows up in the native object can trigger this issue. There are several ways to work around this from the user side, such as forcing the symbol to be undefined ("-u __rust_probestack") or moving the object file that defines the symbol out of the lazy/thin archive. But I wonder if there is a way to address this from the compiler side?

> lld does not see the symbol during thin-link but after LTO, the symbol shows up in the native object and needs to be resolved, but the object file that defines it is not picked by LTO.

Slight correction: it is actually lld that does the object file picking (LTO only looks at the bitcode object files lld picks).

This is an expected issue for symbols that are not present during the LTO link, since there is simply no way for the linker to know about them. The solution is either to use -u as you mentioned, or to somehow insert the symbol earlier so that it is already present in the bitcode. The latter approach was used for a symbol needed for CSPGO instrumentation, which would otherwise only be inserted in the LTO backends. See https://github.com/llvm/llvm-project/blob/985362e2f38f0e1d70a3067851ae072ac11ffb33/llvm/include/llvm/Transforms/Instrumentation/PGOInstrumentation.h#L34-L45.
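
To make the failure mode and the two fixes concrete, here is a toy model of lazy archive member selection. It is only a sketch and not lld's actual algorithm: the function names (pick_members, unresolved_after_lto), the member names (probestack.o, util.o) and the helper symbol are all made up for illustration. The only assumption it encodes is the one discussed above, namely that members are picked based on the undefined symbols visible at link time.

```python
# Toy model of lazy archive member selection. NOT lld's real algorithm;
# all names below are hypothetical and exist only for illustration.

def pick_members(visible_undefined, archive, forced_undefined=()):
    """Return the archive members a linker would extract.

    archive maps member name -> (defined symbols, undefined symbols).
    forced_undefined models symbols passed with -u.
    """
    wanted = set(visible_undefined) | set(forced_undefined)
    defined = set()
    picked = []
    changed = True
    while changed:
        changed = False
        for name, (defs, undefs) in archive.items():
            if name in picked:
                continue
            # Extract a member only if it defines a still-undefined symbol.
            if (wanted - defined) & set(defs):
                picked.append(name)
                defined |= set(defs)
                wanted |= set(undefs)
                changed = True
    return picked


def unresolved_after_lto(picked, archive, native_undefined):
    """Symbols the native objects need that no picked member defines."""
    defined = set()
    for name in picked:
        defined |= set(archive[name][0])
    return sorted(set(native_undefined) - defined)


archive = {
    "probestack.o": ({"__rust_probestack"}, set()),
    "util.o": ({"helper"}, set()),
}
# What the linker sees in the bitcode symbol table at thin-link time.
bitcode_undefined = {"helper"}
# What the native object references after codegen injects the probe call.
native_undefined = {"helper", "__rust_probestack"}

# 1. Plain link: the injected symbol is invisible, its definer is skipped.
plain = pick_members(bitcode_undefined, archive)
plain_missing = unresolved_after_lto(plain, archive, native_undefined)

# 2. Workaround: force the symbol undefined, as with -u.
forced = pick_members(bitcode_undefined, archive,
                      forced_undefined={"__rust_probestack"})
forced_missing = unresolved_after_lto(forced, archive, native_undefined)

# 3. Compiler-side fix: the symbol is declared early, so it is already
#    part of the bitcode symbol table.
early = pick_members(bitcode_undefined | {"__rust_probestack"}, archive)
early_missing = unresolved_after_lto(early, archive, native_undefined)

print(plain, plain_missing)
print(forced, forced_missing)
print(early, early_missing)
```

In this model the plain link leaves __rust_probestack unresolved, while both -u and the early declaration cause probestack.o to be extracted. The two fixes are equivalent from the linker's point of view; they differ only in who has to remember to apply them.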

You are right. I think it would be hard to require everyone who creates a new symbol to do so.