Background
LLVM LTO constructs a symbol table for an LLVM IR file by scanning its contents. This includes scanning module-level inline assembly for referenced symbols and symbol versions, which requires parsing the inline assembly.
The target CPU and feature set are both inputs to the asm parse, and these inputs can arbitrarily influence asm parsing. Unfortunately, the IR symbol table is produced from an IR module in a wide variety of contexts, only some of which have access to the actual CPU and target feature set. For this reason, the ModuleSymbolTable summarily constructs an assembly parser using the empty string as both CPU and feature set.
This results in llvm/llvm-project issue #67698 on GitHub, "LTO scan of module-level inline assembly does not respect CPU", which produces spurious error messages whenever features not in the base target are used in module-level inline assembly during LTO. Since the parse fails, any symbols referenced by that assembly are also missing from the module's symbol table.
Proposal
I looked into fixing this today by trying to extract CPU and feature information from the environment, but it seems to be a broadly expected property of the architecture that a module has a symbol table that can be extracted with little supporting information. That seems like a very desirable property to maintain.
Accordingly, it seems like the most straightforward fix for this would be to bundle the information necessary to produce a symbol table into the LLVM IR somehow. The smallest version of this would be just the CPU name and a feature string, so I’d propose these as something to accompany module-level inline asm at the top level.
This, of course, opens up a whole can of worms, since the CPU and feature strings could conflict, either between linked modules or between a module and the flags of the link step. It also doesn’t seem desirable to build something much more heavyweight than the value gained; but the status quo here does seem quite surprising.
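To make the conflict question concrete, here is a rough sketch of one possible merge policy for two modules' CPU and feature strings. Everything in it is an assumption for illustration: AsmTargetInfo and mergeAsmTargetInfo are hypothetical names, not existing LLVM types or functions, and the policy itself (differing non-empty CPUs conflict, a feature enabled in one module and disabled in the other conflicts, everything else is unioned) is just one choice among several.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical per-module record; not an existing LLVM type.
struct AsmTargetInfo {
  std::string cpu;      // e.g. "skylake"; empty means "unspecified"
  std::string features; // e.g. "+avx2,-sse4a"
};

// Parse "+a,-b" into {a: true, b: false}.
// Entries without a leading '+' or '-' are skipped.
static std::map<std::string, bool> parseFeatures(const std::string &s) {
  std::map<std::string, bool> out;
  std::stringstream ss(s);
  std::string item;
  while (std::getline(ss, item, ',')) {
    if (item.size() < 2 || (item[0] != '+' && item[0] != '-'))
      continue;
    out[item.substr(1)] = (item[0] == '+');
  }
  return out;
}

// Assumed policy: returns false on conflict, otherwise fills `merged`.
bool mergeAsmTargetInfo(const AsmTargetInfo &a, const AsmTargetInfo &b,
                        AsmTargetInfo &merged) {
  // Two different, non-empty CPU names cannot be reconciled.
  if (!a.cpu.empty() && !b.cpu.empty() && a.cpu != b.cpu)
    return false;

  std::map<std::string, bool> feats = parseFeatures(a.features);
  const std::map<std::string, bool> other = parseFeatures(b.features);
  for (const auto &kv : other) {
    auto it = feats.find(kv.first);
    // The same feature enabled in one module and disabled in the other.
    if (it != feats.end() && it->second != kv.second)
      return false;
    feats[kv.first] = kv.second;
  }

  merged.cpu = a.cpu.empty() ? b.cpu : a.cpu;
  merged.features.clear();
  // Re-serialize in sorted (map) order so the result is deterministic.
  for (const auto &kv : feats) {
    if (!merged.features.empty())
      merged.features += ',';
    merged.features += (kv.second ? '+' : '-');
    merged.features += kv.first;
  }
  return true;
}
```

Under this policy, merging a module with no CPU and "+avx2" into one with CPU "skylake" and "+bmi2" succeeds, while "+avx2" against "-avx2" is reported as a conflict. A real design would also have to decide how the flags of the link step interact with whatever is recorded in the modules.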
Another, hackier, idea would be to have clang etc. encode this information into the inline assembly itself using something like a .llvm-cpu or .llvm-features directive. This would be more limited in scope and much easier to merge (just string concatenation on the inline asm), but I haven’t thought much about this one yet; just something that occurred to me as I was typing this up.
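As a minimal sketch of what the directive idea could look like, the snippet below prepends the two directives to a module's inline asm string and recovers them with a plain line scan. The directive syntax (name, a space, then the raw value up to end of line) is an assumption, and EmbeddedTarget, prependTargetDirectives and scanTargetDirectives are hypothetical helpers, not existing clang or LLVM functions.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical result of scanning inline asm for the proposed directives.
struct EmbeddedTarget {
  std::string cpu;
  std::string features;
};

// Producer side: plain string concatenation onto the module-level inline asm.
std::string prependTargetDirectives(const std::string &inlineAsm,
                                    const std::string &cpu,
                                    const std::string &features) {
  std::string out;
  if (!cpu.empty())
    out += ".llvm-cpu " + cpu + "\n";
  if (!features.empty())
    out += ".llvm-features " + features + "\n";
  return out + inlineAsm;
}

// Consumer side: find the directives without parsing the rest of the asm.
// If a directive appears more than once, the last occurrence wins here;
// that is a simplification, not a proposed rule.
EmbeddedTarget scanTargetDirectives(const std::string &inlineAsm) {
  EmbeddedTarget info;
  const std::string cpuKey = ".llvm-cpu ";
  const std::string featKey = ".llvm-features ";
  std::istringstream in(inlineAsm);
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, cpuKey.size(), cpuKey) == 0)
      info.cpu = line.substr(cpuKey.size());
    else if (line.compare(0, featKey.size(), featKey) == 0)
      info.features = line.substr(featKey.size());
  }
  return info;
}
```

The point of the scan is that the symbol table code could learn the CPU and features from the asm text itself before constructing the assembly parser. Since inline asm from linked modules is concatenated, the directives would presumably need to act like scoped state changes within the combined string, which this sketch does not model.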
Anyway, I just wanted to put this out to see 1) if this issue was known and/or discussed somewhere I couldn’t find, and 2) if it seems like this semantic wrinkle is worth ironing out, given the options available.