We have been working on implementing ELF’s
.addrsig format – originally outlined by @pcc here – for the MC Mach-O backend. However, we’ve run into some issues getting it to work properly. The Mach-O streamer seems to be structured quite differently from the ELF one, and our experience working with MC is limited, so I was hoping we could get some tips on how to handle this. I also have some suggestions for alternative formats of encoding the data, which might be simpler to implement.
First, a quick recap of the format: We have symbol table indices stored as as ULEB128-encoded values in an
__llvm_addrsig section. These indices tell the linker that those symbols should never be merged during code folding. Since we are using ULEB128, the size of this addrsig section depends on the actual index values, so we can’t determine its final size until we have assigned those indices.
The problem with Mach-O addrsig generation is that
MCAssembler::layout() runs before symbol table indices are assigned; this happens in
layout needs to know the size of the sections to work properly, leaving us in a bit of a circular bind.
I think MC’s ELF backend seems to sidestep this problem because it computes section offsets separately from MCAsmLayout, but the Mach-O backend doesn’t seem to do this. I might be missing something obvious though given my unfamiliarity with the code. Pointers will be highly appreciated!
Possible Alternative Encodings
I’ve been wondering if we could mark address-significant symbols within the symbol table itself, instead of using an auxiliary section. In particular, we could set a bit in the
nlist::n_desc field for this. Alternatively, we could maybe reuse the
REFERENCED_DYNAMICALLY bit and combine it with
N_PEXT to indicate symbols whose addresses are significant, but which can still be removed by
llvm-strip. Either way, this will probably need Apple’s stamp of approval. @cachemeifyoucan, do you have any thoughts on this?
Make sure I understand your problem correctly. The current implementation after ⚙ D123751 Implement support for __llvm_addrsig for MachO in llvm-mc is not entirely correct because the layout might not be correct because new data are added after we finish
Personally, I am also not an expert in MC layer. I would prefer a solution with
n_desc value but since we already used up all the fields there, we need to really think carefully about how we can add it without breaking the tools that doesn’t know about the new combination of bits. I will talk to our linker team to see if they have any thoughts there.
For a quick experiment. Maybe you can sort the symbol table differently by putting address significant symbols first? Then you just need to encode very few number for start and end index so you don’t need to use ULEB128?
The current implementation after D123751 Implement support for __llvm_addrsig for MachO in llvm-mc is not entirely correct because the layout might not be correct because new data are added after we finish
Close. Right now we aren’t setting the size of
__llvm_addrsig at all, so
layout thinks it has zero size, and we emit an invalid object file if there are any sections following it. (It seems that only DWARF sections are being emitted afterward, so the problem doesn’t crop up unless debug info is added.) ⚙ D127637 [MC] Ensure addrsig's section size is properly set was my attempt to calculate the section size, but then I realized that the point where we create the
addrsig section, all
sym->getIndex() returns is zero because the symbol indices haven’t been assigned. To sum up, we don’t have enough information to calculate the size correctly within
Maybe you can sort the symbol table differently by putting address significant symbols first? Then you just need to encode very few number for start and end index so you don’t need to use ULEB128?
Hmm, that’s an interesting idea. We would have to have separate ranges for local vs external symbols, of course, but it seems workable.
I will talk to our linker team to see if they have any thoughts there.
Awesome, would love to hear their thoughts!
Some quick feedback from our linker team:
- Using a symbol table bit is definitely a more reasonable solution than a ULEB encoded list
- Get a new bit is a quite tricky especially you are currently uses it as a “Don’t fold” bit. Because if you don’t read the bit from an old object file, that doesn’t mean you can fold the function. The reverse will make more sense but we still need to work out the design
ld64 currently only dedups weak functions and dedup is also very expensive. It would be interesting to find the balance between how many functions we check and code size optimized.
What we did for
.cg_profile which also used to store symbol table indices is to instead store
relocations to the function symbols: ⚙ D104080 [LLD][LLVM] CG Graph profile using relocations. This was because any tools that modified the symbol table would break these tables since they won’t be updated (and would solve it for
.addrsig as well) but also gets rid of the dependency problem here.
The cost is that the linker needs to be updated to read relocations instead of ULEB128 and
.addrsig size goes up by 3x. However, this section is only used by the linker and then discarded for the final binary so this isn’t a big deal.
However, this section is only used by the linker and then discarded for the final binary so this isn’t a big deal.
(FWIW, object size is a big deal for Google at least - the number of bytes of object files can be a scaling factor/problem for our larger programs - I imagine that was probably part of why the ULEB encoding was chosen for ELF/by Google. Would be sort of a pity if the MachO choice had a significantly different tradeoff, but if it’s necessary… )