As part of our work on LLD for Mach-O, we’ve observed that the object files produced by LLVM don’t always have aligned nlist_64 entries. For context, nlist_64 is the 64-bit symbol table entry structure for Mach-O, and the symbol table is an array of nlist_64 entries. nlist_64 has an 8-byte member (n_value), so it should be 8-byte aligned, but we’ve seen object files where the symbol table is only 4-byte aligned.
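For readers who don't have the layout in their head, here is a rough sketch of the entry and of the alignment check in question. The field names follow `<mach-o/nlist.h>`, but the struct name `NList64` and the helper `isSymtabAligned` are made up for illustration, and the check assumes the file itself is mapped at an address that is at least 8-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative mirror of the 64-bit Mach-O symbol table entry.
// NList64 is a hypothetical name; the fields follow <mach-o/nlist.h>.
struct NList64 {
  uint32_t n_strx;  // offset of the symbol name in the string table
  uint8_t n_type;   // type flags
  uint8_t n_sect;   // section number
  uint16_t n_desc;  // additional description bits
  uint64_t n_value; // symbol value; the 8-byte member
};

static_assert(sizeof(NList64) == 16, "each entry occupies 16 bytes");
static_assert(offsetof(NList64, n_value) == 8, "n_value sits at offset 8");

// Hypothetical helper: true if a symbol table starting at file offset
// `symoff` is 8-byte aligned. Assumes the mapping base is itself aligned.
constexpr bool isSymtabAligned(uint32_t symoff) { return symoff % 8 == 0; }
```

Because each entry is 16 bytes, a table that starts at an offset of 4 modulo 8 puts every n_value at an address that is only 4-byte aligned.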
I don't know if Mach-O mandates an alignment for the symbol table (I haven't found any such requirement in the documentation), but it would be convenient for MC to emit an 8-byte aligned symbol table. We parse an input file in LLD by memory mapping it and using casts to access the various structures. When the symbol table isn't 8-byte aligned, those casts result in unaligned accesses to the nlist_64 entries, which makes UBSan (rightly) unhappy. As far as I can see, ld64 also uses pointer arithmetic and casting to parse the symbol table, so it would run into the same issue. Does anyone see any problems with making MC always emit an 8-byte aligned symbol table for 64-bit Mach-O files, to avoid this issue?
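To make the proposal concrete: on the writer side, the change would amount to padding the symbol table's file offset up to the next multiple of 8. The following is only a sketch of that arithmetic, not MC's actual code; `alignUp` and `symtabPadding` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical helper: rounds `offset` up to the next multiple of `align`.
// `align` must be a power of two.
constexpr uint64_t alignUp(uint64_t offset, uint64_t align) {
  return (offset + align - 1) & ~(align - 1);
}

// Number of padding bytes a writer would emit before the symbol table so
// that it starts on an 8-byte boundary.
constexpr uint64_t symtabPadding(uint64_t offset) {
  return alignUp(offset, 8) - offset;
}
```

The cost is at most 7 bytes of padding per object file.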
Of course, we should still handle object files we see in the wild without the 8-byte alignment. From what I understand, the blessed way to handle a potentially unaligned access is with a memcpy, and the compiler should be able to optimize out the memcpy on architectures that support unaligned accesses. A coworker is implementing that approach in D80414 ("[lld-macho] Ensure reads from nlist_64 structs are aligned when necessary"), but before we go ahead with that, I wanted to confirm (a) how reliable the memcpy optimization is across all our supported compilers (in particular, I've seen MSVC struggle with this optimization in the past), because we really don't want to be paying the cost of an actual memcpy in this codepath, and (b) whether there are other good ways to handle this that we could explore.
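For reference, the memcpy approach looks roughly like the sketch below. This is not the code from D80414; `NList64`, `readUnaligned`, and `readSymbol` are hypothetical names used only to illustrate the technique.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Illustrative stand-in for nlist_64 (hypothetical name, real field names).
struct NList64 {
  uint32_t n_strx;
  uint8_t n_type;
  uint8_t n_sect;
  uint16_t n_desc;
  uint64_t n_value;
};

// Copies a T out of possibly unaligned memory. memcpy has no alignment
// requirement on its source, so this is well-defined where a cast plus
// dereference would not be.
template <typename T> T readUnaligned(const void *p) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy-based reads require a trivially copyable type");
  T out;
  std::memcpy(&out, p, sizeof(T));
  return out;
}

// Reads the i-th entry of a symbol table that starts at `symtab`, which may
// be only 4-byte aligned (or worse).
inline NList64 readSymbol(const uint8_t *symtab, size_t i) {
  return readUnaligned<NList64>(symtab + i * sizeof(NList64));
}
```

Whether the copy compiles down to plain loads depends on the compiler and target, which is exactly question (a) above.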