Using GOTPCREL relocations for personality and typeinfo eh_frame references?

eh_frame sections emitted by Clang use an indirect encoding for the personality pointer and typeinfo references (to avoid text relocations). I noticed a few potential issues with these references; see Compiler Explorer for an example:

  • The typeinfo pointers (.L_ZTIi.DW.stub in my example) are emitted directly under the .data section. As far as I understand, this means they won’t be deduplicated across compile units. I don’t think they need to be writable either.
  • The personality pointer (DW.ref.__gxx_personality_v0) is emitted in its own COMDAT section, so it’ll be deduplicated, but it’s still being emitted in the writable .data section.
  • On aarch64, the indirect references to these pointers are encoded as sdata8, because code and data can be up to 4 GB away even in the small code model.
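
For readers who want to reproduce the references discussed above, here is a minimal sketch of a translation unit that should trigger them. The function names are made up for illustration. The assumption is that, because the code catches an `int`, the exception tables need a reference to the typeinfo symbol `_ZTIi`, and the unwind information needs a reference to the C++ personality routine `__gxx_personality_v0`. The exact stubs and sections emitted depend on the target, flags, and compiler version.

```cpp
#include <cassert>

// Hypothetical helper: throws an int for any non-digit character.
int parse_digit(char c) {
    if (c < '0' || c > '9') {
        throw 1;  // throwing an int uses the typeinfo for int (_ZTIi)
    }
    int d = c - '0';
    assert(d >= 0 && d <= 9);  // sanity check on the computed digit
    return d;
}

// The try/catch gives this function an exception table whose catch clause
// refers to the typeinfo for int, and unwind info that names the
// personality routine. Those are the references discussed above.
int digit_or_minus_one(char c) {
    try {
        return parse_digit(c);
    } catch (int) {
        return -1;
    }
}
```

Compiling this to assembly with position-independent code enabled for an aarch64 target and searching the output for `DW.ref` and `DW.stub` should show the kind of stubs described in the list above.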

Locally, I’ve experimented with using GOTPCREL relocations for the personality and typeinfo references inside eh_frame, instead of creating our own stubs. For AArch64, the GOTPCREL relocation type was defined in ARM-software/abi-aa pull request #223, “[aaelf64] Define GOT-Relative data relocation” by PiJoules (to support the relative vtables work); x86-64 and RISC-V also support an equivalent relocation type, and I believe R_ARM_GOT_PREL serves the same purpose for 32-bit ARM. In all cases, the relocation causes a GOT entry to be created for the referenced symbol, and the relocated place is filled in with the 32-bit offset to the GOT entry, which is compatible with the DW_EH_PE_indirect encoding being used. I believe this solves all my issues:

  • We remove all duplication by having a single GOT entry.
  • The GOT is read-only after relocation, which removes a writable indirect pointer.
  • We can switch aarch64 to sdata4, since the offset to the GOT entry is 32-bit. I’m not actually 100% sure about this; the AArch64 SysV ABI says that the “definition of the text segment” (which is limited to 2 GiB in all code models) “includes the shareable PLT, code and read-only data sections”. The GOT is only read-only after relocations are applied, so I’m not completely sure it counts, but the GOTPCREL relocation relies on this, so I assume it’s okay.
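
To make the compatibility argument concrete, here is a toy model, not real linker or unwinder code. It assumes a GOTPCREL-style relocation writes "GOT slot address minus place" as a signed 32-bit value (the addend is ignored here), and that a reader of a `DW_EH_PE_indirect | DW_EH_PE_pcrel | DW_EH_PE_sdata4` pointer adds that value to the place and then loads the real pointer from the resulting slot. `ToyImage`, `apply_gotpcrel`, `read_indirect_pcrel_sdata4`, and `fits_sdata4` are hypothetical names introduced only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// DWARF EH pointer-encoding flags.
constexpr std::uint8_t DW_EH_PE_sdata4   = 0x0b;
constexpr std::uint8_t DW_EH_PE_pcrel    = 0x10;
constexpr std::uint8_t DW_EH_PE_indirect = 0x80;
static_assert((DW_EH_PE_indirect | DW_EH_PE_pcrel | DW_EH_PE_sdata4) == 0x9b,
              "combined encoding byte for an indirect pcrel sdata4 pointer");

// sdata4 only works if the distance from the place to the slot fits in a
// signed 32-bit value, i.e. the slot is within roughly +/- 2 GiB.
constexpr bool fits_sdata4(std::int64_t delta) {
    return delta >= std::numeric_limits<std::int32_t>::min() &&
           delta <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical stand-in for a loaded module: one flat buffer.
struct ToyImage {
    alignas(8) unsigned char bytes[64] = {};
};

// Toy version of what a linker does for a GOTPCREL-style relocation:
// put the target address in a GOT slot, then store (slot - place) at place.
void apply_gotpcrel(ToyImage& img, std::size_t place, std::size_t got_slot,
                    std::uintptr_t target) {
    assert(got_slot + sizeof target <= sizeof img.bytes);
    assert(place + sizeof(std::int32_t) <= sizeof img.bytes);
    std::int64_t delta = static_cast<std::int64_t>(got_slot) -
                         static_cast<std::int64_t>(place);
    assert(fits_sdata4(delta));
    std::memcpy(img.bytes + got_slot, &target, sizeof target);
    std::int32_t rel = static_cast<std::int32_t>(delta);
    std::memcpy(img.bytes + place, &rel, sizeof rel);
}

// Toy version of how a reader decodes indirect | pcrel | sdata4.
std::uintptr_t read_indirect_pcrel_sdata4(const unsigned char* place) {
    std::int32_t rel;
    std::memcpy(&rel, place, sizeof rel);      // sdata4: signed 32-bit value
    const unsigned char* slot = place + rel;   // pcrel: relative to the place
    std::uintptr_t value;
    std::memcpy(&value, slot, sizeof value);   // indirect: load the pointer
    return value;
}
```

In this model the reader never needs to know whether the slot is a compiler-emitted stub in .data or a linker-created GOT entry; it only needs the slot to be within 32-bit signed range of the place, which is the condition the sdata4 bullet above is unsure about.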

I’ve successfully prototyped this for our Android arm64 applications. I’ve observed no runtime issues, and we reduce both binary size and the number of dynamic relocations (which is important for startup time). My prototype-quality patch can be seen here; it’s limited to aarch64 under an option to ease testing, but in theory it should apply to any architecture which supports a GOTPCREL-like relocation.

Are there any problems with this approach that I’m not considering? If not, are there any objections to adding an option to change Clang’s eh_frame emission to use GOTPCREL relocations? I do believe we’d want to make this optional, because e.g. I don’t believe the bfd linker supports GOTPCREL relocations for aarch64, but we could possibly default to it under certain circumstances (e.g. when targeting Android, where LLD is the only supported linker, or when using relative vtables, which also rely on GOTPCREL relocations being supported).

“it’s still being emitted in the writable .data section”

Presumably it wouldn’t be too hard to change this to .data.rel.ro, which solves the “writable” thing to the extent it can be solved.

Reducing dynamic relocations is nice, though.

I think normally the linker will choose the layout you want? But the code model documentation should be updated.

I think it’s useful to have a mode to replace the GOT-like usage with a GOTPCREL relocation.

The name DW.ref.__gxx_personality_v0 was likely picked to match GCC-generated code. With a different symbol name, mixing GCC and Clang relocatable files can result in duplicates, which isn’t an issue if you use Clang exclusively.

The writable section prevents copy relocations when the referenced symbol is defined in another dynamic shared object (DSO).
