I _think_ that the GOT support we currently have can be made to work if
the memory manager provides the necessary help (more on that below), but I
will readily admit that it is implemented in a fairly non-standard way that
is likely to seem completely wrong on first inspection (and probably still
seems at least slightly wrong on second inspection). It may also have
inherent limitations that can’t be overcome without a redesign, but if so I
don’t know what those limitations might be.
It may be helpful to refer to the comments in my original GOT
implementation patch (
when trying to decipher the intent of the existing code, as unfortunately I
seem to have said quite a bit more there than I did in the actual code
I’m pretty sure that the “multiple GOT” patch was intended to support the
case where additional modules are loaded after finalizeLoad() has been
called. It looks like we were at some point trying to use a single GOT for
all modules, but once it had been “finalized” another GOT had to be created
for subsequent loads. It’s been a while since I looked at this code, but I
believe that we defer calculating the offsets for the GOT until a
“finalize” is performed. This is because the memory for loaded sections
may be remapped before that time to handle remote (or out-of-process)
execution. It appears that we are also deferring allocation of the GOT
section memory until this time.
We call finalizeLoad() for every object, I believe, so we essentially end
up with one GOT per object file anyway. We defer filling (though not
allocating) the GOT until resolveRelocations is called.
With regard to the 2 GB+ offset problem, we’re dependent on the memory
manager in that regard. Even with a single object being loaded there is no
guarantee that the memory allocated for the GOT section will be within 2 GB
of the memory allocated for other sections unless the memory manager does
something to make it so. An interface was added sometime in the past year
(I think) that optionally pre-calculates the amount of memory that will be
needed for an object load so that the memory manager can allocate all of
this memory as a single block. I’m not sure this interface properly
accounts for the possibility of GOT sections and I don’t know how it works
with multiple modules.
While this is true, it's actually not the case I'm worried about. The case
I'm worried about is where we load enough object files to exceed 2GB worth
of objects (this doesn't even have to be 2GB worth of code; for example, I
hit this with msan). The current interface basically forces all code to fit
within 2GB, which is precisely what the GOT is supposed to avoid.
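To make the 2GB limit concrete: a GOTPCREL-style reference encodes a signed
32-bit displacement from the referencing instruction to the GOT slot, so the
slot has to lie within roughly 2GB of the code that uses it. The following
is only a sketch of that range check. `fitsInPCRel32` is a hypothetical
helper, not a RuntimeDyld function, and the addend is whatever the
relocation carries (-4 is typical on x86-64, assumed here for illustration).

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of RuntimeDyld): returns true if a 32-bit
// PC-relative reference located at `refAddr` can reach `targetAddr`.
// `addend` is the relocation addend (assumed to be -4 in the usage note).
bool fitsInPCRel32(std::uint64_t targetAddr, std::uint64_t refAddr,
                   std::int64_t addend) {
  // Compute target + addend - ref with wrapping unsigned arithmetic, then
  // reinterpret the result as a signed 64-bit displacement.
  std::int64_t delta = static_cast<std::int64_t>(
      targetAddr + static_cast<std::uint64_t>(addend) - refAddr);
  // The displacement must be representable in a signed 32-bit field.
  return delta >= std::numeric_limits<std::int32_t>::min() &&
         delta <= std::numeric_limits<std::int32_t>::max();
}
```

With the reference at 0x10000000 and a GOT slot 4KB above it, the check
passes. Put the reference more than 2GB away from the slot and it fails,
which is the situation that arises when a later object reuses a GOT that
was allocated alongside a much earlier object.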
Just to be very explicit, the case I'm concerned about is:
- Allocate Object file 1 with GOTPCREL to `foo`
- [ Allocate 2GB worth of other data ]
- Allocate Object file 2 with GOTPCREL to `foo`
Object file 2 will reuse Object file 1's GOT (though we'll still allocate
space in object file 2's GOT, so it's not like we're doing this to save
The default memory manager attempts to use system address hints to
allocate sections in the same region of the address space, but not all OSs
support the flags we’d like to use and the address requests are never
guaranteed to be respected. FWIW, Address Sanitizer is very good at
exposing issues of this sort.
Yes, I agree this is a concern, though it seems feasible to always allocate
a single ObjectFile within 2GB, while it doesn't necessarily seem right to
impose the restriction that all code ever loaded has to fit
I should also mention that there is some variation in how GOT-related
issues are handled from architecture to architecture within
RuntimeDyldELF. When I implemented the GOT support, I intended for it to
be capable of supporting any architecture, but there was some support for
GOT-related relocations for non-x86 platforms that pre-dated my GOT
implementation and I suspect those will continue to be used as long as they
are working correctly. For instance, several architectures extend the
allocated size of code sections and use the extra space at the end of the
section to create stubs for PC-relative function calls.
Yes, I've seen this code.
Let me know if there’s anything more I can do to help you get things
Thanks for replying. I have a half-way functioning prototype that makes
GOTs local to each object file again and also deduplicates GOTEntries where
possible. I'll finish it up and post it here as soon as I can.
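As a rough illustration of the idea (not the prototype itself), a GOT that
is local to one object file and deduplicates entries could look like the
sketch below. Every name here (`ObjectGOT`, `getOrCreateSlot`, `fill`) is
hypothetical and does not come from RuntimeDyldELF. Slots are handed out
while relocations are processed, and the addresses are only filled in
later, mirroring the split between allocating the GOT and filling it at
resolveRelocations time.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-object GOT; none of these names come from RuntimeDyldELF.
class ObjectGOT {
public:
  // Returns the slot index for `symbol`, allocating a new slot only the
  // first time the symbol is seen within this object file.
  std::size_t getOrCreateSlot(const std::string &symbol) {
    auto it = slotForSymbol.find(symbol);
    if (it != slotForSymbol.end())
      return it->second; // deduplicated: reuse the existing slot
    std::size_t index = symbols.size();
    symbols.push_back(symbol);
    slotForSymbol.emplace(symbol, index);
    return index;
  }

  // Number of slots, i.e. how much GOT memory this object needs.
  std::size_t numSlots() const { return symbols.size(); }

  // Deferred fill: once final addresses are known, ask `resolve` for the
  // address of each symbol and return the slot contents in slot order.
  template <typename Resolver>
  std::vector<std::uint64_t> fill(Resolver resolve) const {
    std::vector<std::uint64_t> slots;
    slots.reserve(symbols.size());
    for (const std::string &s : symbols)
      slots.push_back(resolve(s));
    return slots;
  }

private:
  std::vector<std::string> symbols; // slot index -> symbol name
  std::unordered_map<std::string, std::size_t> slotForSymbol;
};
```

Two object files that both reference `foo` each get their own `ObjectGOT`
with their own slot for `foo`, so each GOT can be allocated next to the
object that uses it, while repeated references to `foo` within one object
share a single slot.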