[RuntimeDyld] Section memory management

Hi Lang,

I’m in the process of completing my patch to add TLS support to RuntimeDyld. The one remaining issue I have is that the memory manager sometimes allocates the .got section too far away from the code section, so I’d like to get some input on the possible options.

  1. Don’t allocate a separate GOT section, but use the Stub mechanism instead. I had considered this, but I believe we currently set RW permissions on the GOT section, so we can’t put it in the code section without making that section RWX, which has obvious security problems. I suspect a RO GOT is fine in most cases, but I’m not sure what the consensus is here.

  2. Change the memory manager to guarantee that the GOT section is closer. This seems better to me, but is also not entirely easy. Currently the memory manager tries very hard to put pages of one permission type close together, but doesn’t really care at all about where e.g. code and data sections are placed relative to each other. This seems slightly backwards to me, as I imagine you would want the code/data/GOT sections of an object file to be close together if possible, while the object files can be farther from each other (with the appropriate code model of course). Can you think of a good reason why the other way would be better? The other problem is that the GOT section size isn’t known until all the relocations are processed. The stub mechanism has a similar problem, which we deal with by just assuming all relocations require a stub when reserving memory. I suspect I could just do the same for the GOT. However, since we need to walk all the relocations anyway, wouldn’t it be better to add a method to each RuntimeDyld implementation that counts exactly how much Stub/GOT space it would need?
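
To make the trade-off concrete, here is a minimal sketch comparing the conservative reservation with an exact count. Everything in it is a hypothetical stand-in: the `Reloc` record, its `NeedsStub`/`NeedsGOT` flags, and the idea of sharing one entry per symbol are assumptions for illustration, not RuntimeDyld's actual data structures or behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical, simplified relocation record (not the RuntimeDyld type).
struct Reloc {
  uint32_t SymbolId; // symbol the relocation refers to
  bool NeedsStub;    // target may be out of range of a direct branch
  bool NeedsGOT;     // relocation is resolved through a GOT slot
};

struct SpaceEstimate {
  std::size_t StubBytes;
  std::size_t GOTBytes;
};

// Conservative: assume every relocation needs both a stub and a GOT slot.
SpaceEstimate estimateConservative(const std::vector<Reloc> &Relocs,
                                   std::size_t StubSize,
                                   std::size_t GOTEntrySize) {
  return {Relocs.size() * StubSize, Relocs.size() * GOTEntrySize};
}

// Exact: walk the relocations once and count one stub / GOT slot per
// distinct symbol that actually needs one (sharing per symbol is an
// assumption of this sketch).
SpaceEstimate countExact(const std::vector<Reloc> &Relocs,
                         std::size_t StubSize, std::size_t GOTEntrySize) {
  std::set<uint32_t> StubSymbols, GOTSymbols;
  for (const Reloc &R : Relocs) {
    if (R.NeedsStub)
      StubSymbols.insert(R.SymbolId);
    if (R.NeedsGOT)
      GOTSymbols.insert(R.SymbolId);
  }
  // The exact count can never exceed the conservative one.
  assert(StubSymbols.size() <= Relocs.size() &&
         GOTSymbols.size() <= Relocs.size());
  return {StubSymbols.size() * StubSize, GOTSymbols.size() * GOTEntrySize};
}
```

The exact variant costs one extra pass and a per-target predicate, but it avoids reserving GOT space for relocations that never touch the GOT.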

Let me know what you think.
Thanks,
Keno

Hi Keno,

This sounds great. :)

Both your solutions are excellent. As long as there are no objections from other clients I think you should try the first approach first, because it's a better fit with existing code. RuntimeDyldMachO does GOT support via the stub mechanism, for instance.

I would *really* like to implement the second approach in the longer term. RuntimeDyld cuts a lot of corners that it doesn't really have to, as you've noticed. In an ideal world I'd like to see something like this:

(a) RuntimeDyld is aware of the GOT entries for every symbol. There may be more than one GOT entry per symbol if memory layout requires it.
(b) Clients add a set of objects at a time to RuntimeDyld. When a set is added, RuntimeDyld scans all sections and relocations for all objects in the set, and determines the memory requirements for the entire set, including stubs and GOT entries. Provided they can be laid out within an accessible range (highly likely) all sections in the set can share GOT-entries/stubs. If a new set happens to be laid out in-range of an existing set they can optionally share existing GOT-entries/stubs (this should be optional, because it would introduce an implicit dependence of the first set on the second, which would preclude removing the first set independently of the second).
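
As a rough illustration of (b), the sketch below lays out a set of sections followed by a shared GOT in one contiguous block and checks that the whole set stays within reach. All names (`SectionReq`, `SetLayout`, `layoutSet`), the 8-byte GOT entry size, and the signed 32-bit reach limit are assumptions chosen for the example; the real limit depends on the target and code model.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-section requirement gathered by scanning a set of objects.
struct SectionReq {
  uint64_t Size;
  uint64_t Align; // must be a power of two
};

struct SetLayout {
  std::vector<uint64_t> SectionAddrs; // one address per input section
  uint64_t GOTAddr = 0;               // shared GOT placed after the sections
  uint64_t End = 0;                   // first address past the set
};

static uint64_t alignTo(uint64_t Addr, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Addr + Align - 1) & ~(Align - 1);
}

// Lay out all sections of a set plus GOTEntries 8-byte GOT slots starting at
// Base. Returns false if the set spans more than MaxReach bytes, in which
// case sections could not all share the same GOT entries.
bool layoutSet(uint64_t Base, const std::vector<SectionReq> &Sections,
               uint64_t GOTEntries, SetLayout &Out,
               uint64_t MaxReach = (1ull << 31) - 1) {
  SetLayout L;
  uint64_t Cur = Base;
  for (const SectionReq &S : Sections) {
    Cur = alignTo(Cur, S.Align);
    L.SectionAddrs.push_back(Cur);
    Cur += S.Size;
  }
  L.GOTAddr = alignTo(Cur, 8);
  L.End = L.GOTAddr + GOTEntries * 8;
  // The set is contiguous, so its total span bounds every section-to-GOT
  // distance.
  if (L.End - Base > MaxReach)
    return false;
  Out = L;
  return true;
}
```

Because the total size is known before any memory is requested, the memory manager can be asked for a single block, which is what makes the "highly likely" in-range case easy to guarantee.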

That's a more substantial refactor though.

Cheers,
Lang.

> 2. Change the memory manager to guarantee that the GOT section is closer.
> This seems better to me, but is also not entirely easy. Currently the memory
> manager tries very hard to put pages of one permission type close together,
> but doesn't really care at all about where e.g. code and data sections are
> placed relative to each other. This seems slightly backwards to me, as I
> imagine you would want the code/data/GOT sections of an object file to be
> close together if possible, while the object files can be farther from each
> other (with the appropriate code model of course). Can you think of a good
> reason why the other way would be better? The other problem is that the GOT
> section size isn't known until all the relocations are processed. The stub
> mechanism has a similar problem, which we deal with by just assuming all
> relocations require a stub when reserving memory. I suspect I could just do
> the same for the GOT. However, since we need to walk all the relocations
> anyway, wouldn't it be better to add a method to each RuntimeDyld
> implementation that counts exactly how much Stub/GOT space it would need?

For what it is worth, this is similar to what "real" dynamic linkers
do. Running "strace ld-musl-x86_64.so.1 ./t" you will see something
like

open("./t", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\2\0>\0\1\0\0\0\343\2@\0\0\0\0\0"..., 960) = 960
mmap(0x400000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x400000
mmap(0x401000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x401000

Note how the entire file is first mmapped R+X, and then the second segment
is remapped R+W at a fixed address inside that range, so it is known to be
right next to the previous one.
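
A JIT memory manager could get the same adjacency guarantee with anonymous memory. The following is only a sketch under stated assumptions: `Region`, `reserveAdjacent` and `finalizeCode` are hypothetical names, it uses the POSIX `mmap`/`mprotect` calls directly rather than any LLVM API, and it assumes the platform permits changing a writable anonymous mapping to executable.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical description of one code+data reservation.
struct Region {
  void *Code = nullptr;
  void *Data = nullptr;
  std::size_t CodeSize = 0;
  std::size_t DataSize = 0;
};

// Reserve code and data back to back in a single mapping, initially R+W so
// the loader can copy section contents and apply relocations.
bool reserveAdjacent(std::size_t CodePages, std::size_t DataPages,
                     Region &Out) {
  assert(CodePages > 0 && DataPages > 0);
  const std::size_t Page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  const std::size_t CodeSize = CodePages * Page;
  const std::size_t DataSize = DataPages * Page;
  void *Base = mmap(nullptr, CodeSize + DataSize, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (Base == MAP_FAILED)
    return false;
  Out.Code = Base;
  Out.Data = static_cast<char *>(Base) + CodeSize; // directly after the code
  Out.CodeSize = CodeSize;
  Out.DataSize = DataSize;
  return true;
}

// Once the code bytes are in place, flip only the code part to R+X; the data
// (and GOT) part stays R+W and remains adjacent.
bool finalizeCode(const Region &R) {
  return mprotect(R.Code, R.CodeSize, PROT_READ | PROT_EXEC) == 0;
}
```

The point is that the relative placement is fixed when the memory is reserved, and the permissions are applied afterwards on page-aligned sub-ranges.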

Cheers,
Rafael