linker adaptability ...

hello folks,

I'm working to add runtime updating of code to the OCaml compiler.
In its bytecode guise this presents no barrier, because there is only
one linker, it is written in OCaml, and full control is available.
With native code, on the other hand, there is reliance on the system
linker, and I got completely lost examining the GNU ld/dl library
source code.
The prospect of understanding and modifying all possible linkers is
daunting to say the least.
It's finally dawned on me that LLVM might have some of what I need already.
Would you be kind enough to examine the following wish-list and
comment on its current viability
with respect to the LLD project?

1. to load and link a set of object/archive files into an in-memory executable.
2. to track the memory allocations attributable to each contributing
object processed with a view to releasing them.
3. to maintain the symtable in-memory with a view to updating symbols
and re-patching the in-memory executable.
4. to reload an object file/archive member (or a new one) and process
it in accordance with 3.
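
To make the four wish-list items concrete, here is a minimal toy model of the bookkeeping they imply. It is only a sketch: `ToyLinker`, `LoadedObject`, the bump allocator and the addresses are all hypothetical names and values invented for illustration, and none of this is an LLD or ORC API.

```python
from dataclasses import dataclass, field


@dataclass
class LoadedObject:
    """Bookkeeping for one object file (or archive member) that was linked in."""
    name: str
    symbols: dict = field(default_factory=dict)       # symbol name -> address
    allocations: list = field(default_factory=list)   # (address, size) pairs


class ToyLinker:
    """Hypothetical in-memory linker state; not an LLD or ORC API."""

    def __init__(self):
        self.objects = {}        # object name -> LoadedObject
        self.symtab = {}         # symbol name -> (owning object, address)
        self.next_addr = 0x1000  # made-up base address for a bump allocator

    def load(self, name, symbol_sizes):
        """Items 1, 2 and 4: link an object, replacing any earlier copy of it."""
        if name in self.objects:
            self.unload(name)
        obj = LoadedObject(name)
        for sym, size in symbol_sizes.items():
            addr = self.next_addr
            self.next_addr += size
            obj.allocations.append((addr, size))   # item 2: per-object tracking
            obj.symbols[sym] = addr
            self.symtab[sym] = (name, addr)        # item 3: update the symbol table
        self.objects[name] = obj

    def unload(self, name):
        """Drop an object's symbols and return the allocations that can be released."""
        obj = self.objects.pop(name)
        for sym in obj.symbols:
            if self.symtab.get(sym, (None, 0))[0] == name:
                del self.symtab[sym]
        return obj.allocations


linker = ToyLinker()
linker.load("a.o", {"foo": 16, "bar": 32})
linker.load("a.o", {"foo": 8})   # reload: "bar" disappears, "foo" gets a new address
```

The model deliberately leaves out re-patching existing references after a reload, which is the hard part of item 3.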

No doubt LLD is not geared to any of this directly, but does the
library provide any support?
thanks in advance....

Sounds like you’re probably after ORC (https://llvm.org/docs/ORCv2.html), a JIT infrastructure which, as you’ve described, models object files and executable code in-memory as closely as possible to the on-disk format, and supports things like replaceable code.

Hi Kris,

Dave is right: You’ll want to check out ORC and JITLink. As a starting point I’d recommend taking a look at this example that uses a JITLink plugin to render the JIT-linker’s graph data structure: https://github.com/llvm/llvm-project/blob/master/llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin/LLJITWithObjectLinkingLayerPlugin.cpp

JITLink is currently only available on Darwin, but someone just posted a review for an under-development ELF version: I expect that to land in-tree this week and quickly develop enough relocation support to handle basic use cases.

Regards,
Lang.

Hi Kris,

I’ve been trying to find a way in through ObjectFile/RTDyldObjectLinkingLayer but have been stymied by the latter’s sole focus on in-memory binaries (I believe, but I’m a C++ dunce).
So RuntimeDyld/JITLink do have the facility to load binary formats from disk?

Short answer: Yes.

RuntimeDyld and JITLink both perform the same task: They take a buffer containing a relocatable object and apply the relocations to produce ready-to-use code for a target process (often the same process). They don’t really care where the buffer comes from: In MCJIT and ORC it’s usually produced directly in memory by compiling LLVM IR, but you can just as easily mmap a relocatable object file off disk. In fact, both RuntimeDyld and JITLink have testing tools that do exactly that: llvm/tools/llvm-rtdyld and llvm/tools/llvm-jitlink. If you have built clang and llvm-rtdyld you can run the following:

% clang -c -o foo.o foo.c
% llvm-rtdyld foo.o

The llvm-rtdyld tool will mmap foo.o, apply RuntimeDyld to it to resolve relocations, then execute the resulting fixed-up code to run the program in foo.c.
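
To illustrate what "apply the relocations" means, here is a minimal sketch in Python. It is not how RuntimeDyld or JITLink are implemented. The function name, the tuple layout and the two relocation kinds are assumptions made up for the example, with the kinds modelled loosely on the x86-64 absolute (S + A) and PC-relative (S + A - P) calculations.

```python
import struct


def apply_relocations(code, load_addr, relocs, symbols):
    """Return a fixed-up copy of `code`, assumed to be loaded at `load_addr`.

    relocs:  list of (offset, kind, symbol, addend) tuples
    symbols: mapping of symbol name -> absolute address
    Two made-up relocation kinds are handled:
      "abs64": store S + A as a 64-bit little-endian value
      "pc32":  store S + A - P as a signed 32-bit value,
               where P is the address of the location being patched
    """
    image = bytearray(code)
    for offset, kind, symbol, addend in relocs:
        target = symbols[symbol] + addend
        if kind == "abs64":
            struct.pack_into("<Q", image, offset, target)
        elif kind == "pc32":
            struct.pack_into("<i", image, offset, target - (load_addr + offset))
        else:
            raise ValueError(f"unsupported relocation kind: {kind}")
    return bytes(image)


# A 12-byte "section" with one relocation of each kind against symbol "foo".
image = apply_relocations(
    bytes(12),
    0x4000,
    [(0, "abs64", "foo", 0), (8, "pc32", "foo", -4)],
    {"foo": 0x5000},
)
```

The input buffer is just bytes, which is the point made above: it does not matter whether those bytes came from a compiler in memory or from a file read off disk.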

The JITLink patch talks about ‘dead stripping’ - would that cover the reloading of an object file overwriting the previous set of symbols?
Or is there a way to unambiguously declare a set of symbols void and have JITLink auto-reclaim the memory of the corresponding object file that supplied them?

Both systems allow you to dispose of the relocated memory for an input file, but neither provides built-in support for replacing definitions. It’s possible to enable replacement of definitions with some manual work, but there are many possible approaches depending on your use case. As an example: if you just want to replace function definitions (not data definitions) and don’t care about performance, you can just emit stubs (e.g. using http://llvm.org/doxygen/classllvm_1_1orc_1_1IndirectStubsManager.html) and then update these when you load new definitions.
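
The stub idea can be sketched independently of LLVM. The toy below is an assumption-laden illustration, not the IndirectStubsManager interface: `StubTable`, `update` and `stub` are hypothetical names. Callers bind to a stable stub once, and every call goes through one extra lookup to reach whatever definition is current.

```python
class StubTable:
    """Toy stand-in for indirect stubs: callers bind to a stub, never to the body."""

    def __init__(self):
        self._targets = {}   # function name -> current definition

    def update(self, name, fn):
        """Point the stub for `name` at a new definition."""
        self._targets[name] = fn

    def stub(self, name):
        """Return a stable callable that looks up the current target on every call."""
        def call(*args, **kwargs):
            return self._targets[name](*args, **kwargs)
        return call


stubs = StubTable()
stubs.update("answer", lambda: 1)
answer = stubs.stub("answer")     # callers keep this reference
first = answer()                  # runs the original definition
stubs.update("answer", lambda: 2)
second = answer()                 # same reference, new definition
```

The extra lookup on each call is the performance cost mentioned above, and the scheme covers functions only: data reached through a stale address is not redirected.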

The JITLink patch talks about ‘dead stripping’ - would that cover the reloading of an object file overwriting the previous set of symbols?

Dead stripping doesn’t have much to do with code replacement. It is the removal of unused symbols. For example: if your relocatable object file provides a weak definition of “foo” but a strong definition already exists then the weak definition of foo is effectively unreferenced, so JITLink will dead-strip it. This may cause other definitions to also become unused so they will be dead-stripped too, and so on.
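
The cascade described here is ordinary reachability. As a rough sketch (not JITLink's actual implementation; `dead_strip` and the symbol names are invented for the example), mark everything reachable from the symbols that are still referenced and drop the rest:

```python
def dead_strip(definitions, roots):
    """Return the set of definitions still reachable from `roots`.

    definitions: mapping of name -> list of names that definition references
    roots:       names that are referenced from outside (so must be kept)
    """
    live = set()
    worklist = [r for r in roots if r in definitions]
    while worklist:
        name = worklist.pop()
        if name in live:
            continue
        live.add(name)
        # Anything a live definition refers to is live as well.
        worklist.extend(ref for ref in definitions[name] if ref in definitions)
    return live


# The object supplies a weak "foo" (which uses "helper") and an unrelated "bar".
# A strong "foo" already exists elsewhere, so the weak one is not a root.
object_defs = {"foo": ["helper"], "helper": [], "bar": []}
kept = dead_strip(object_defs, roots=["bar"])
stripped = set(object_defs) - kept   # {"foo", "helper"}
```

Here "helper" is stripped only because the weak "foo" was its sole user, which is the "and so on" in the paragraph above.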

Regards,
Lang.