ORC JIT Weekly #38 -- ELF Platform and TLV support, new JITLink backends, and upcoming plans.

Hi All,

Apologies for the lack of updates. If I can’t get these emails into a regular rhythm by 2022, I’ll bite the bullet and change the name to “ORC JIT Monthly”.

There has been plenty of work since the last update in early August. Some of the highlights:

  • Peter Housel has contributed ELFNixPlatform support [1]. This is an ELF / POSIX counterpart to MachOPlatform, and enables many of the same features. In particular, it enables uniform initializer and destructor support for JIT’d code, whether running in-process or out-of-process. You can test out this support by building compiler-rt and running object files under llvm-jitlink using the “-orc-runtime” option:

% xcrun cmake -GNinja

% llvm-jitlink -orc-runtime=llvm-build/lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu/libclang_rt.orc.a test-objects…

Thanks very much Peter – this is a big step up in JIT functionality for ELF-based clients, and a great foundation for future improvements to ELF support.

Speaking of which…

  • Stephen Fan has just posted a review adding initial ELF native x86-64 TLS support to ELFNixPlatform [2]. Thank you very much Stephen! We still have a way to go before we fully support ELF TLS, but this is a big step in that direction.

  • Moritz Sichert has implemented partial x86-64 TLS support in RuntimeDyld in [3]. TLS relocations are supported, but only one copy of the TLS data is allocated (due to the limitations of RuntimeDyld/MCJIT). This will allow MCJIT users on x86-64 to load objects containing TLS variables, as long as they only use a single thread. Thanks Moritz!

  • Stefan Gränitz contributed an initial ELF / AArch64 backend for JITLink in [4]. Thanks very much Stefan!

The quality and quantity of all of these contributions are impressive: ELF JIT support is quickly catching up to MachO, which should make for a very interesting LLVM 14 release!

I have also been working on some upcoming changes, though they’re not ready for the mainline just yet:

  • A new ExecutorProcessControl implementation called SimpleRemoteEPC. It will be based entirely on the SimplePackedSerialization system rather than ORC RPC, and aims to replace OrcRPCExecutorProcessControl (and OrcRPCEPCServer), and the OrcRemoteTarget Client/Server classes. When this lands (it’s probably still a few weeks away) I will remove these classes and the ORC RPC library (eliminating an ongoing maintenance burden).

SimpleRemoteEPC aims to implement all functionality via wrapper function calls by default, which should make it easy to adapt to new IPC/RPC systems: As long as you can provide a way to send and receive arrays of bytes, SimpleRemoteEPC can do the rest.
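To make the "arrays of bytes" idea concrete, here is a minimal sketch of what a byte-only transport with wrapper-function dispatch could look like. None of these names (ByteTransport, LoopbackTransport, sendCall, serveOne, remoteAddOne) are ORC APIs; they are hypothetical stand-ins, and the message layout (a 64-bit function id followed by argument bytes) is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <deque>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using Bytes = std::vector<char>;

// Hypothetical transport interface: sending and receiving byte arrays is
// the only thing the underlying IPC/RPC system has to provide.
struct ByteTransport {
  virtual ~ByteTransport() = default;
  virtual void send(const Bytes &Msg) = 0;
  virtual Bytes receive() = 0;
};

// In-memory stand-in for a pipe or socket.
struct LoopbackTransport : ByteTransport {
  std::deque<Bytes> Queue;
  void send(const Bytes &Msg) override { Queue.push_back(Msg); }
  Bytes receive() override {
    Bytes Msg = std::move(Queue.front());
    Queue.pop_front();
    return Msg;
  }
};

// A wrapper function takes serialized arguments and returns a serialized result.
using WrapperFn = std::function<Bytes(const Bytes &)>;

Bytes packU64(uint64_t V) {
  Bytes B(sizeof(V));
  std::memcpy(B.data(), &V, sizeof(V));
  return B;
}

uint64_t unpackU64(const Bytes &B) {
  uint64_t V = 0;
  std::memcpy(&V, B.data(), sizeof(V));
  return V;
}

// Controller side: assumed message layout is [function id][argument bytes].
void sendCall(ByteTransport &T, uint64_t FnId, const Bytes &Args) {
  Bytes Msg = packU64(FnId);
  Msg.insert(Msg.end(), Args.begin(), Args.end());
  T.send(Msg);
}

// Executor side: decode one message, run the wrapper, send the result back.
void serveOne(ByteTransport &In, ByteTransport &Out,
              const std::map<uint64_t, WrapperFn> &Fns) {
  Bytes Msg = In.receive();
  uint64_t FnId = unpackU64(Msg);
  Bytes Args(Msg.begin() + sizeof(uint64_t), Msg.end());
  Out.send(Fns.at(FnId)(Args));
}

// One full round trip through a toy "add one" wrapper function.
uint64_t remoteAddOne(uint64_t X) {
  LoopbackTransport ToExecutor, ToController;
  std::map<uint64_t, WrapperFn> Fns;
  Fns[1] = [](const Bytes &A) { return packU64(unpackU64(A) + 1); };
  sendCall(ToExecutor, 1, packU64(X));
  serveOne(ToExecutor, ToController, Fns);
  return unpackU64(ToController.receive());
}
```

Swapping LoopbackTransport for a socket- or pipe-backed implementation would leave sendCall and serveOne unchanged, which is the property the paragraph above is describing.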

  • Once the SimpleRemoteEPC implementation lands (and the older cross-process support classes are removed) I will make some changes to the JITLinkMemoryManager API:
      • The JITLinkMemoryManagerAllocation class will be split into two classes, InFlightAllocation, and FinalizedAllocation.
      • The new InFlightAllocation::finalize method will take a FinalizationFunctions object containing a list of wrapper functions (and argument buffer ranges) to run during finalization. This generalizes the concept of finalization from “move linked bytes to the right place and apply memory protections”, to “copy the bytes, apply the protections, and do whatever else is necessary to make this allocation usable”. LinkGraphs will gain a FinalizeFunctions member accessible to plugins so that they can add their own finalize actions. Many features that currently require RPC calls (e.g. eh-frame and TLV registration) will be expressible as finalization functions, and this will allow us to reduce the number of RPC round-trips required to link an object into the executor process. In most cases I expect this to reduce to a single round trip per object.
      • A new memory allocation attribute, lifetime, will be introduced alongside the existing memory-protection attributes. Memory will be able to be allocated with either “standard” or “finalize” lifetime. A standard-lifetime segment will live until explicitly deallocated (same as the existing behavior). A finalize-lifetime segment will live until the end of the finalization process only. This will allow us to allocate segments specifically for finalization and free them as soon as they’re no longer needed.
      • The new FinalizedAllocation type (the result of an InFlightAllocation::finalize call) will be extremely cheap to store, probably the size of a uint64_t. This will reduce the cost of tracking allocated memory in the JIT, especially in cross-process JITing cases where the deallocation metadata can live in the executor.
      • The deallocate method will support bulk deallocation, reducing the number of round-trips required to deallocate whole JITDylibs (or groups of objects allocated using the same ResourceTracker).
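As a rough illustration of how these pieces might fit together, here is a toy model of the planned lifecycle. It is only a sketch: the real API has not landed, so everything below (ToyMemoryManager, Segment, Lifetime, FinalizeAction, commit, runDemo, and the exact shape of InFlightAllocation and FinalizedAllocation) is hypothetical, and no actual memory is mapped or protected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lifetime attribute, alongside (not replacing) protections.
enum class Lifetime { Standard, Finalize };

struct Segment {
  std::string Name;
  Lifetime Life;
};

// Cheap handle: all other metadata stays with the memory manager.
struct FinalizedAllocation {
  uint64_t Id;
};
static_assert(sizeof(FinalizedAllocation) == sizeof(uint64_t),
              "handle should be no bigger than a uint64_t");

// Stand-in for a wrapper function plus its argument buffer.
using FinalizeAction = std::function<void()>;

class ToyMemoryManager {
public:
  // Record the segments that outlive finalization; return a small handle.
  FinalizedAllocation commit(std::vector<Segment> Segs) {
    FinalizedAllocation FA{NextId++};
    Live[FA.Id] = std::move(Segs);
    return FA;
  }

  // Bulk deallocation: one call releases any number of allocations.
  void deallocate(const std::vector<FinalizedAllocation> &Allocs) {
    for (const auto &FA : Allocs)
      Live.erase(FA.Id);
  }

  std::size_t liveSegments() const {
    std::size_t N = 0;
    for (const auto &KV : Live)
      N += KV.second.size();
    return N;
  }

private:
  uint64_t NextId = 1;
  std::map<uint64_t, std::vector<Segment>> Live;
};

class InFlightAllocation {
public:
  InFlightAllocation(ToyMemoryManager &MM, std::vector<Segment> Segs)
      : MM(MM), Segs(std::move(Segs)) {}

  // Run the finalize actions, then keep only standard-lifetime segments.
  FinalizedAllocation finalize(const std::vector<FinalizeAction> &Actions) {
    for (const auto &A : Actions)
      A();
    std::vector<Segment> Kept;
    for (const auto &S : Segs)
      if (S.Life == Lifetime::Standard)
        Kept.push_back(S);
    return MM.commit(std::move(Kept));
  }

private:
  ToyMemoryManager &MM;
  std::vector<Segment> Segs;
};

struct DemoResult {
  int ActionsRun = 0;
  std::size_t LiveAfterFinalize = 0;
  std::size_t LiveAfterDeallocate = 0;
};

// Link two "objects", each with one finalize-lifetime scratch segment.
DemoResult runDemo() {
  DemoResult R;
  ToyMemoryManager MM;
  std::vector<FinalizedAllocation> Handles;
  for (int I = 0; I < 2; ++I) {
    std::vector<Segment> Segs = {{"text", Lifetime::Standard},
                                 {"data", Lifetime::Standard},
                                 {"init-scratch", Lifetime::Finalize}};
    InFlightAllocation IFA(MM, std::move(Segs));
    std::vector<FinalizeAction> Actions = {[&R] { ++R.ActionsRun; }};
    Handles.push_back(IFA.finalize(Actions));
  }
  R.LiveAfterFinalize = MM.liveSegments();
  MM.deallocate(Handles); // a single bulk call for both objects
  R.LiveAfterDeallocate = MM.liveSegments();
  return R;
}
```

In this toy model the scratch segments disappear as soon as finalize returns, the JIT side only has to store one integer per object, and both objects are released with a single deallocate call. In a cross-process setting each of those steps would correspond to one message rather than several.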

In short, you can look forward to some API cleanup and performance improvements that should enable the next round of ORC features.

I’ll be out on vacation for the next couple of weeks, but should be able to return to this work in mid-September, and hope to have an update for you by late September.

– Lang.

[1] https://reviews.llvm.org/D108081
[2] https://reviews.llvm.org/D109293
[3] https://reviews.llvm.org/D105466
[4] https://reviews.llvm.org/D108986