[RFC] Removing MCJIT and RuntimeDyld

Hi All,

As foretold at the 2023 developer’s meeting:

The time has come to discuss removal of MCJIT and RuntimeDyld from LLVM!

MCJIT and RuntimeDyld continue to receive occasional bug fixes, but have not been actively developed since 2016. While they have been used successfully in a number of production environments, they also contain fundamental design issues (e.g. limited error handling, no laziness, no concurrent JITing) that would be difficult to address without destabilizing changes.

ORC and JITLink were developed as replacements for MCJIT and RuntimeDyld to address these design issues. They are actively developed and have also been used successfully in production environments.

Given that ORC and JITLink aim to provide a superset of MCJIT and RuntimeDyld’s features, I think that we should plan to remove MCJIT and RuntimeDyld from LLVM in the future: there’s no reason to keep these more limited APIs around indefinitely.

In this thread I’d like to discuss the plan for deprecation and removal of MCJIT and RuntimeDyld. I expect this to be a long process: we’ll need to give existing clients advance warning and sufficient time to migrate, and to establish conditions for the eventual removal of the old APIs (e.g. how stable do the new APIs need to be? What target and feature support do they need, and what can we do without?).

I’ll get the ball rolling below with some background and a status update on the projects, plus (in a new thread) a more modest proposal to add deprecation warnings to MCJIT immediately to help clients (who may not follow these forums) find this discussion.

Proposal to add deprecation warnings to MCJIT and RuntimeDyld: [RFC] Add deprecation warnings to MCJIT

Background on MCJIT and ORC

Feel free to skip this message if you’re familiar with the APIs.

MCJIT was introduced in 2011 as LLVM’s second generation JIT API. It aimed to maximize re-use of the static compiler pipeline by replacing the bespoke backend of LLVM’s original JIT APIs with a novel “JIT linker” system called RuntimeDyld. MCJIT could transform IR into relocatable object files in memory using the existing pipeline, and then use RuntimeDyld to link these in-memory object files to produce runnable code. This re-use of the existing pipeline simplified the JIT implementation and eliminated some classes of maintenance burden and bugs (e.g. JIT’d instruction encoding errors) from LLVM. MCJIT’s success led to it being ported to many targets, and adopted widely in both internal and external projects, including LLDB (for expression evaluation), OpenCL, Julia, Cling, Numpy, Halide, Clasp, and many others.

ORC was introduced in 2015 as LLVM’s third generation of JIT API. It aimed to provide a replacement for MCJIT that could also support lazy compilation, concurrent compilation, and the full suite of object format features (including things like thread local storage that require runtime support). In the years since its introduction ORC has developed its own JIT linker (JITLink) and runtime library to support the features mentioned above, though it still allows use of RuntimeDyld as a fall-back to support older targets.

While ORC is not a drop-in replacement for MCJIT (it doesn’t implement the ExecutionEngine interface), the LLJIT class that ORC provides has a similar API and many clients have migrated from MCJIT to LLJIT without undue difficulty.

Project status:

MCJIT and RuntimeDyld receive occasional bug fixes, but have not been actively developed since 2016. While they have been successfully used in production environments they also contain a number of known bugs and deficiencies that would be difficult to fix without destabilizing changes.

ORC and JITLink are actively developed and have been successfully used in production (e.g. in Xcode Previews, Julia, Cling, Clasp, and others).

Target support:

ORC/JITLink’s target support largely overlaps MCJIT/RuntimeDyld’s, but each has some unique targets:

Supported in both MCJIT and ORC:

  Format   Architectures
  COFF     x86-64
  ELF      aarch32, aarch64, i386, LoongArch, PPC64, RISCV, x86-64
  MachO    aarch64, x86-64

Supported in ORC/JITLink only:

  Format   Architectures
  ELF      LoongArch, RISCV

Supported in MCJIT/RuntimeDyld only:

  Format   Architectures
  COFF     aarch32, aarch64, i386
  ELF      MIPS, PPC32, SPARC
  MachO    aarch32, i386

Adding support for the missing targets to JITLink should be relatively straightforward: new target support is a GSoC-sized or smaller project, depending on the target and the relocation / code model(s) to be supported.

Feature support:

MCJIT currently supports a couple of object format features that ORC does not:

  1. OProfile support. (Perf and Intel VTune profiling are already supported in ORC.)
  2. GNU ifuncs.

Neither of these should be prohibitively difficult to implement.

ORC provides a number of features that MCJIT either does not provide or does not implement fully, including:

  1. Native thread local storage*.
  2. Static initializers and deinitializers in out-of-process mode.
  3. Exception handling in out-of-process mode.
  4. Language metadata (e.g. for Swift and Objective-C).
  5. System loader API emulation (e.g. dlopen, dlsym, etc.)

* Native TLS is supported on MachO, and the General Dynamic model is supported on ELF.

Conditions for removal of MCJIT and RuntimeDyld

This is where I expect the interesting discussions will happen.

I can think of a few conditions off the top of my head:

  1. Ensure that ORC/JITLink provides the target and feature support necessary for almost all clients* to start migrating
  2. Give clients reasonable time to complete migration
  3. Aim to make ORC/JITLink’s interface stable enough for clients to live on (at least via the C API) without an excessive maintenance burden

Very notably: LLDB and other internal clients would need to be migrated to ORC before we could remove MCJIT.

What do you think? Do you have any points that you would like to add?

* almost all clients: We should make a good faith effort here, and I think it’s likely to be enough for all clients, but I don’t think we need to commit ourselves to any heroics: if a small number of clients have use-cases that prove extremely difficult to support that shouldn’t prevent deprecation or removal of MCJIT and RuntimeDyld.

Yes, though I think it would be useful for this discussion to understand how many open source users there are and whether they already have ORC support or have work in progress. One I’m aware of is Mesa, where helpfully it seems ORC support was merged a couple of weeks ago and there’s some movement in the direction of enabling it by default.

One annoying issue we’ve found when switching Julia to JITLink is orc::InProcessMemoryMapper makes too many memory mappings · Issue #63236 · llvm/llvm-project · GitHub. The gist of it is that the memory allocator for the JIT doesn’t make contiguous mappings, causing it to allocate potentially tens of thousands of pages. While this isn’t that much memory, it blows past the (unfortunately low) vm.max_map_count limit on some machines.
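For anyone wanting to check whether their own JIT process is approaching this limit, here is a rough sketch in Python. It assumes a Linux system, where each line of /proc/<pid>/maps corresponds approximately to one mapping and where vm.max_map_count commonly defaults to 65530. The function names are made up for illustration and are not part of LLVM or Julia.

```python
def count_mappings(maps_text):
    """Count mappings in /proc/<pid>/maps-formatted text (one per non-empty line)."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def mapping_headroom(maps_text, max_map_count):
    """Return how many more mappings fit under the limit (negative if over it)."""
    return max_map_count - count_mappings(maps_text)


def read_current_headroom(pid="self"):
    """Linux-only helper (hypothetical name): read live values from procfs."""
    with open(f"/proc/{pid}/maps") as f:
        maps_text = f.read()
    with open("/proc/sys/vm/max_map_count") as f:
        limit = int(f.read())
    return mapping_headroom(maps_text, limit)
```

Calling read_current_headroom() with the PID of the JIT'ing process before and after a large compile gives a quick sense of how many mappings the JIT added, without needing a debugger.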

Tangential question: what will happen to the ExecutionEngine interface? As far as I see, without MCJIT, the only implementation will be the interpreter – which itself isn’t well-maintained either. Will these be kept or is the plan to eventually remove these as well?

(An even more tangential comment: I do have some ideas for a better interpreter, but don’t expect to have time to implement it myself. Maybe I’ll find someone else who is interested.)

Thanks for all the work and the long deprecation period.

Is there a reliable method to determine whether obsolete APIs are being used in a large codebase without in-depth knowledge of JIT technicalities?

For instance, is there a list of headers that, if excluded, would likely indicate no reliance on obsolete APIs? I assume that the following two might indicate reliance on obsoleted APIs.

llvm/ExecutionEngine/MCJIT.h
llvm/ExecutionEngine/RuntimeDyld.h

It seems that bcc/bpftrace need migration. Trivial bpftrace patch: codegen: Remove unused MCJIT.h include by MaskRay · Pull Request #3362 · bpftrace/bpftrace · GitHub

Agreed. Gathering this information is one of the aims of [RFC] Add deprecation warnings to MCJIT and RuntimeDyld. Without automated telemetry we’ll be relying on JIT API clients to let us know their status.

I suspect the LLVM Weekly will help here too given its wide readership. Would you mind mentioning the deprecation/removal and directing people to this discussion? :)

If you’re a JIT API client reading this thread, we’d love to hear about the status of your project:

Are you still on MCJIT?
Have you already migrated to ORC? How did it go?
Are you in the process of migrating? How is it going? What would make the process easier?

Re: orc::InProcessMemoryMapper makes too many memory mappings · Issue #63236 · llvm/llvm-project · GitHub – I think this should be straightforward to fix.

I don’t know whether that issue represents a regression from MCJIT, but we’ll want a way to tag issues that do represent regressions for the sake of this discussion: I’ve created an orc-mcjit-regression tag (Issues · llvm/llvm-project · GitHub) that we can use for that.

Tangential question: what will happen to the ExecutionEngine interface?

This is an important question for deprecation! MCJIT instances are never created directly. Instead, API clients use an EngineBuilder object to create an ExecutionEngine, and MCJIT is the default implementation type. We can’t use a warning on MCJIT because it would only trigger inside LLVM, never in client code (and wouldn’t be visible at all to clients consuming LLVM binary libraries).

I think we’ll need to apply the deprecation warning to either ExecutionEngine or EngineBuilder, probably with a message to explain exactly what’s being deprecated and a link to the removal discussion.

I think that we should deprecate both ExecutionEngine and EngineBuilder and aim to remove both of them, but want to gather more feedback before making a final decision on that. At the least we should add a deprecation warning to EngineBuilder along the lines of:

“MCJIT is deprecated. Please move to ORC or the Interpreter. See discussion at tiny.url/xxxxxx”*

(Interpreter users will find this confusing, but they’re exceedingly rare and we can clarify the situation for them quickly in the discussion: “If you’re already using the interpreter you can silence this warning by doing X”)

Tangential question: what will happen to the ExecutionEngine interface?

This question is very relevant to the deprecation discussion, so I decided to partially address this over in
[RFC] Removing MCJIT and RuntimeDyld - #11 by lhames.

I’m inclined to remove both ExecutionEngine and EngineBuilder since, as you point out, the Interpreter would be the only implementation of them.

I don’t have strong feelings about whether we keep the interpreter or not, but I think the question is independent of MCJIT removal: The interpreter has its own distinct purpose and use-cases. For now I think we should assume that we’re keeping the Interpreter as an independent system and only removing MCJIT.

Is there a reliable method to determine whether obsolete APIs are being used in a large codebase without in-depth knowledge of JIT technicalities?

I’m not aware of any simple deprecation warning that we could apply to MCJIT only (see discussion in [RFC] Removing MCJIT and RuntimeDyld - #11 by lhames).

I think that static analysis would be reliable, but I don’t think we can assume that clients would run it.

For instance, is there a list of headers that, if excluded, would likely indicate no reliance on obsolete APIs? I assume that the following two might indicate reliance on obsoleted APIs.

llvm/ExecutionEngine/MCJIT.h
llvm/ExecutionEngine/RuntimeDyld.h

This is a great idea. Projects will definitely need to stop including them before MCJIT is removed (whether or not they’re actually using MCJIT), and in the common case MCJIT users will include them. (A few clients might include ExecutionEngine only and get their MCJIT instance from some precompiled library, but I expect that that’s rare).
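As a rough illustration of the header-based check discussed above, here is a minimal Python sketch that walks a source tree and reports includes of the two headers named in this thread. The function name and the list of file suffixes are assumptions made for the example. As noted, a hit does not prove MCJIT is actually used, and a clean result does not rule out clients that reach MCJIT through ExecutionEngine only.

```python
import re
from pathlib import Path

# Headers named in this thread as likely indicators of MCJIT/RuntimeDyld use.
LEGACY_HEADERS = (
    "llvm/ExecutionEngine/MCJIT.h",
    "llvm/ExecutionEngine/RuntimeDyld.h",
)

# Matches `#include "..."` or `#include <...>` and captures the included path.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')

# Assumed set of C/C++ file suffixes to scan; adjust for your codebase.
SOURCE_SUFFIXES = {".h", ".hpp", ".hh", ".c", ".cc", ".cpp", ".cxx"}


def find_legacy_includes(root):
    """Return sorted (file, line number, header) tuples for legacy JIT includes."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = INCLUDE_RE.match(line)
            if match and match.group(1) in LEGACY_HEADERS:
                hits.append((str(path), lineno, match.group(1)))
    return sorted(hits)
```

Running find_legacy_includes("path/to/project") and printing the result gives a starting list of files to audit, including stale includes that can simply be deleted.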

Well, I guess calling it a regression is not correct. We have a custom memory manager in Julia (julia/src/cgmemmgr.cpp at master · JuliaLang/julia · GitHub), though it doesn’t implement the JITLink interface.

I believe Swift is also using MCJIT for “interpreter mode” (running Swift scripts, à la swift script.swift). We’ll want to keep an eye on this too.

@al45tair Swift migrated to ORC a few years ago: [Immediate] Switch immediate mode from MCJIT to LLJIT. by lhames · Pull Request #29863 · swiftlang/swift · GitHub.

Interesting. It’s definitely using RuntimeDyld though, at least on Windows, because I just fixed a bug that was stopping it from working.

Yep – we’re still using RuntimeDyld (via the RTDyldObjectLinkingLayer) for some targets on ORC. In the near future we’ll want to switch to JITLink (via ObjectLinkingLayer) on all available platforms by default so that clients have time to live on it before we remove RuntimeDyld.