ORC and MCJIT clients: Heads up, API breaking changes in the pipeline.

Hi All,

I’m going to be making some API breaking changes to the ORC APIs, and to the RuntimeDyld class (which underlies MCJIT). The changes may affect MCJIT clients but are unlikely to. Where they do, the fixes are likely to be trivial. ORC clients will be affected, but the fixes should also be straightforward.

I have three upcoming changes in mind:

  1. RuntimeDyld (the linker underlying MCJIT and ORC) will now search for symbols using the SymbolResolver’s findSymbolInLogicalDylib first, before falling back to the findSymbol method if findSymbolInLogicalDylib returns a null result.

This is a step towards making RuntimeDyld behave more like a static linker: we prefer to link against symbols defined in the same “logical dylib” as the module being JIT’d, before linking against external symbols.

For clients that have not implemented findSymbolInLogicalDylib (I suspect this covers most clients) this change will have no effect: the default version of findSymbolInLogicalDylib will be called and will return null, and RuntimeDyld will fall back to calling findSymbol as it did before.

For clients that have implemented findSymbolInLogicalDylib: Beware that any symbol found via this call will now shadow symbols that might have been found via findSymbol. It is unlikely that anyone is relying on non-shadowing so I don’t expect any trouble here, but if you do see any changes in symbol resolution after this change (which I expect to land today) this is something to keep in mind.
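To make the new lookup order concrete, here is a minimal self-contained sketch. It does not use the real LLVM headers: `ModelSymbol`, `ModelResolver` and `resolve` are hypothetical stand-ins, and only the method names findSymbolInLogicalDylib and findSymbol come from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a JIT symbol; a default-constructed one is "null".
struct ModelSymbol {
  std::uint64_t Address = 0;
  bool Found = false;
  ModelSymbol() = default;
  explicit ModelSymbol(std::uint64_t A) : Address(A), Found(true) {}
  explicit operator bool() const { return Found; }
};

// Hypothetical model of a symbol resolver (not the real LLVM class).
struct ModelResolver {
  std::map<std::string, std::uint64_t> LogicalDylib; // same logical dylib
  std::map<std::string, std::uint64_t> External;     // everything else

  ModelSymbol findSymbolInLogicalDylib(const std::string &Name) const {
    auto I = LogicalDylib.find(Name);
    if (I == LogicalDylib.end())
      return ModelSymbol(); // the default behaviour: null result
    return ModelSymbol(I->second);
  }

  ModelSymbol findSymbol(const std::string &Name) const {
    auto I = External.find(Name);
    if (I == External.end())
      return ModelSymbol();
    return ModelSymbol(I->second);
  }
};

// The lookup order described above: logical dylib first, then fall back.
ModelSymbol resolve(const ModelResolver &R, const std::string &Name) {
  if (ModelSymbol S = R.findSymbolInLogicalDylib(Name))
    return S; // shadows any definition findSymbol would have returned
  return R.findSymbol(Name);
}
```

With "foo" defined in both maps, `resolve` returns the logical-dylib address; a resolver whose `LogicalDylib` map is empty behaves exactly as it did before the change.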

  2. MemoryManager deregisterEHFrames will be removed.

Memory managers should own their registered EH frames and deregister them on destruction. This will bring EH frame ownership in line with memory ownership: memory managers are already responsible for freeing the memory that they’ve allocated when they are destructed.

To transition, clients should track the frames registered by their class, then deregister them in the destructor. For in-tree memory managers (and RuntimeDyld itself) this bug is being tracked by https://llvm.org/PR23991 .
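The transition could look roughly like the sketch below. It is a self-contained model rather than a real memory manager: `OwningMemoryManager`, `EHFrameRecord`, `platformRegister` and `platformDeregister` are hypothetical names, the registerEHFrames parameter list is an assumption, and a counter stands in for the platform's actual registration calls so the effect is observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one registered EH frame section.
struct EHFrameRecord {
  std::uint8_t *Addr;
  std::uint64_t LoadAddr;
  std::size_t Size;
};

// Stand-ins for the platform registration calls; the counter only exists so
// that the sketch can be checked.
static int LiveRegistrations = 0;
static void platformRegister(const EHFrameRecord &) { ++LiveRegistrations; }
static void platformDeregister(const EHFrameRecord &) { --LiveRegistrations; }

// Hypothetical memory manager that owns the frames it registers.
class OwningMemoryManager {
public:
  OwningMemoryManager() = default;
  OwningMemoryManager(const OwningMemoryManager &) = delete;
  OwningMemoryManager &operator=(const OwningMemoryManager &) = delete;

  // Deregister on destruction, in reverse order of registration.
  ~OwningMemoryManager() {
    for (auto I = Frames.rbegin(), E = Frames.rend(); I != E; ++I)
      platformDeregister(*I);
  }

  // Track every frame this instance registers.
  void registerEHFrames(std::uint8_t *Addr, std::uint64_t LoadAddr,
                        std::size_t Size) {
    EHFrameRecord F{Addr, LoadAddr, Size};
    platformRegister(F);
    Frames.push_back(F);
  }

  std::size_t numFrames() const { return Frames.size(); }

private:
  std::vector<EHFrameRecord> Frames;
};
```

Nothing outside the class needs to call a deregister method any more: when the manager goes out of scope, `LiveRegistrations` drops back to its previous value.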

  3. Some operations in the ORC APIs, RuntimeDyld, SymbolResolver and MemoryManager will be modified to return Error/Expected.

MCJIT and ORC both support remote execution of JIT’d code via RPC, but there is currently no way to cleanly bail out and report failure in the case of an RPC error. Adding error returns provides the mechanism we need.

When the interfaces are changed to add error returns, clients will see compile-time errors for any custom ORC-based stacks, derived symbol resolvers, or memory managers. Most clients will just be able to update their offending return types and leave their function body as-is (for Expected returns) or return llvm::Error::success() (for Error returns). Clients who are actually using the remote-JITing features will now be able to return any errors that they had previously been dropping or bailing out on. This change is being tracked by https://llvm.org/PR22612 .
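As a rough illustration of the shape of that update, here is a self-contained sketch. It deliberately avoids the LLVM headers: `ModelError` and `ModelExpected` are simplified stand-ins for llvm::Error and llvm::Expected (the real classes do considerably more, such as enforcing that errors are checked), and `lookupRemote` and `notifyRemote` are hypothetical client functions, not part of any real API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Simplified stand-in for llvm::Error: converts to true on failure.
struct ModelError {
  std::string Msg;
  static ModelError success() { return ModelError{}; }
  explicit operator bool() const { return !Msg.empty(); }
};

// Simplified stand-in for llvm::Expected<T>: converts to true on success.
// (This model requires T to be default-constructible.)
template <typename T> class ModelExpected {
public:
  ModelExpected(T V) : Val(std::move(V)) {}
  ModelExpected(ModelError E) : Err(std::move(E)) {}
  explicit operator bool() const { return !Err; }
  T &operator*() { return Val; }
  ModelError takeError() { return std::move(Err); }

private:
  T Val{};
  ModelError Err;
};

// Hypothetical lookup. Only the return type changes from std::uint64_t; the
// success path keeps returning a plain address, and the RPC failure that
// used to be dropped can now be reported.
ModelExpected<std::uint64_t> lookupRemote(const std::string &Name,
                                          bool ConnectionUp) {
  if (!ConnectionUp)
    return ModelError{"RPC connection lost while resolving " + Name};
  return std::uint64_t{0x4000};
}

// Hypothetical notification. Formerly void; now returns success explicitly.
ModelError notifyRemote(bool ConnectionUp) {
  if (!ConnectionUp)
    return ModelError{"RPC connection lost during notification"};
  return ModelError::success();
}
```

A caller checks the result before using it, for example `if (auto Addr = lookupRemote("foo", Up)) use(*Addr); else report(Addr.takeError());`, where `use` and `report` are placeholders for whatever the client does next.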

I’ve held off making these changes for a while, because I know there are a lot of out-of-tree JIT clients and I want to avoid breaking your code. I think this churn is worth the pain though - the resulting APIs will have less historical baggage, be more error-proof, and enable some new features.

Please let me know if you have any questions or run into any trouble as these changes land - I’ll be happy to help out.

Cheers,
Lang.

Hi Lang, thanks for announcing. Would be great if you could send another short notice as soon as the actual patch exists.
Best, Stefan

Hi All,

Stage 1 landed just after I sent this and it looks like there was minimal fallout from that for MCJIT users, but it broke common symbol support in the ORC Lazy JIT. To fix this I’ve replaced RuntimeDyld::SymbolInfo with the lazily-materializing JITSymbol (which I’ve moved from ORC to ExecutionEngine) in r277386. Most clients won’t see any breakage due to this (neither Swift nor LLDB required updates), but ORC users will need to rename RuntimeDyld::SymbolInfo to JITSymbol to get their code compiling again. My apologies for the breakage, but this should be a big improvement in the long run: it will allow me to fix common symbol support, and should also make it possible to support weak symbols correctly.
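For anyone unfamiliar with what "lazily-materializing" means here, the idea can be sketched as below. This is a self-contained model, not the real JITSymbol: `LazySymbol` and its members are hypothetical names, and the only point is that the address is computed on the first request rather than when the symbol is found.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical model of a lazily-materializing symbol.
class LazySymbol {
public:
  using GetAddressFn = std::function<std::uint64_t()>;

  // A default-constructed symbol is a null ("not found") result.
  LazySymbol() = default;

  // Finding a symbol only records how to materialize it.
  explicit LazySymbol(GetAddressFn Fn)
      : Materialize(std::move(Fn)), Found(true) {}

  explicit operator bool() const { return Found; }

  // Materialize on first use, then serve the cached address.
  std::uint64_t getAddress() {
    if (Materialize) {
      Cached = Materialize();
      Materialize = nullptr;
    }
    return Cached;
  }

private:
  GetAddressFn Materialize;
  std::uint64_t Cached = 0;
  bool Found = false;
};
```

A resolver can therefore answer "is this symbol defined?" without triggering compilation, which is the property that an eagerly computed address cannot offer.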

The changes I mentioned in stages 2 and 3 in my original email are still in the works, but I don’t have a schedule for them yet.

Cheers,
Lang.

Hi All,

Unfortunately I got side-tracked for a while with other work. I’m finally getting back around to this now.

A patch for part (2) of these changes is currently out for review as https://reviews.llvm.org/D32829 .

I have a side branch that contains the changes suggested in part (3) above (especially the Error-izing of the layer API) and more:

  • Changes addObjectSet and addModuleSet to addObject and addModule singular.
    Almost everyone adds single objects/modules, so this will simplify the API. The original motivation was to allow co-linking of multiple objects, but I think that can be better achieved (if/when we need it) by adding more smarts to the linking layer.

  • Removes the MemoryManager argument.
    The memory manager argument is specific to local linking layers. If you want your JIT’s base layer to ship the whole object file to the target machine (so that you can cache code there for re-use) then the memory manager argument is useless (and confusing). Rather than pass it down as an argument, RTDyldMemoryManager’s constructor will take a GetMemoryManager argument that it will call for each new object added.

  • Adds a RemoteObjectLayer
    This can be used as the base layer of a JIT to send objects to a remote machine (enabling JIT’d code to be cached and re-used on the target, rather than having to be re-sent each time).

These changes are mostly simplifications or extensions to the existing APIs, rather than drastic changes. They will break existing clients, but the changes needed to adapt (returning your memory manager from GetMemoryManager rather than as an argument, checking your errors) are usually obvious and mechanical.
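To show what "returning your memory manager from GetMemoryManager rather than as an argument" could look like, here is a self-contained sketch. `ModelLinkingLayer` and `ModelMemoryManager` are hypothetical stand-ins rather than the real layer and memory manager classes, and the callback type (a `std::function` returning a `std::shared_ptr`) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory manager.
struct ModelMemoryManager {
  std::size_t Id;
};

// Hypothetical model of a local linking layer.
class ModelLinkingLayer {
public:
  using GetMemoryManagerFn =
      std::function<std::shared_ptr<ModelMemoryManager>()>;

  // The callback is supplied once, at construction time.
  explicit ModelLinkingLayer(GetMemoryManagerFn GetMemMgr)
      : GetMemMgr(std::move(GetMemMgr)) {}

  // Singular addObject with no memory manager argument: the layer asks the
  // callback for one each time a new object is added.
  std::size_t addObject(std::string ObjectBytes) {
    Linked.push_back({std::move(ObjectBytes), GetMemMgr()});
    return Linked.size() - 1; // handle for the added object
  }

  const ModelMemoryManager &memoryManagerFor(std::size_t Handle) const {
    return *Linked.at(Handle).MemMgr;
  }

private:
  struct LinkedObject {
    std::string Bytes;
    std::shared_ptr<ModelMemoryManager> MemMgr;
  };

  GetMemoryManagerFn GetMemMgr;
  std::vector<LinkedObject> Linked;
};
```

A client that previously passed a fresh memory manager with every add call would move that construction into the lambda given to the layer's constructor. Layers that ship objects to a remote target simply never take such a callback.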

So, before I start landing these changes in tree: Would anyone like to review them on Phabricator? If so let me know and I’ll start posting them, otherwise I’ll start committing and we can discuss changes post-commit.

Cheers,
Lang.

  • Changes addObjectSet and addModuleSet to addObject and addModule singular.
    Almost everyone adds single objects/modules, so this will simplify the API. The original motivation was to allow co-linking of multiple objects, but I think that can be better achieved (if/when we need it) by adding more smarts to the linking layer.

Sounds generally good to me (rip out the unused complexity until there’s a use case & then consider how best to support it).

  • Removes the MemoryManager argument.

The memory manager argument is specific to local linking layers. If you want your JIT’s base layer to ship the whole object file to the target machine (so that you can cache code there for re-use) then the memory manager argument is useless (and confusing). Rather than pass it down as an argument, RTDyldMemoryManager’s constructor will take a GetMemoryManager argument that it will call for each new object added.

I haven’t really looked in detail at the APIs here - but that last sentence sounds confusing. Why would a RTDyldMemoryManager be passed a callback to get a memory manager - when it /is/ a memory manager?

Probably a case of possible naming improvements? Or maybe not.

Sorry - that was a typo. RTDyldObjectLinkingLayer will take the GetMemoryManager argument. :)

- Lang.