Thoughts about ExecutionEngine/MCJIT interface

Hi,

I think ExecutionEngine, as a common interface for both Interpreter and MCJIT, is almost useless in its current form. ExecutionEngine has separate methods for similar or identical features provided by Interpreter and MCJIT. For example, to get a pointer to a function you call getPointerToFunction() for Interpreter but getFunctionAddress() for MCJIT.

Personally, I’m using MCJIT and would like access to some methods that are not available from ExecutionEngine. E.g. I would sometimes like to use getSymbolAddress() instead of getFunctionAddress(), because getFunctionAddress() does some additional work that I’m sure has already been done.

Maybe it’s time to face the truth that Interpreter-based and MCJIT-based solutions are not that similar and need different interfaces. Or maybe some unification is possible?

My proposals / discussion starting points:

  1. Expose the MCJIT header in the public API. That would allow users to cast an ExecutionEngine instance to an MCJIT instance.
  2. Separate the Interpreter and MCJIT interfaces and add both to the API. ExecutionEngine can still be a base class for the part that really is common (like the module list).
  3. Try to alter the ExecutionEngine interface to unify the features that Interpreter and MCJIT share. Is it possible to have one getFunction() method?

- Paweł
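
For readers following along, here is a minimal sketch of what proposals 1 and 2 could look like in practice: a base class that keeps only the genuinely common part (the module list), engine-specific subclasses, and a kind-checked downcast. Every name below (ToyEngineBase, ToyInterpreter, ToyJIT, asJIT) is a hypothetical stand-in invented for illustration. None of it is LLVM code, and the real classes may be organized quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a module; not llvm::Module.
struct ToyModule { std::string Name; };

// Common part only: ownership of the module list.
class ToyEngineBase {
public:
  enum Kind { InterpreterKind, JITKind };
  explicit ToyEngineBase(Kind K) : TheKind(K) {}
  virtual ~ToyEngineBase() = default;

  Kind getKind() const { return TheKind; }
  void addModule(std::unique_ptr<ToyModule> M) { Modules.push_back(std::move(M)); }
  std::size_t numModules() const { return Modules.size(); }

private:
  Kind TheKind;
  std::vector<std::unique_ptr<ToyModule>> Modules;
};

// Interpreter-specific interface: runs by name, exposes no addresses.
class ToyInterpreter : public ToyEngineBase {
public:
  ToyInterpreter() : ToyEngineBase(InterpreterKind) {}
  // Placeholder "execution": returns the length of the name.
  int runFunction(const std::string &Name) { return static_cast<int>(Name.size()); }
};

// JIT-specific interface: exposes a raw symbol lookup.
class ToyJIT : public ToyEngineBase {
public:
  ToyJIT() : ToyEngineBase(JITKind) {}
  void defineSymbol(const std::string &Name, std::uint64_t Addr) { Symbols[Name] = Addr; }
  // Returns 0 when the symbol is unknown.
  std::uint64_t getSymbolAddress(const std::string &Name) const {
    auto I = Symbols.find(Name);
    return I == Symbols.end() ? 0 : I->second;
  }

private:
  std::map<std::string, std::uint64_t> Symbols;
};

// Kind-checked downcast that does not rely on RTTI.
// Returns nullptr if the engine is not a ToyJIT.
ToyJIT *asJIT(ToyEngineBase *E) {
  return E->getKind() == ToyEngineBase::JITKind ? static_cast<ToyJIT *>(E) : nullptr;
}
```

With this shape, client code that only manages modules can hold a ToyEngineBase pointer, while code that needs getSymbolAddress() calls asJIT() and handles the nullptr case.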

Hi Pawel,

I agree. ExecutionEngine, in its current form, is unhelpful. I’d be in favor of cutting the common interface back to something like:

class ExecutionEngine {
public:
  virtual void addModule(std::unique_ptr<Module> M) = 0;
  virtual void* getGlobalValueAddress(const GlobalValue *GV) = 0;
  virtual GenericValue runFunction(const Function *F,
                                   const std::vector<GenericValue> &Args) = 0;
};

That’s the obvious common functionality that both the interpreter and MCJIT provide. Beyond that I think things get pretty implementation specific.

For what it’s worth, this is an issue that I’m trying to address with the new Orc JIT APIs. Those expose the internals directly to give you more control. If you don’t want to be able to switch the underlying execution-engine, they may be a good fit for your use-case.

Cheers,
Lang.

Another question: Lang, when do you think it’ll be ok to move it to the C Bindings?

Hi Hayden,

Which aspect are you interested in seeing C bindings for?

ExecutionEngine already has C bindings. We’d need buy-in from clients before we considered any change that would affect the C-API, including restricting the interface, and we’d have to provide a transition plan to avoid rev-lock. That’s probably doable though: I don’t think anyone’s particularly enamored of the current interface.

We could consider exposing MCJIT functionality via the C-API, but we’d have to carefully pin down how the aspects we expose are expected to behave.

The Orc APIs are trickier. There are two ways we could go:

(1) Try to expose the full modular JIT components concept. In this case we’d have to add a type-erasing layer (AbstractJITLayer?) that uses virtual methods to call base layers. Then we could have a C API for constructing and connecting the layers. This would give C users the closest possible experience to C++ users (with a little runtime overhead for the virtual calls). On the other hand it would be more work, and we’d want to be careful about how and when we expose the layers, since they’re still so new.
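
To make option (1) a bit more concrete, here is a small sketch of the type-erasure idea: templated layers that normally call their base layer statically are wrapped in an adapter that implements a virtual interface, so layers can be connected at run time. The name AbstractJITLayer is borrowed from the question mark above. ToyLinkingLayer, ToyPrefixLayer and ErasedLayer are made up for this illustration, and the add/find signatures are assumptions, not the Orc layer concept.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Type-erased interface: every call goes through a virtual method.
class AbstractJITLayer {
public:
  virtual ~AbstractJITLayer() = default;
  virtual void add(const std::string &Name, std::uint64_t Addr) = 0;
  virtual std::uint64_t find(const std::string &Name) = 0;
};

// Bottom toy layer: records name -> address. find() returns 0 if unknown.
class ToyLinkingLayer {
public:
  void add(const std::string &Name, std::uint64_t Addr) { Table[Name] = Addr; }
  std::uint64_t find(const std::string &Name) {
    auto I = Table.find(Name);
    return I == Table.end() ? 0 : I->second;
  }

private:
  std::map<std::string, std::uint64_t> Table;
};

// Templated toy layer: prepends a prefix, then forwards to its base layer.
// BaseLayerT only needs add() and find() with matching signatures.
template <typename BaseLayerT> class ToyPrefixLayer {
public:
  ToyPrefixLayer(BaseLayerT &B, std::string P) : Base(B), Prefix(std::move(P)) {}
  void add(const std::string &Name, std::uint64_t Addr) { Base.add(Prefix + Name, Addr); }
  std::uint64_t find(const std::string &Name) { return Base.find(Prefix + Name); }

private:
  BaseLayerT &Base;
  std::string Prefix;
};

// Adapter: hides any concrete layer type behind AbstractJITLayer.
template <typename LayerT> class ErasedLayer : public AbstractJITLayer {
public:
  explicit ErasedLayer(LayerT &L) : Layer(L) {}
  void add(const std::string &Name, std::uint64_t Addr) override { Layer.add(Name, Addr); }
  std::uint64_t find(const std::string &Name) override { return Layer.find(Name); }

private:
  LayerT &Layer;
};
```

Instantiating ToyPrefixLayer<AbstractJITLayer> gives a layer whose base is chosen at run time, which is the property a C API for "constructing and connecting the layers" would need. The cost is one virtual call per layer boundary.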

(2) Pick some canonical stack* and just expose that. This would be less work, and wouldn’t risk constraining the layer implementations at all (provided you could still configure them somehow to provide the canonical stack functionality). On the other hand it’d mean C users wouldn’t get the same experience from the APIs as C++ users.

* If we’re only going to expose one stack, it should probably be The Lot (at least as it stands at the moment):
  1. CompileOnDemandLayer.
  2. LazyEmittingLayer.
  3. IRCompilingLayer.
  4. ObjectLinkingLayer.
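
As a rough illustration of option (2), the sketch below hides one fixed C++ stack behind an opaque handle and a handful of extern "C" functions, which is the general pattern C bindings tend to follow. All names (ToyCanonicalStack, ToyJITStackRef, ToyCreateJITStack and so on) are hypothetical, and the "stack" is just a symbol table standing in for the four layers listed above. This is not the LLVM C API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Toy C++ object standing in for the one canonical layer stack.
class ToyCanonicalStack {
public:
  // Registers a "module" that defines a single symbol; returns its handle.
  unsigned addModule(const std::string &Symbol, std::uint64_t Addr) {
    unsigned H = NextHandle++;
    Modules[H] = std::make_pair(Symbol, Addr);
    return H;
  }
  void removeModule(unsigned H) { Modules.erase(H); }
  // Returns 0 when no live module defines the symbol.
  std::uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &KV : Modules)
      if (KV.second.first == Name)
        return KV.second.second;
    return 0;
  }

private:
  unsigned NextHandle = 1;
  std::map<unsigned, std::pair<std::string, std::uint64_t>> Modules;
};

extern "C" {
// Opaque handle: C clients never see the C++ type.
typedef struct ToyOpaqueJITStack *ToyJITStackRef;
typedef unsigned ToyModuleHandle;

ToyJITStackRef ToyCreateJITStack(void) {
  return reinterpret_cast<ToyJITStackRef>(new ToyCanonicalStack());
}
void ToyDisposeJITStack(ToyJITStackRef S) {
  delete reinterpret_cast<ToyCanonicalStack *>(S);
}
ToyModuleHandle ToyAddModule(ToyJITStackRef S, const char *Symbol, uint64_t Addr) {
  return reinterpret_cast<ToyCanonicalStack *>(S)->addModule(Symbol, Addr);
}
void ToyRemoveModule(ToyJITStackRef S, ToyModuleHandle H) {
  reinterpret_cast<ToyCanonicalStack *>(S)->removeModule(H);
}
uint64_t ToyGetSymbolAddress(ToyJITStackRef S, const char *Name) {
  return reinterpret_cast<ToyCanonicalStack *>(S)->getSymbolAddress(Name);
}
}
```

Because the handle is opaque, the layers behind it can be rearranged without changing the C signatures, which matches the "wouldn’t risk constraining the layer implementations" point above.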

Cheers,
Lang.

I was referring to the ORC API. MCJIT is already exposed via C bindings.

Your (2) sounds the easiest to me. It also seems more in line with the current C bindings approach (which, I might add, is nicer to work with than the C++ API) and with how I would expect an on-request JIT API to be used.

So I put my vote for #2.

Hi Pawel,

I believe that an early (working) version of Orc is in the 3.6 release, and work on Orc is continuing on trunk.

The MCJIT API (or at least the ExecutionEngine API) will stay with us for some time, but hopefully only the API. I am hoping that the MCJIT class itself can be deprecated in the future in favor of the OrcMCJITReplacement class. Given that, I’d rather not expose MCJIT directly: Any work we do there will have to be duplicated in OrcMCJITReplacement if everything goes to plan.

Orc should be a good fit for any MCJIT client: It was designed with MCJIT’s use-cases in mind, and built on the same conceptual framework (i.e. replicating the static pipeline by linking object files in memory). Its aim is almost exactly what you’ve described: To tidy MCJIT up and expose the internals so that people can add new features.

I’d be interested to hear how you go porting your project to Orc, and I would be happy to help out where I can. What features of MCJIT are you using now? And what do you want to add? I’d suggest checking out the Orc/Kaleidoscope tutorials (see llvm/examples/Kaleidoscope/Orc/) and the OrcMCJITReplacement class (see llvm/lib/ExecutionEngine/Orc/OrcMCJITReplacement.) to get an idea of what features are available now. It sounds like you’ll want to add to this, but then that’s the purpose of these new APIs.

As a sales pitch for Orc, I’ve included the definition of the initial JIT from the Orc/Kaleidoscope tutorials below. As you can see, you can get a functioning “custom” JIT up and running with a page of code. This basic JIT just lets you throw LLVM modules at it and execute functions from them. Starting from this point, with 149 lines worth of changes*, you can build a JIT that lazily compiles functions from ASTs on first call (no need to even IRGen up front).

Cheers,
Lang.

* As reported by:
    diff examples/Kaleidoscope/Orc/{initial,fully_lazy}/toy.cpp | wc -l

class KaleidoscopeJIT {
public:
  typedef ObjectLinkingLayer<> ObjLayerT;
  typedef IRCompileLayer<ObjLayerT> CompileLayerT;
  typedef CompileLayerT::ModuleSetHandleT ModuleHandleT;

  KaleidoscopeJIT(TargetMachine &TM)
    : Mang(TM.getDataLayout()),
      CompileLayer(ObjectLayer, SimpleCompiler(TM)) {}

  std::string mangle(const std::string &Name) {
    std::string MangledName;
    {
      raw_string_ostream MangledNameStream(MangledName);
      Mang.getNameWithPrefix(MangledNameStream, Name);
    }
    return MangledName;
  }

  ModuleHandleT addModule(std::unique_ptr<Module> M) {
    auto MM = createLookasideRTDyldMM<SectionMemoryManager>(
                [&](const std::string &Name) {
                  return findSymbol(Name).getAddress();
                },
                [](const std::string &S) { return 0; } );

    return CompileLayer.addModuleSet(singletonSet(std::move(M)), std::move(MM));
  }

  void removeModule(ModuleHandleT H) { CompileLayer.removeModuleSet(H); }

  JITSymbol findSymbol(const std::string &Name) {
    return CompileLayer.findSymbol(Name, true);
  }

  JITSymbol findUnmangledSymbol(const std::string Name) {
    return findSymbol(mangle(Name));
  }

private:
  Mangler Mang;
  ObjLayerT ObjectLayer;
  CompileLayerT CompileLayer;
};
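
The "lazily compiles functions on first call" idea mentioned above can be illustrated without any LLVM machinery. The sketch below is a deliberately simplified model: each function is registered together with a deferred "compile" step, which runs the first time the function is called and whose result is cached. ToyLazyJIT and its members are invented names, and the fully_lazy tutorial almost certainly achieves this differently (it works with real compiled code rather than std::function), so treat this purely as a picture of the control flow.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

class ToyLazyJIT {
public:
  using CompiledFn = std::function<int(int)>;
  // A deferred compile step that produces the callable when invoked.
  using Compiler = std::function<CompiledFn()>;

  // Register a function without compiling it.
  void addLazy(const std::string &Name, Compiler C) { Pending[Name] = std::move(C); }

  // Compile on first call, then reuse the cached result.
  int call(const std::string &Name, int Arg) {
    auto Ready = Compiled.find(Name);
    if (Ready == Compiled.end()) {
      auto P = Pending.find(Name);
      if (P == Pending.end())
        throw std::out_of_range("unknown function: " + Name);
      ++NumCompiles;
      Ready = Compiled.emplace(Name, P->second()).first;
      Pending.erase(P);
    }
    return Ready->second(Arg);
  }

  // Number of compile steps that have actually run.
  unsigned numCompiles() const { return NumCompiles; }

private:
  std::map<std::string, Compiler> Pending;
  std::map<std::string, CompiledFn> Compiled;
  unsigned NumCompiles = 0;
};
```

A function that is registered but never called is never compiled, which is the saving the lazy approach is after.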

Lang, I saw the Kaleidoscope tutorial, nice!

Now maybe we can also have the LLVM C bindings for this surface area? :)

I’m happy to help if I am given some guidance.

Hi Hayden,

I haven’t had a chance to yet, but I’ll introduce a lazily compiling stack into LLI soon. When that goes in it might be a good candidate for exposing via the C-API.

Cheers,
Lang.