Some basic questions regarding MCJIT and Kaleidoscope sample

Hi,

I am building a new JIT compiler for Lua (actually a derivative of
Lua), and am planning to use LLVM for this. I have been trying out some
basic functions using LLVM 3.5.1. I have been puzzled by one aspect of
the MCJIT versions of the Kaleidoscope sample, and would hugely
appreciate some insight.

Can a single MCJIT instance be used to manage several modules?
Why is a separate MCJIT instance created for each module?

Unfortunately the tutorial text does not explain the rationale for this.

At the moment in my implementation I am putting each compiled function
in its own module and creating an engine for it. This is so that each
function can be managed independently and linked to garbage collection
in Lua. However I do not know if there is a better way - for example
creating a shared engine across all modules. I would also like to
understand the trade-offs with this approach versus other approaches.

Many thanks

Regards
Dibyendu

We (and by ‘we’ I mean ‘Lang’ (cc’d)) have recently been experimenting with improvements/replacements to the MCJIT in the form of Orc, a new layered, composable JIT API. There are examples of this in the Kaleidoscope code samples (llvm/examples/Kaleidoscope/Orc) though not comprehensive tutorials as yet.

Hi Dibyendu,

A single MCJIT instance can notionally manage multiple modules, but there are caveats (which I’m afraid I don’t remember off the top of my head) that make it unattractive in practice. I believe most clients opt for something like the ExecutionEngine-per-Module model used in the Kaleidoscope tutorials.

As Dave mentioned, I’m also working on some new JIT APIs (Orc) that are intended to be more feature-rich and easier to use than MCJIT. If you would like to try them out and have any questions about how they work please don’t hesitate to ask. You’ll need to live on the bleeding edge for those though - they’re not available on 3.5.

Cheers,
Lang.

Hi David,

> We (and by 'we' I mean 'Lang' (cc'd)) have recently been experimenting with
> improvements/replacements to the MCJIT in the form of Orc, a new layered,
> composable JIT API. There are examples of this in the Kaleidoscope code
> samples (llvm/examples/Kaleidoscope/Orc) though not comprehensive tutorials
> as yet.

Yes I have noticed that there is a replacement for MCJIT using Orc.
But I assume this is still early days for the new API? When will a
production release be available?

Will Orc support the use case described in my post - i.e. - functions
independently JITed and garbage collected? In Lua, function calls go
via the runtime so I don't need to link functions. Later on I will
look at inlining functions to be able to bypass the Lua runtime when
it makes sense.

The other requirement I have is to be able to access standard Lua
functions (not dynamically linked) from within the JITed function. I
discovered that MCJIT has a bug due to which addGlobalMapping() does
not work. So I am using sys::DynamicLibrary::AddSymbol() to register
the Lua functions.

BTW a newbie question - once I have compiled a function - is there a
need to retain the module and engine - apart from the fact that the
memory allocated to the JITed function is presumably managed by the
engine?

Thanks and Regards

Dibyendu

Hi James,

> A single MCJIT instance can notionally manage multiple modules, but there
> are caveats (which I'm afraid I don't remember off the top of my head) that
> make it unattractive in practice. I believe most clients opt for something
> like the ExecutionEngine-per-Module model used in the Kaleidoscope
> tutorials.

I see. That's good to know.

> As Dave mentioned, I'm also working on some new JIT APIs (Orc) that are
> intended to be more feature-rich and easier to use than MCJIT. If you would
> like to try them out and have any questions about how they work please don't
> hesitate to ask. You'll need to live on the bleeding edge for those though -
> they're not available on 3.5.

I am at an early stage of development so I can use Orc provided there
is some assurance that this will be mainstream in future. Else I can
use the MCJIT replacement that uses Orc. If I wanted to do this should
I clone the latest code from github?

In any case I am abstracting the creation and management of engines,
modules etc in my code so that I can change this in future if
necessary.

Thanks and Regards

Dibyendu

Hi Lang,

Sincere apologies for the name typo below!

Regards

Hi Dibyendu,

> Sincere apologies for the name typo below!

No worries. :)

Regarding the choice of JITs: MCJIT is more mature, and we have an ongoing commitment to support it. That’s not the case for Orc, which is still experimental. Since you’re writing an abstraction layer anyway you could choose either approach. They share a lot of fundamental ideas, and it should be reasonably easy to move from one to the other later. Two big points of distinction: MCJIT has some basic thread safety; Orc does not (though I’d love to see it added). Orc has some in-tree support for lazy compilation; MCJIT does not (you’ll have to build that on top of MCJIT yourself).

You mentioned being able to garbage collect functions: Orc lets you remove Modules from the JIT out-of-the-box (see HandleTopLevelExpression in the Kaleidoscope tutorials for an example). In MCJIT you would do it implicitly by throwing away the MCJIT instance (and the associated RTDyldMemoryManager).

> BTW a newbie question - once I have compiled a function - is there a
> need to retain the module and engine - apart from the fact that the
> memory allocated to the JITed function is presumably managed by the
> engine?

There’s no fundamental barrier to throwing away the ExecutionEngine. The memory allocated to JIT’d functions is managed by whatever RTDyldMemoryManager you build the ExecutionEngine with. You could write a custom RTDyldMemoryManager that allowed you to take ownership of the allocated memory before the ExecutionEngine was destroyed. Note however that each ExecutionEngine holds the symbol table for the modules it JITs. If you throw away the execution engine you’ll want to make sure that you’ve already got all the function/global pointers you need out of it.

Cheers,
Lang.
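
The lifetime rules described in the reply above can be sketched without any LLVM dependency. This is only a toy model under stated assumptions, not LLVM code: `ToyEngine`, `CompiledFunction`, `compile` and `collect` are hypothetical names invented for illustration. `ToyEngine` stands in for an ExecutionEngine that holds the symbol table for one module, and the sketch assumes the engine owns its memory manager, so the code goes away when the engine does. It shows the two points made above: fetch the function pointer while the engine is still alive, and treat "garbage collecting" a function as dropping its engine.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Stand-in for an ExecutionEngine. NOT an LLVM class: it only models
// "the engine holds the symbol table for the module it JITs".
class ToyEngine {
public:
    using Fn = int (*)(int);

    void define(const std::string &name, Fn fn) { symbols_[name] = fn; }

    // Returns nullptr when the name is unknown.
    Fn lookup(const std::string &name) const {
        auto it = symbols_.find(name);
        return it == symbols_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, Fn> symbols_;
};

// One compiled function in the engine-per-function model: the engine
// plus the entry point, which is looked up once and cached.
struct CompiledFunction {
    std::unique_ptr<ToyEngine> engine;
    ToyEngine::Fn entry = nullptr;
};

// Plays the role of the JIT-compiled body in this toy model.
static int add_one(int x) { return x + 1; }

// "Compile" a function: create its private engine, register the code,
// and cache the pointer while the symbol table still exists.
CompiledFunction compile(const std::string &name) {
    CompiledFunction cf;
    cf.engine.reset(new ToyEngine());
    cf.engine->define(name, &add_one);
    cf.entry = cf.engine->lookup(name);
    assert(cf.entry != nullptr);
    return cf;
}

// What a GC finalizer would do: clear the cached pointer first, because
// (by assumption) the code it points at dies with the engine, then drop
// the engine itself.
void collect(CompiledFunction &cf) {
    cf.entry = nullptr;
    cf.engine.reset();
}
```

A caller would invoke `compile("f")`, call through `cf.entry`, and invoke `collect(cf)` from the collector's finalizer. With a custom memory manager that hands ownership of the code to the caller, as suggested above, the cached pointer could outlive the engine; the toy model deliberately takes the simpler route.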