MCJIT and Lazy Compilation

Does anyone have a roadmap for MCJIT support of what I think people are
calling lazy compilation?

Is this even on the cards?

I spent the last few hours moving my project (extempore.moso.com.au)
over to MCJIT (particularly for ARM), and am a little horrified to discover
no ability to compile, and just as importantly to recompile, at a function level.
This is absolutely mandatory for my project.

I have been looking enviously at MCJIT’s ARM+DWARF support for a
couple of years and was under the misapprehension that MCJIT was
attempting to be a drop-in replacement for JIT, so I wasn’t overly
concerned about the primary JIT being largely neglected. This is obviously
my fault; I wasn’t paying close enough attention.

I am now wondering what the LLVM project at large plans for
just-in-time compilation going forward. Is MCJIT the future, and
if so, what kind of roadmap is there to replicate current JIT functionality,
in my case function-level (re)compilation?

I appreciate everyone’s efforts, and that we all have our own agendas.
I’m just trying to put my own roadmap in place.

Cheers,
Andrew.

I apologize to everyone for the tone of this email.
I had a bad day yesterday.

I’ll start having a look at how I might be able to
contribute to MCJIT.

Cheers,
Andrew.

While you're waiting for others to reply, I would suggest searching
this list's archives for MCJIT discussions. Just within the past year
there were many, including some which, IIRC, discussed exactly this
issue of lazy compilation, dynamic function loading and so forth.

Eli

> I apologize to everyone for the tone of this email.
> I had a bad day yesterday.

FWIW I didn't detect any bad tone when I read the email. You are
certainly not the first to run into this issue and have similar
assumptions about MCJIT (even just yesterday: see the thread "Stub
function generation in MCJIT using lli"). And even before that this
has come up numerous times.

> I'll start having a look at how I might be able to
> contribute to MCJIT.

Awesome! Please don't hesitate to ask for help on the list; this is a
recognized deficiency and there are certainly others that would
greatly appreciate your work.

-- Sean Silva

Eli, do you happen to know if we have a PR tracking this issue? (you
seem like you would be the person to know)

-- Sean Silva

I used to hack on MCJIT in the past while being at Intel. Since I
moved to a different job I've been focusing on other parts of LLVM and
haven't been really in touch with the MCJIT parts for a long time.
Andy Kaylor (the MCJIT code owner, CCd) is a better candidate to give
a meaningful answer to the OP, at least from his perspective.

Eli

Ah, ok. I think I was basing myself off of the dev meeting where you
talked about it.

-- Sean Silva

As far as I know, there is no tracking issue for it. It's definitely something that I'm acutely aware of, but it would be good to have a Bugzilla report.

-Andy

You are probably the best qualified to file a useful PR about this.

-- Sean Silva

Well, this issue comes up on the mailing list every once in a while. Here are two
recent threads, FYI:

http://groups.google.com/group/llvm-dev/browse_thread/thread/464f5c59d457d168/85a4d042d45acdda?#85a4d042d45acdda

http://groups.google.com/group/llvm-dev/browse_thread/thread/aee3a51120a97245/77c94d9c5df2572c

I'm waiting for MCJIT to grow the dragon wings as well. :) I'm not into
the MCJIT sources, but I'd be more than happy to help testing the
lazy/incremental compilation stuff if you or someone else embarks on
adding this to the MCJIT.

Albert

Hi Andrew,

I was about to write a belated reply to this message (sorry for the delay), but then I realized that pretty much everything useful that I have to say on the subject is contained in this message (which is in a thread Albert Graef already linked to):

https://groups.google.com/d/msg/llvm-dev/Rk9cWdRX0Wg/Fa1Mn6cyS9UJ

Generally, I do hope that MCJIT will be capable of replacing the old JIT someday soon, though obviously it cannot do so until it provides equivalent functionality. I doubt it will ever be a “drop-in” replacement, but I hope that minimal rework will be needed. Most significantly, as can be seen in earlier discussions, things will need to be made Module-centric rather than Function-centric. It ought to be possible to write a utility class that takes a monolithic Module and breaks it up into sub-Modules for individual functions, but I think that would need to happen outside of the MCJIT engine because not all clients would want that kind of granularity.
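
A minimal sketch of the splitting utility described above, using toy stand-in types rather than real LLVM classes. ToyFunction, ToyModule and splitPerFunction are hypothetical names invented for illustration; a real version would operate on llvm::Module and would also have to deal with globals, metadata and linkage, none of which is modelled here. The point is only that each sub-module keeps one definition and turns everything it calls into an external declaration:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins (hypothetical, NOT LLVM types) for a function and a module.
struct ToyFunction {
    std::string name;
    std::string body;               // placeholder for the function's IR
    std::set<std::string> callees;  // names of functions this one calls
};

struct ToyModule {
    std::string id;
    std::map<std::string, ToyFunction> definitions;  // functions with bodies
    std::set<std::string> declarations;              // prototypes only
};

// Break a monolithic module into one sub-module per defined function.
// Each sub-module holds a single definition; every other function it
// calls becomes a declaration to be resolved when the code is linked.
std::vector<ToyModule> splitPerFunction(const ToyModule &mono) {
    std::vector<ToyModule> parts;
    for (const auto &entry : mono.definitions) {
        const ToyFunction &fn = entry.second;
        assert(entry.first == fn.name && "map key must match function name");
        ToyModule sub;
        sub.id = mono.id + "." + fn.name;
        sub.definitions.emplace(fn.name, fn);
        for (const std::string &callee : fn.callees) {
            if (callee != fn.name)  // self-recursion needs no declaration
                sub.declarations.insert(callee);
        }
        parts.push_back(sub);
    }
    return parts;
}
```

Keeping this outside the engine, as suggested above, means a client that is happy with whole-module compilation never pays for the split.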

There’s definitely a lot of work to be done here to get this right, and hopefully we’ll get active participation in any design discussions to make sure the solution meets everyone’s needs. I don’t have a time table for this right now. I will file a Bugzilla report as soon as the LLVM server is ready.

-Andy

Hi Manny,

What exactly do you need from the JITEventListener API?

The latest MCJIT code has a listener interface that will notify you when an object image is emitted. I recently committed changes to DebugInfo and the IntelJITEventListener that demonstrate a way to deconstruct the emitted object into function and line number information. However, there is no clean way to get back to the original llvm::Function that corresponds to any given machine code function.

As for your problem with re-jitting modules, I think it should be possible to use a second instance of MCJIT to emit code for a second module that contains just the function with new parameters and link that against the previously emitted module. I haven’t tried this, so I don’t know what pitfalls may be there, but I don’t see an obvious reason that it couldn’t work. You would probably need a custom memory manager to handle the function address resolution. I realize this isn’t an ideal solution. I’m hoping that in the near future a single instance of MCJIT will support multiple modules and be able to do this kind of cross-module linking in a more painless way.
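
To make the address-resolution part concrete, here is a rough sketch of the bookkeeping such a custom memory manager might need. It uses no LLVM API at all: SharedSymbolTable and ToyEngine are hypothetical names, and "emitting" a function is simulated by handing out fake addresses. It only illustrates the idea that two engine instances resolve external symbols through one shared table, so a function re-emitted by the second instance replaces the entry left by the first:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical registry shared by every engine instance (not an LLVM class).
class SharedSymbolTable {
public:
    // Record, or overwrite, the address of a freshly emitted function.
    void define(const std::string &name, std::uintptr_t address) {
        table_[name] = address;
    }
    // Return the recorded address, or 0 if the symbol is unknown.
    std::uintptr_t lookup(const std::string &name) const {
        auto it = table_.find(name);
        return it == table_.end() ? 0 : it->second;
    }
private:
    std::map<std::string, std::uintptr_t> table_;
};

// Toy engine: pretends to emit code and publishes the resulting address.
struct ToyEngine {
    SharedSymbolTable &symbols;
    std::uintptr_t nextAddress;  // fake address counter for this engine

    std::uintptr_t emit(const std::string &name,
                        const std::vector<std::string> &externals) {
        assert(!name.empty());
        // Every external reference must already be known to the table.
        for (const std::string &ext : externals) {
            if (symbols.lookup(ext) == 0)
                throw std::runtime_error("unresolved symbol: " + ext);
        }
        std::uintptr_t address = nextAddress;
        nextAddress += 0x100;
        symbols.define(name, address);  // later lookups see the new code
        return address;
    }
};
```

Note that this only changes what future lookups return. Callers that already hold the old address would need some form of indirection, which the sketch does not model.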

-Andy

Thanks for the update Andy.

I’m very happy to be involved in any way that is helpful. If you would like me to test ideas, or contribute to further discussions, then please let me know.

I currently have extempore running nicely with MCJIT for the “monolithic” case and am working on various LLVM hacks to better understand the issues involved with non-monolithic approaches - in particular I’m starting with your multi-module approach. I will report back when (and if) I have something useful to contribute.

Cheers,
Andrew.

That’s awesome!

I think at this point having people try out various approaches and seeing what works and what doesn’t is our biggest need in this area. Please do keep me informed about what you find out.

-Andy

OK, so I have some preliminary results, which are on the whole quite encouraging!

I haven’t had a great deal of time, but I have managed to get Extempore up and
running with function-level compilation (actually lexical closures, so each unit
carries quite a bit of additional guff) using Andy’s multi-module suggestion. I also
have on-the-fly recompilation of existing closures working (caveats below), so from
an end-user perspective Extempore appears functionally equivalent under MCJIT
and the old legacy JIT: for example, hot-swapping audio signal processing code
on-the-fly using MCJIT.

Firstly, multi-module definitely proved to be considerably easier than attempting to hack
solutions for incremental monolithic module builds, which I also investigated.

So the only major obstacle that I have run into so far is page permissions in relation
to code relocations. I have a safe hack, which is to toggle section permissions between
rw and exec/ro in-between new object injections; however, this is obviously problematic
for code that is executing concurrently (i.e. secondary threads). I also have an unsafe
hack, purely for experimentation :), whereby exec sections are left rw, and although
very evil it works for test purposes (i.e. the audio example mentioned above). These
solutions are obviously both inappropriate and I will investigate a real solution when
I find some time.
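
For readers unfamiliar with the permission toggling, the "safe hack" looks roughly like the following on a POSIX system. This is a sketch under assumptions, not the Extempore or MCJIT code: CodeRegion, allocateRegion, injectObject and releaseRegion are hypothetical helpers, and only mmap, mprotect, munmap and sysconf are real calls. The region is flipped to read/write for each injection and back to read/execute afterwards, which is exactly the window in which a second thread executing from the region would fault:

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical handle for one page-aligned block of code memory.
struct CodeRegion {
    void *base = nullptr;
    std::size_t size = 0;
};

// Allocate a read/write anonymous mapping rounded up to whole pages.
CodeRegion allocateRegion(std::size_t bytes) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t rounded = ((bytes + page - 1) / page) * page;
    void *p = mmap(nullptr, rounded, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    CodeRegion region;
    if (p != MAP_FAILED) {
        region.base = p;
        region.size = rounded;
    }
    return region;
}

// Copy new code into the region: rw while writing, r+x afterwards.
// Not safe if another thread is executing from the region meanwhile.
bool injectObject(const CodeRegion &region, std::size_t offset,
                  const void *bytes, std::size_t length) {
    assert(region.base != nullptr);
    if (offset > region.size || length > region.size - offset)
        return false;  // would write past the end of the mapping
    if (mprotect(region.base, region.size, PROT_READ | PROT_WRITE) != 0)
        return false;
    std::memcpy(static_cast<char *>(region.base) + offset, bytes, length);
    return mprotect(region.base, region.size, PROT_READ | PROT_EXEC) == 0;
}

// Unmap the region and reset the handle.
void releaseRegion(CodeRegion &region) {
    if (region.base != nullptr)
        munmap(region.base, region.size);
    region.base = nullptr;
    region.size = 0;
}
```

Some hardened systems refuse to make previously written memory executable, in which case the second mprotect call fails and injectObject returns false.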

Also, I didn’t bother to implement section erasure; at the moment I’m just allocating
new sections for each compile regardless of whether the new code replaces existing
functionality. Having said that, I don’t see this as much of an issue; I was just too
lazy to bother implementing it. I’ll check this when I have some further free time.

FYI this is all under x86. I did try to run under ARM but bombed out on an assertion error
in the ARM ELF relocation code - specifically assert((*TargetPtr & 0x000F0FFF) == 0);
I assume this is a result of something evil that I have done but I haven’t yet had time to
investigate any further. Again I’ll let you know when I have some more time.

Just a quick heads up, but in general my initial thoughts are that MCJIT is really not
that far off.

Cheers,
Andrew.

Hi Andrew,

This is very cool! Thanks for the update.

-Jim

This is great news.

Do you have any dependencies between your modules? For instance, one calling a function in another? If so, how did you handle that?

Any chance you could share some code snippets or the relevant portions?

-Andy

Hey Sean,

I’m sure Andy.K will have some thoughts on this but
I don’t imagine any major API changes being required.
The changes that I have been making are really very
minor, all the work is already done.

However I can imagine some client side fallout due to the
multi-module nature of the proposed solution. The client
side problem being that I imagine that many people are
using a single monolithic module for various bookkeeping
purposes on the client side - function signature lookups
for example.

One slightly dubious way around this might be to have a
monolithic module ‘just for bookkeeping’ which is managed
persistently in ‘parallel’ to the individual object modules -
which I’m currently throwing away after each “compile”.
Ultimately though it’s probably better for the client to manage
this kind of bookkeeping outside of LLVM anyway?

Another solution might be to add some additional meta-data
to the runtime memory manager - although I can’t imagine
people liking this idea very much as it would be a fairly
gross violation of the current mem-mgr interface.

Anyway, I’m sure that Andy will have a much better handle
on what solutions might or might not be appropriate.

Cheers,
Andrew.

Hey Andy,

Yep I’ve tested some non-trivial examples with loads of dependencies,
both code and data, global, local and external symbol resolution etc…

Actually this was truly a piece of cake, nothing to do; the memory manager
is working really nicely so far as I can tell. Relocations to sections are all working
as expected (aside from the previously mentioned ARM issue, which is probably just
something that I’m doing wrong), with all global symbol relocs managed persistently
by the MM between object injections. All in all it just works ;) I had to make a few
minor adjustments to things like the LLParser for forward dependencies, but overall
really simple stuff.

There certainly are some section management issues that will need to be addressed,
but I don’t see any major hurdles there. I was going to look into this next week.

The biggest issue for multi-module is probably going to be client side not LLVM side,
although this has not been a huge problem for me as most of this bookkeeping is
already managed “client side” in extempore.

I’m happy to send you code, although it might be more useful for me to write a
followup email outlining exactly what changes were made and then let the experts
decide how best to proceed ;) Tomorrow’s a little hectic but I’ll try to send a note
through on Monday.

Cheers,
Andrew.