MCJIT and Lazy Compilation

I'd like to see that, too. :) Andrew, great work, it's reassuring that
someone finally takes the plunge and looks into this!

Hi Andy/Albert,

Sorry for the slow reply, my day job caught up with me.

Two bits of progress. (a) MCJIT is working nicely for non-trivial
examples in Extempore on x86 and ARM, and (b) the page
permissions are now RO again. For your amusement, here is a very
cheesy video of Extempore running on the fly with MCJIT on an
ARM Pandaboard. Viewer discretion is advised!
https://vimeo.com/60407237

Here is the overview of changes I promised a couple of weeks back.
These comments are based on the 3.3 trunk of about 3 weeks ago.

-- RuntimeDyld --

Relocations need to be cleared between
each emitObject call.

New method required: clearRelocations

void RuntimeDyldImpl::clearRelocations() {
  ExternalSymbolRelocations.clear();
  Relocations.clear();
}

void RuntimeDyld::clearRelocations() {
  Dyld->clearRelocations();
}

-- RuntimeDyldImpl --

void clearRelocations();
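
As a rough illustration of why stale entries matter, here is a toy model rather than the LLVM code. `ToySection`, `ToyRelocation`, `ToyDyld`, and `blockedWritesAfterTwoEmits` are all hypothetical names, and the assumption that an uncleared entry would be replayed against an already sealed section is a guess at the motivation, not something confirmed by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an emitted section.
struct ToySection {
  std::vector<std::uint8_t> Bytes;
  bool Writable;
};

// Hypothetical relocation: "write Value at Offset in section SectionID".
struct ToyRelocation {
  std::size_t SectionID;
  std::size_t Offset;
  std::uint8_t Value;
};

// Toy linker bookkeeping; NOT the real RuntimeDyldImpl.
class ToyDyld {
  std::vector<ToySection> Sections;
  std::vector<ToyRelocation> Relocations;
  std::size_t BlockedWrites = 0;

public:
  // Creates a fresh, writable section and returns its ID.
  std::size_t emitSection(std::size_t Size) {
    ToySection S;
    S.Bytes.assign(Size, 0);
    S.Writable = true;
    Sections.push_back(S);
    return Sections.size() - 1;
  }

  void addRelocation(std::size_t SectionID, std::size_t Offset,
                     std::uint8_t Value) {
    Relocations.push_back({SectionID, Offset, Value});
  }

  // Applies every pending entry; a write to a sealed section is only
  // counted here, whereas real hardware would fault.
  void resolveRelocations() {
    for (const ToyRelocation &R : Relocations) {
      ToySection &S = Sections.at(R.SectionID);
      if (!S.Writable) {
        ++BlockedWrites;
        continue;
      }
      S.Bytes.at(R.Offset) = R.Value;
    }
  }

  // Drops pending entries so the next emit starts clean.
  void clearRelocations() { Relocations.clear(); }

  // Models applying final page permissions to all sections.
  void sealAll() {
    for (ToySection &S : Sections)
      S.Writable = false;
  }

  std::size_t blockedWrites() const { return BlockedWrites; }
};

// Emits two objects in sequence and reports how many relocation writes
// hit an already sealed section.
std::size_t blockedWritesAfterTwoEmits(bool ClearBetweenEmits) {
  ToyDyld Dyld;
  std::size_t A = Dyld.emitSection(4);
  Dyld.addRelocation(A, 1, 0xAA);
  Dyld.resolveRelocations();
  Dyld.sealAll();
  if (ClearBetweenEmits)
    Dyld.clearRelocations();

  std::size_t B = Dyld.emitSection(4);
  Dyld.addRelocation(B, 2, 0xBB);
  Dyld.resolveRelocations();
  Dyld.sealAll();
  return Dyld.blockedWrites();
}
```

In this model, clearing between emits means the second pass only sees the entry for the new section. Without the clear, the entry for the first section is replayed against memory that has already been sealed.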

-- SectionMemoryManager --

Removed the option in allocateSection to use existing 'free' memory
regions, i.e. it now always allocates new memory. This means that once a
module is emitted and its memory permissions are applied we don't have
to touch it again, so we don't have to set exec sections writable
before a new emit. I'm sure there is a nicer way to achieve this.
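
The trade-off can be sketched with a toy allocator. This is a simplified model, not SectionMemoryManager itself: `ToyRegion`, `ToyMemoryManager`, the 4096-byte page size, and the `unsealCount` counter are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical page-granular region of JIT memory.
struct ToyRegion {
  std::size_t Capacity;
  std::size_t Used;
  bool Sealed; // true once final permissions have been applied
};

// Toy allocator; NOT the real SectionMemoryManager.
class ToyMemoryManager {
  std::vector<ToyRegion> Regions;
  bool ReuseFreeSpace;
  std::size_t UnsealCount = 0;

public:
  explicit ToyMemoryManager(bool Reuse) : ReuseFreeSpace(Reuse) {}

  // Returns the index of the region that holds the new section.
  std::size_t allocateSection(std::size_t Size) {
    if (ReuseFreeSpace) {
      for (std::size_t I = 0; I < Regions.size(); ++I) {
        ToyRegion &R = Regions[I];
        if (R.Capacity - R.Used >= Size) {
          // Reusing a sealed region forces it back to writable.
          if (R.Sealed) {
            R.Sealed = false;
            ++UnsealCount;
          }
          R.Used += Size;
          return I;
        }
      }
    }
    // Always-new path: round up to whole pages and never look back.
    const std::size_t PageSize = 4096; // assumed page size
    std::size_t Capacity = ((Size + PageSize - 1) / PageSize) * PageSize;
    Regions.push_back({Capacity, Size, false});
    return Regions.size() - 1;
  }

  // Models applying final permissions to every region.
  void applyPermissions() {
    for (ToyRegion &R : Regions)
      R.Sealed = true;
  }

  std::size_t unsealCount() const { return UnsealCount; }
  std::size_t regionCount() const { return Regions.size(); }
};
```

In this model the always-new policy never has to unseal a region, at the cost of at least one page per section. A reusing allocator packs sections more tightly but must flip sealed memory back to writable before each new emit.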

-- MCJIT --

MCJIT currently takes a single Module and only supports a single call
to emitObject. A number of assertion and conditional checks enforce
this. These checks can all be removed (assertions and conditionals),
making M obsolete.

At the moment there are a number of call sites for emitObject. I
removed all of them and replaced them with a single emitObject call
site accessible through the extant, but unused,
recompileAndRelinkFunction method. This then becomes the client's
access point to compile each individual module. This is obviously a
hack to maintain the integrity of the existing API (i.e. bad name,
evil type munging F->M, etc.).

void *MCJIT::recompileAndRelinkFunction(Function *F) {
  // Hack: F is really a Module* smuggled through the existing API.
  emitObject((Module *) F);
  finalizeObject();
  return NULL;
}

finalizeObject should not call resolveRelocations. Instead it should
call the new clearRelocations.

void MCJIT::finalizeObject() {
  // New Dyld call clearRelocations
  Dyld.clearRelocations();

  // Set page permissions.
  MemMgr->applyPermissions();
}

At the moment I am just leaking allocated sections in SMM, i.e.

void MCJIT::freeMachineCodeForFunction(Function *F) {
  dbgs() << "free machine code not yet supported in MCJIT\n";
  return;
}

But it should be relatively straightforward to maintain some kind of
Module -> SMM Section map. Hotswapping currently works fine because
the relocations all update as expected. So this is just a leakage
problem.
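
One possible shape for such a map, as a minimal sketch only. `ToySectionTracker`, the string module IDs, and tracking sections by size are all hypothetical; a real version would presumably key on the Module pointer and hold actual memory blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy bookkeeping for which sections belong to which module.
// NOT part of MCJIT or SectionMemoryManager.
class ToySectionTracker {
  // Key is a hypothetical module identifier; values are section sizes.
  std::map<std::string, std::vector<std::size_t>> SectionsByModule;
  std::size_t LiveBytes = 0;

public:
  // Called whenever a section is allocated on behalf of a module.
  void recordSection(const std::string &ModuleID, std::size_t Size) {
    SectionsByModule[ModuleID].push_back(Size);
    LiveBytes += Size;
  }

  // Releases everything recorded for the module and returns the number
  // of bytes released (0 if the module is unknown).
  std::size_t releaseModule(const std::string &ModuleID) {
    auto It = SectionsByModule.find(ModuleID);
    if (It == SectionsByModule.end())
      return 0;
    std::size_t Freed = 0;
    for (std::size_t Size : It->second)
      Freed += Size;
    SectionsByModule.erase(It);
    LiveBytes -= Freed;
    return Freed;
  }

  std::size_t liveBytes() const { return LiveBytes; }
};
```

With something like this, a hot-swap could release the old module's sections once nothing is executing in them. Deciding when that is safe is a separate problem that this sketch does not address.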

-- AsmParser/LLParser --

Some forward refs in the LLParser need fixing. I'm working directly
from IR though, so C++ API people will need a different fix.

Hi Andrew, the prototype looks great! Andy Kaylor is on vacation, so I'm not sure if he'll see this thread until next week.

You definitely have MCJIT moving in the right direction, but the API you're using (ExecutionEngine) has to be stretched a little bit to accomplish your goals. In your implementation, do clients call recompileAndRelinkFunction() for every module that is needed to satisfy inter-module dependencies (if so, can the same thing be accomplished with the existing ExecutionEngine::addModule()?) or do clients call it only to update ("hotswap") a module that has been previously compiled but has since changed?

Offhand, it sounds like ExecutionEngine might need another function like:

updateModule(Module* M);

which does what your modified recompileAndRelinkFunction() does. It should be easy to add any new functions you need to the API in ExecutionEngine.h as virtuals (with default, empty, implementations in order to keep the old JIT happy) and overrides in MCJIT that do the right thing.
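
The "virtual with an empty default" pattern described above can be sketched in isolation. The classes here are toy stand-ins, not the real ExecutionEngine hierarchy, and the `bool` return value (reporting whether the engine handled the update) is an assumption, since no return type was given for updateModule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Module.
struct ToyModule {
  std::string Name;
};

// Toy base class; NOT the real ExecutionEngine.
class ToyExecutionEngine {
public:
  virtual ~ToyExecutionEngine() {}

  // Default implementation does nothing, so engines that do not
  // support module updates keep compiling and behaving as before.
  virtual bool updateModule(ToyModule *M) {
    (void)M;
    return false;
  }
};

// Stands in for the old JIT: inherits the no-op default.
class ToyOldJIT : public ToyExecutionEngine {};

// Stands in for MCJIT: overrides the hook to do real work.
class ToyMCJIT : public ToyExecutionEngine {
  std::vector<std::string> Emitted;

public:
  bool updateModule(ToyModule *M) override {
    // A real override would emit and finalize the module here.
    Emitted.push_back(M->Name);
    return true;
  }

  std::size_t emittedCount() const { return Emitted.size(); }
};
```

Clients holding a base-class pointer can call updateModule unconditionally. Only the engine that overrides it does anything, which is what keeps the old JIT untouched.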

Regarding tests, I'm thinking that the existing lli-based MCJIT integration tests (test/ExecutionEngine/MCJIT) are going to be harder to extend to support multiple modules (with function dependencies, globals, etc.) than the unit tests (unittests/ExecutionEngine/MCJIT). Anyway, it sounds like you have already been testing lots of cases, but I'm not sure if that's been done manually or in some automatic fashion. There is a (commented-out) test in MCJITTests.cpp that intends to test the case of dependencies between modules, but there are probably lots of other cases that can be added too. In any case, I find that writing unit tests is a big help in figuring out what an API should look like.

Also, I believe there's an assumption (in at least one place) that ExecutionEngines take ownership of Modules that are passed in. It sounds like you removed it (from MCJIT at least) by getting rid of the field M and the corresponding assertions, but I'm wondering if there's any other common code in ExecutionEngine that makes the same assumption about a single module whose ownership is transferred.

Can you elaborate on what you mean by "1) The current single Module case may need to be retained??"? I don't see why the single-module case cannot be handled in exactly the same way as the multiple-module cases.

Ashok (cc'd) has in the past looked at MCJIT memory managers, and specifically at applying permissions, so he might have some more thoughts, and I'll leave it to Andy K to review the relocation stuff as that's his domain of expertise!

At this point though, I think lots of people are excited about MCJIT (specifically, multiple modules with dependencies and hot-swapping) and would love to see a patch when you have a moment to rebase it on top of current trunk. Don't worry if the patch is incomplete; just seeing its design might help elicit some more targeted comments!

Thanks,
Dan