linking individual functions in execution module

Hi,

The JIT does not allow functions to call functions in other modules,
so all modules need to be statically linked into one big module. If
one module needs to be recompiled, all the others need to be
recompiled and relinked as well.

There are two ways I intend to approach this problem:

1) Forget about JITing: build each module into a .bc, call gcc to
generate .soname libraries, and load those libraries dynamically with
dlopen().

2) Keep the modules unlinked and, when execution is required, link the
required dependencies into an execution module. When one of the
dependencies changes, most of the modules do not need to be rebuilt
(although I will probably need to clone each module first, since the
Linker seems to be destructive of its source); they just need to be
relinked into the execution module.
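
The bookkeeping behind approach 2 can be sketched roughly as follows.
This is plain C++ with no LLVM calls; DepGraph, modulesToLink and
needsRelink are hypothetical names, and it assumes each module's
dependencies are already known by name.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical bookkeeping: module name -> names of modules it calls into.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Collect `root` plus everything it transitively depends on. These are the
// modules that would be cloned and linked into one execution module.
std::set<std::string> modulesToLink(const DepGraph &deps,
                                    const std::string &root) {
  std::set<std::string> seen;
  std::vector<std::string> work{root};
  while (!work.empty()) {
    std::string name = work.back();
    work.pop_back();
    if (!seen.insert(name).second)
      continue; // already visited; this also guards against cycles
    auto it = deps.find(name);
    if (it == deps.end())
      continue; // module with no recorded dependencies
    for (const std::string &dep : it->second)
      work.push_back(dep);
  }
  return seen;
}

// An execution module built for `root` is stale if the changed module is
// anywhere in its link set.
bool needsRelink(const DepGraph &deps, const std::string &root,
                 const std::string &changed) {
  return modulesToLink(deps, root).count(changed) != 0;
}
```

Only the execution modules for which needsRelink returns true would have
to be rebuilt from clones; the per-module IR itself stays untouched.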

The first approach is my fallback. It is obviously rock-solid, but I
don't know how much overhead it actually represents, especially if an
alternative exists.

I have questions regarding the second approach. Each module will
contain functions with generated IR, and that IR will include CallInst
and InvokeInst instructions that target functions in the other
modules. The questions are:

A) If I use the Linker class to link two or more modules, do I need to
update the CallInst/InvokeInst instructions myself, or is taking care
of them part of the Linker's job?

B) Can I safely clone the modules before linking them into the
execution module? The Linker seems to be destructive of its sources,
but I don't know how many cycles cloning a module would save, as I'm
not clear whether module generation is more expensive than the linkage
itself.

Any high-level recommendations and suggestions are welcome.

Hi Charles,

Are you using the old JIT engine or the newer MCJIT?

In either case, external functions are resolved through the getPointerToNamedFunction method in the JIT Memory Manager. If you provide your own memory manager implementation, you should be able to link multiple modules together.
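
As a rough sketch of the lookup side only: the class below is a plain C++ name-to-address table that a custom memory manager's getPointerToNamedFunction override could delegate to. SymbolRegistry, define and lookup are hypothetical names, not LLVM API, and the wiring into the memory manager itself is not shown.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical registry shared by all JITed modules: symbol name -> address.
class SymbolRegistry {
public:
  // Record (or overwrite) the address of a function once its module
  // has been JITed.
  void define(const std::string &name, void *addr) {
    assert(addr != nullptr && "define() expects a real address");
    table_[name] = addr;
  }

  // Returns the recorded address, or nullptr when the name is unknown so
  // the caller can decide how to report the unresolved symbol.
  void *lookup(const std::string &name) const {
    auto it = table_.find(name);
    return it == table_.end() ? nullptr : it->second;
  }

private:
  std::unordered_map<std::string, void *> table_;
};
```

Each module would call define for its exported functions after it is JITed, and the resolver would call lookup when another module references one of those names.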

If you re-JIT a module that you have previously linked to, that will obviously cause some problems, but you can probably work around that with a stub function.

-Andy

Could you elaborate a little on that? I was thinking of something
different: linking a cloned copy of the module rather than the module
directly. Would that work better than this approach?

thanks

What I was thinking is that if you need to link module A to functions in module B (which you know might be re-JITed), you can have a stub function whose address is what module A calls. You can then use some brute-force approach to keep the stub pointed at the actual address of the function in module B as it is re-JITed (maybe the stub could be a lightweight class with a member variable that's kept up to date, or whatever). The problem I'm trying to solve with this approach is that once you return the address of a function from the memory manager's getPointerToNamedFunction method, that address is written into the JITed code as part of the linking process, so you need a central location to maintain updates to that address.
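
To make the stub idea a bit more concrete, here is a minimal sketch of the indirection in plain C++. It models only the "central location" part: StubSlot, StubTable, computeV1 and computeV2 are hypothetical names, the two compute functions stand in for successive JIT compilations of the same function, and a real stub would be a small piece of code that jumps through the slot rather than a C++ method.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <unordered_map>

using IntFn = int (*)(int);

// One slot per cross-module function. Callers are handed the slot once;
// only the slot's contents change when the target is re-JITed.
struct StubSlot {
  std::atomic<IntFn> target{nullptr};

  int call(int arg) const {
    IntFn fn = target.load();
    assert(fn != nullptr && "stub called before a target was installed");
    return fn(arg);
  }
};

class StubTable {
public:
  // References to unordered_map elements stay valid across rehashing, so
  // the returned slot can be held for the lifetime of the table.
  StubSlot &slotFor(const std::string &name) { return slots_[name]; }

  // Called after each (re-)JIT of the function with its new address.
  void update(const std::string &name, IntFn fn) {
    slots_[name].target.store(fn);
  }

private:
  std::unordered_map<std::string, StubSlot> slots_;
};

// Stand-ins for two successive compilations of the same function.
int computeV1(int x) { return x * x; }
int computeV2(int x) { return x * x + 1; }
```

Module A would keep calling through the same slot, and re-JITing module B would only require one update call per exported function.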

Your approach of linking with a cloned copy of the module (before JITing?) would work too. The main downside I see to that is you may end up JITing multiple copies of functions in the cloned module. That may be OK. Something very similar was done in a project I worked on here at Intel and the results were good.

Obviously it's your call as far as weighing the overhead of duplicated modules versus the overhead of maintaining stubs. In a lot of circumstances the cloning approach would be better.

-Andy

Suppose module B has CallInst/InvokeInst instructions that call
functions in module A. After I clone both modules I get B' and A'.
My concrete question is this: are there any special steps that I need
to take before linking the modules B' and A' together?

My main concern is that B' will have CallInst/InvokeInst instructions
pointing into module A, not A', and that the linker will not notice
that it should replace the references to A with references to A'.

Any suggestions about this are greatly welcome.
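
One way to picture the concern is the toy model below. It is not LLVM
code: ToyModule, cloneToy and linkToy are hypothetical, and the model
assumes (as far as I understand the IR) that a call to an external
function refers to a declaration inside the calling module itself, and
that linking matches declarations to definitions by name.

```cpp
#include <set>
#include <string>

// Toy stand-in for a module. This is NOT llvm::Module.
struct ToyModule {
  std::set<std::string> defined;  // functions that have bodies
  std::set<std::string> declared; // external functions referenced by calls
};

// Cloning copies names only; the clone keeps no pointer to the original.
ToyModule cloneToy(const ToyModule &m) { return m; }

// Link src into dst, resolving declarations against definitions by name.
void linkToy(ToyModule &dst, const ToyModule &src) {
  for (const std::string &name : src.defined)
    dst.defined.insert(name);
  for (const std::string &name : src.declared)
    dst.declared.insert(name);
  // Any declaration that now has a matching definition is resolved.
  for (const std::string &name : dst.defined)
    dst.declared.erase(name);
}
```

Under those assumptions B' never refers to A or A' directly, only to a
name, so linking B' with A' resolves the call without any extra
rewriting step and leaves the original A and B untouched.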