2nd attempt for a working patch for bug 2606

This is the second version of a patch, which I recently attached to bug 2606, whose original version was
modified to reflect the lists comments. Also please note the comment at the end of this email, which basically
questions whether this bug is really a bug.

  1. To solve the foreign Module GlobalVariable problem, I modified JIT::getOrEmitGlobalVariable(…) to
    directly attempt to map a found “external” GlobalVariable.
  2. To solve the foreign Module Function problem,
    I modified JIT.{h,cpp} to emit a stub for a found “external” Function instance, when that foreign Module
    defined Function has not yet been compiled.

All areas of interest are marked with // GMV Mod:

Again I invite those members of the list that have time, to review this proposal for comments/changes.

Major mods:

JIT.{h,cpp}:

  1. I included Module.h.

  2. JIT::getPointerToFunction(…) was modified to search for a non-mapped function in all “other” modules.
    If found a stub is emitted in the same manner as other in module functions found. This stub gets re-emitted,
    toward the end of the JIT cycle.

One issue here is that functions that are external to all modules, have their mappings delayed by the
above loop. I also kept the hasAvailableExternallyLinkage() the same as before, as I’m not sure if this
should also go through the same module procedure. The stub emission will only take place for function
definitions that are marked with external linkage. Should other cases be allowed? Should visibility also
be considered?

  1. JIT::getOrEmitGlobalVariable(…) was modified to search for non-mapped global variables in all “other”
    modules. If found, the address of the foreign global variable is requested, and mapped. There is no
    delay here.

An attempt is first made to find the global outside of all modules, before attempting to find the function
in the known modules. This is the reverse of logic used in #2. On searching, unlike for functions, a linkage
check was not made. Also the global variable declaration and definition must have the exact same type.
Is this the correct approach?

  1. The declaration of void* forceEmitFunctionStub(Function *F) was added to the class JIT. This is
    implemented in JITEmitter.cpp.

JITEmitter.cpp:

  1. JIT::forceEmitFunctionStub(…) was added. It turns around and calls JITResolver::getLazyFunctionStub(…) which
    emits the foreign function as a stub.

Beyond any issues with the patch, there is a question, in my mind, as to whether 2606 is really a bug. Sure its resolution makes
using the JIT simpler for cross module behavior, but current manual solutions may be more fined grained in their approach in
deciding how to treat such functions. If true a fix to 2606 would circumvent such handling. Should we instead add a new
semantic to EngineBuilder which would configure the JIT, or whatever else, to handle such cross module linkage, if such behavior
is warranted? By linkage I mean mapping variables, and functions found in “external” modules, into the execution engine. For example,
one could devise a new function pass style metaphor which would be setup in the EngineBuilder instance. If it exists, the resulting
JIT instance would use this new pass style to determine what to do with “foreign functions, and variables”. However the first question
is, whether or not 2606 is a bug in the first place?

Thanks again for any time spent on this

Garrison

PR2606.proposal.patch (6.8 KB)

Hi Garrison,

I finally come back from holidays and take time to watch your patch.

I must say that I largely prefer this version over the previous one ! I like the reuse of getLazyFunctionStub, but I don’t know if the forceEmitFunctionStub is still needed ?

I thought about JIT and modules, and I wonder if we don’t need to take it another way.
Now we can create multiples JIT. What if we decide to simplify the JIT in allowing only one module per instance of JIT ? It will simplify greatly the code of the JIT.

If we need to jit 2 differents modules, we can create 2 JITs. If one module needs another module, we can create a LazyFunctionCreator (called by JIT::getPointerToNamedFunction) and easily resolved cross-module references.

What do you think ? Anyone have an opinion on this ?

Olivier.

Hi Garrison,

I finally come back from holidays and take time to watch your patch.

I must say that I largely prefer this version over the previous one ! I like
the reuse of getLazyFunctionStub, but I don't know if the
forceEmitFunctionStub is still needed ?

I thought about JIT and modules, and I wonder if we don't need to take it
another way.
Now we can create multiples JIT. What if we decide to simplify the JIT in
allowing only one module per instance of JIT ? It will simplify greatly the
code of the JIT.

If we need to jit 2 differents modules, we can create 2 JITs. If one module
needs another module, we can create a LazyFunctionCreator (called by
JIT::getPointerToNamedFunction) and easily resolved cross-module references.

The LazyFunctionCreator has to guarantee that it returns non-null for
existing functions or the JIT will abort. If you have recursion across
modules, and the LazyFunctionCreator tries to compile them, that'll
either fail or try to compile recursively, which also fails an
assertion. So I think the single JIT dealing with multiple modules
will work better.

Hi Olivier,

Hi Garrison,

I finally come back from holidays and take time to watch your patch.

I must say that I largely prefer this version over the previous one ! I like the reuse of getLazyFunctionStub, but I don’t know if the forceEmitFunctionStub is still needed ?

JIT::forceEmitFunctionStub(…) was created to bring its implementation into JITEmitter file scope as JITResolver and
therefore JITResolver::getLazyFunctionStub(…) are members of an anonymous namespace in that scope. However
I can instead use a pre-existing method: getPointerToFunctionOrStub(…) which is declared in ExecutionEngine
and overwritten in JIT (within JITEmitter file scope). Your response prompted my search. I’ve unit tested this and it works great.
There is one caveat though. The implementation of JIT::getPointerToFunctionOrStub(…) first invokes ExecutionEngine::getPointerToGlobalIfAvailable(…)
behavior to check whether or not the Function instance is already actualized. Subsequently JIT::getPointerToFunctionOrStub(…)
calls JITResolver::getLazyFunctionStub(…). Given that ExecutionEngine::getPointerToGlobalIfAvailable(…) iterates through all known
actualized functions, we would take a compile time hit here by using getPointerToFunctionOrStub(…) versus my forceEmitFunctionStub(…)
since we know at that point in the call stack that we are already dealing with a declaration. I don’t know if this is a viable concern,
but if not I’d be happy to change the 2606 working patch to reflect use of getPointerToFunctionOrStub(…) in place of forceEmitFunctionStub(…).
On the pro side, using getPointerToFunctionOrStub(…) does get rid of introducing a new method (forceEmitFunctionStub(…)), which will not
have the same future design support as getPointerToFunctionOrStub(…). We could also modify ExecutionEngine::getPointerToGlobalIfAvailable(…) to
avoid the search if the function is a declaration. I personally don’t care for this approach.

I thought about JIT and modules, and I wonder if we don’t need to take it another way.
Now we can create multiples JIT. What if we decide to simplify the JIT in allowing only one module per instance of JIT ? It will simplify greatly the code of the JIT.

If we need to jit 2 differents modules, we can create 2 JITs. If one module needs another module, we can create a LazyFunctionCreator (called by JIT::getPointerToNamedFunction) and easily resolved cross-module references.

I can’t comment on this as I have not dealt with multiple JITs as you and Jeffrey have (for those reading this thread after the fact
see Jeffrey’s follow up response).

Like you though I am favoring a solution where the factory nature of EngineBuilder is leveraged. In my view we should be able
to create a desired ExecutionEngine via this pattern. In reality though, there are two existing factory semantics used for the
construction of ExecutionEngines. One is the one given by the API in EngineBuilder, where the type of ExecutionEngine is
determined by the a priori EngineBuilder::Kind. The other is contained by the manual header file include mechanism; including
llvm/include/ExecutionEngine/JIT.h and or llvm/include/ExecutionEngine/ExecutionEngine.h. Currently both of these methods
are used in conjunction to determine the final outcome. If however one wanted to include a derived class of JIT to optionally bring
into being my behavioral fix for 2606, one would have to deal with these methods. I personally would create a new inlined virtual
declaration in JIT, and use it in JIT::getPointerToFunction(…), in place of my 2606 mod, to effect the new cross module functionality
in this new JIT derived class instance. A simple hack resolution of “resolve these methods” implies leveraging another include file
to set ExecutionEngine::JITCtor while simultaneously handling JIT.cpp’s static setting of this same variable. A better solution would
be to redesign EngineBuilder to use say dlopen/dlload functionality to dynamically pull in the proper JIT derived class library.
Anyway this is my 2 cents.

What do you think ? Anyone have an opinion on this ?

Olivier.

Garrison

[sidenote: Please try to avoid extraneous whitespace and line wrapping changes in your patches. It makes it harder to see what you’re actually changing]

Hi Jeffrey,

[sidenote: Please try to avoid extraneous whitespace and line wrapping changes in your patches. It makes it harder to see what you’re actually changing]

Sorry just saw some preexisting code was not in 80 columns.

Hi Jeffrey,

[sidenote: Please try to avoid extraneous whitespace and line wrapping
changes in your patches. It makes it harder to see what you're actually
changing]

Sorry just saw some preexisting code was not in 80 columns.

Hi Olivier,

Hi Garrison,

I finally come back from holidays and take time to watch your patch.

I must say that I largely prefer this version over the previous one ! I
like the reuse of getLazyFunctionStub, but I don't know if the
forceEmitFunctionStub is still needed ?

JIT::forceEmitFunctionStub(...) was created to bring its implementation
into JITEmitter file scope as JITResolver and
therefore JITResolver::getLazyFunctionStub(...) are members of an
anonymous namespace in that scope. However
I can instead use a pre-existing method: getPointerToFunctionOrStub(...)
which is declared in ExecutionEngine
and overwritten in JIT (within JITEmitter file scope). ...

Have you tried the existing pending functions mechanism? I don't want to
introduce stubs in eager compilation mode, and I don't think you have to.

I'm not sure I understand as forceEmitFunctionStub(...) merely
calls getLazyFunctionStub(...) which uses the pending function mechanism.
The
stubs are replaced at compile (JIT) time not at runtime. Is this not how
call sites are currently treated? Is this what you are referring to?

Ah, I was confused. You're right, getLazyFunctionStub is the
badly-named function you want.

You currently have nearly the same loop in getOrEmitGlobalVariable()
and getPointerToFunction(). Can you unify those, perhaps by making
them searches for GlobalValues with getNamedValue()?

GetLinkageResult() in lib/Linker/LinkModules.cpp has a lot of logic
about choosing a function based on its linkage. You should read that
to figure out how to filter GVs with various linkages in your search.

Please add some tests for cross-module JITting. You'll find some of
the functions defined in unittests/ExecutionEngine/JIT/JITTest.cpp
helpful, but I think this belongs in a separate file. Just copy the
useful functions, and at some point I'll build an actual library for
the helper functions.

Major mods:
JIT.{h,cpp}:
2) JIT::getPointerToFunction(...) was modified to search for a non-mapped
function in all "other" modules.
If found a stub is emitted in the same manner as other in module functions
found. This stub gets re-emitted,
toward the end of the JIT cycle.
One issue here is that functions that are external to all modules, have
their mappings delayed by the
above loop.

I suspect you can pull the decision of how to emit a stub back into
the JITEmitter. If you add a
"Function*findDefiningFunction(Function*)" method to the JIT, which
searches the Modules for the strongest definition of a same-named
function, you can keep the logic for how to deal with compiling that
function in the JITEmitter.

I also kept the hasAvailableExternallyLinkage() the same as
before, as I'm not sure if this
should also go through the same module procedure.

If something's available_externally in the current module and external
in another module, I think we should compile the external definition.
If it's available_externally in all modules, we need to look with
dlsym(). Please add a test for that.

The stub emission will
only take place for function
definitions that are marked with external linkage. Should other cases be
allowed? Should visibility also
be considered?

Yes, I think you should do the same thing LinkModules.cpp does when
it's possible.

3) JIT::getOrEmitGlobalVariable(...) was modified to search for non-mapped
global variables in all "other"
modules. If found, the address of the foreign global variable is requested,
and mapped. There is no
delay here.

Could you add a test in which M1::F1 uses M2::GVar1, which contains in
its initializer the address of M1::F2?

An attempt is first made to find the global outside of all modules, before
attempting to find the function
in the known modules. This is the reverse of logic used in #2.

That doesn't seem right.

On searching,
unlike for functions, a linkage
check was not made. Also the global variable declaration and definition must
have the exact same type.
Is this the correct approach?

They may not have exactly the same type because of opaque types in
each module. We probably shouldn't go to the trouble of unifying the
types between the modules, so I'd be inclined to just ignore the
types. Or maybe only pay attention when they're concrete types?

4) The declaration of void* forceEmitFunctionStub(Function *F) was added
to the class JIT. This is
implemented in JITEmitter.cpp.

This isn't quite the right name, but I think you can eliminate it
entirely (see above).

Beyond any issues with the patch, there is a question, in my mind, as to
whether 2606 is really a bug.

I suspect it is.

Sure its resolution makes
using the JIT simpler for cross module behavior, but current manual
solutions may be more fined grained in their approach in
deciding how to treat such functions. If true a fix to 2606 would circumvent
such handling.

If I understand correctly, the current manual solution is to figure
out what functions the JIT is going to want from another module, and
compile them first. That would continue to work even after the JIT
links modules for you.

Should we instead add a new
semantic to EngineBuilder which would configure the JIT, or whatever else,
to handle such cross module linkage, if such behavior
is warranted? By linkage I mean mapping variables, and functions found in
"external" modules, into the execution engine. For example,
one could devise a new function pass style metaphor which would be setup in
the EngineBuilder instance. If it exists, the resulting
JIT instance would use this new pass style to determine what to do with
"foreign functions, and variables".

This strikes me as overkill until someone actually needs it.

Thanks,
Jeffrey

Hi Jeffrey,

I'm still in the process of learning the unittest technology. While I did get distracted with
test-suite and test, and therefore the requisite installs of llvm-gcc, and DejaGNU, I think
I'm back on track now. So that was a waste of time. By the way the reference I have to the
unittests target was in an archive for this list. Is there no llvm level doc for this? Regardless
the gtest.h is fairly easy to understand. It just seems like one can get distracted by the
lack of llvm doc. on unittests when a corresponding doc exists for the testing infrastructure.
The developers guide did not seem to contain anything either. If there is such a doc, could
you give me a link?

You currently have nearly the same loop in getOrEmitGlobalVariable()
and getPointerToFunction(). Can you unify those, perhaps by making
them searches for GlobalValues with getNamedValue()?

GetLinkageResult() in lib/Linker/LinkModules.cpp has a lot of logic
about choosing a function based on its linkage. You should read that
to figure out how to filter GVs with various linkages in your search.

When I'm able to write a unit test, I'll re-submit a patch which will contain
a JIT function to handle GlobalValues (both GlobalVariables and Functions).
I did this for that CrossModuleJIT experiment that I posted in another thread.
Anyway I'll start from there, and then proceed to using LinkModules.cpp logic.

[snip]

Please add some tests for cross-module JITting. You'll find some of
the functions defined in unittests/ExecutionEngine/JIT/JITTest.cpp
helpful, but I think this belongs in a separate file. Just copy the
useful functions, and at some point I'll build an actual library for
the helper functions.

Working on this now.

[snip]

I suspect you can pull the decision of how to emit a stub back into
the JITEmitter. If you add a
"Function*findDefiningFunction(Function*)" method to the JIT, which
searches the Modules for the strongest definition of a same-named
function, you can keep the logic for how to deal with compiling that
function in the JITEmitter.

Will do

I also kept the hasAvailableExternallyLinkage() the same as
before, as I'm not sure if this
should also go through the same module procedure.

If something's available_externally in the current module and external
in another module, I think we should compile the external definition.
If it's available_externally in all modules, we need to look with
dlsym(). Please add a test for that.

Will do

The stub emission will
only take place for function
definitions that are marked with external linkage. Should other cases be
allowed? Should visibility also
be considered?

Yes, I think you should do the same thing LinkModules.cpp does when
it's possible.

Yes I'll create one function to treat all GlobalValues and then reuse LinkModules
logic. I've yet to look at LinkModules.cpp

[snip]

Could you add a test in which M1::F1 uses M2::GVar1, which contains in
its initializer the address of M1::F2?

Will do

An attempt is first made to find the global outside of all modules, before
attempting to find the function
in the known modules. This is the reverse of logic used in #2.

That doesn't seem right.

Ok I'll check on my understanding. On thinking about it, I think I see how I'm
mistaken.

On searching,
unlike for functions, a linkage
check was not made. Also the global variable declaration and definition must
have the exact same type.
Is this the correct approach?

They may not have exactly the same type because of opaque types in
each module. We probably shouldn't go to the trouble of unifying the
types between the modules, so I'd be inclined to just ignore the
types. Or maybe only pay attention when they're concrete types?

Ok, I'll modify accordingly if I understand how.

4) The declaration of void* forceEmitFunctionStub(Function *F) was added
to the class JIT. This is
implemented in JITEmitter.cpp.

This isn't quite the right name, but I think you can eliminate it
entirely (see above).

Yup

Beyond any issues with the patch, there is a question, in my mind, as to
whether 2606 is really a bug.

I suspect it is.

Cool, then we are moving forward.

Sure its resolution makes
using the JIT simpler for cross module behavior, but current manual
solutions may be more fined grained in their approach in
deciding how to treat such functions. If true a fix to 2606 would circumvent
such handling.

If I understand correctly, the current manual solution is to figure
out what functions the JIT is going to want from another module, and
compile them first. That would continue to work even after the JIT
links modules for you.

Yes, in re-thinking this, I believe you are correct. I'm not sure about
all the use cases though, but you have much more knowledge on this
than I do. So I'll ignore and proceed.

Should we instead add a new
semantic to EngineBuilder which would configure the JIT, or whatever else,
to handle such cross module linkage, if such behavior
is warranted? By linkage I mean mapping variables, and functions found in
"external" modules, into the execution engine. For example,
one could devise a new function pass style metaphor which would be setup in
the EngineBuilder instance. If it exists, the resulting
JIT instance would use this new pass style to determine what to do with
"foreign functions, and variables".

This strikes me as overkill until someone actually needs it.

Yes, most definitely

Thanks,
Jeffrey

Thanks again. By the way if I'm too slow on this, don't hesitate to jump in. I have to
learn this process anyway, and this bug is a good way to do it. So I'll proceed in my
own private space even if you, Oliver, or someone else shotguns ahead.

I do have one question though. After having done the fix and the appropriate unit tests,
does one still need to add to the tests suite for the testing bots, and add to test for make
check (for this kind of bug fix)? Do the unittests somehow get invoked by make check? I
guess I don't really understand the relation between the tests exercised by the unittests
target, and the tests exercised by the check target. Maybe there isn't any.

Garrison

The unit tests do get run by `make check-lit` and by the build bots. I
suspect they're not run by `make check`, but check-lit is faster
anyway. A unit test is sufficient for this change (and maybe all you
can possibly write given that it requires multiple modules in the same
JIT, which lli isn't set up to do.)

Sorry they aren't better documented.

Hi Jeffrey,

The attached patch contains what I guess are the first round of unit tests for this effort, which mostly came from
existing attachments to 2606. I’ve left the current solution to 2606 pretty much the same as before. It can be viewed
as a place holder for incorporating in LinkModule.cpp type logic in the next patch. Most likely the inclusion of this logic
will require a further rearrangement of the real solution across JIT and JITEmitter as compared to what is showed.

Three unit tests are currently provided in a new MultiModuleJITTest.cpp, and are more complicated versions of the
following:

  1. M1::F1 → M2::F2
  2. M1::F1 → M2::G
  3. M1::F1 → M1::F2 via M2::G holding pointer to M1::F2

where M::Fa ==> function a in module M
→ ==> calls
M::G ==> global variable G in module M

The artificial solution is not documented and is marked with GMV Mod: as before. Beyond the lack of a real LinkModule
type solution, I still have not fully provided what your previous email discussed, but I believe we will slowly approach
this limit with iteration. Also as you can see, I tend to use more formatting then most. So if you find this uncomfortable,
I will flatten where needed.

Should I add this patch as an attachment to bug 2606?

Thanks for looking over this.

Garrison

PS: It will be a little while before I get back to this effort (3 days or so I believe), which unless redirected otherwise, will be
focussed on replacing the artificial solution with similar logic contained in LinkModule::GetLinkageResult(…).

PR2606.preliminary_with_unittests.patch (17.3 KB)