ORC JIT - Can modules be independently managed with one LLJIT instance? + problems with ExecutionSession.lookup

Hi Lang,

Thank you for your answer! This helped me again a lot!! Also that ResourceTracker is a really neat feature! Looking forward to it! :3

I changed the title cause… there is another issue I have (sorry about that…)

I’m finally allowed to investigate the ORC JIT for integration into our system, which means I got a few days to actually play around with it. However, another problem arose which breaks my concept. It is the never-ending story of “cross references”. I again have two Modules, which are added to two different LLJIT instances, but they reference each other. In the past you suggested that I use the lookup function of the ExecutionSession to get the addresses, so I wrote this:

bool ModuleLoader::resolve()
{
    auto &ES = this->jit->getExecutionSession();
    SymbolLookupSet lookupSet;

    this->undefinedReferences.clear();

    for(const auto &element : this->symbolsOfInterrest)
    {
        lookupSet.add(element.second.name, llvm::orc::SymbolLookupFlags::WeaklyReferencedSymbol);
    }

    auto result = ES.lookup({{&jit->getMainJITDylib(), llvm::orc::JITDylibLookupFlags::MatchAllSymbols}}, lookupSet, llvm::orc::LookupKind::Static, llvm::orc::SymbolState::Resolved);

    if(result)
    {
        for(const auto &element : *result)
        {
            const llvm::StringRef &name = (*element.first);
            const size_t hash = calculateHash(name);

            printf(">>>%s @ 0x%p\n", name.data(), element.second.getAddress());

            this->symbolsOfInterrest[hash].adr = element.second.getAddress();
        }
    }
    else
    {
        ES.reportError(result.takeError());
    }

    this->mtx.unlock();

    return (this->undefinedReferences.size() == 0ull);
}

I also attached a reporter to the ES which handles “llvm::orc::SymbolsNotFound” by copying the SymbolVector to “undefinedReferences”. If I call this function and every symbol is resolved, I can use the addresses to actually execute the code. That is great! However, when I have an undefined reference, things get strange… The first call triggers my “tryToGenerate” function, but it is not able to resolve a certain symbol. The reporter is triggered and the “undefinedReferences” SymbolVector has size 1.
When I call the function a second time, however, “tryToGenerate” is not called anymore, so my vector is empty and the undefined reference is still not resolved, but I return true and crash my program. So even if I had an address for that one symbol in the second run, I would have no chance to tell “anyone”, because “tryToGenerate” was never called… The function looks like this:

llvm::Error ReferenceManager::UndefinedReferenceResolver::tryToGenerate(llvm::orc::LookupKind K, llvm::orc::JITDylib &JD, llvm::orc::JITDylibLookupFlags JDLookupFlags, const llvm::orc::SymbolLookupSet &LookupSet)
{
    llvm::orc::SymbolNameVector notFound;
    llvm::orc::SymbolMap NewSymbols;

    for(const auto &name : LookupSet)
    {
        printf("Generate!\n");

        const uintptr_t adr = UndefinedReferenceResolver::lookup((*name.first).data());

        if(adr)
        {
            NewSymbols[name.first] = llvm::JITEvaluatedSymbol(adr, llvm::JITSymbolFlags::Absolute);
        }
        else
        {
            notFound.push_back(name.first);
        }
    }

    JD.define(absoluteSymbols(std::move(NewSymbols)));

    return (notFound.size() == 0) ? llvm::Error::success() : llvm::make_error<llvm::orc::SymbolsNotFound>(std::move(notFound));
}

When I use “llvm::orc::JITDylibLookupFlags::MatchExportedSymbolsOnly”, I get 1 undefined reference in the first run but a total of 9 in the second run, because every symbol I wanted to look up is now an undefined reference.

Thank you for the help in advance!

Kind greetings

Björn

Hi Bjoern,

If you had the removable code feature, could you merge your LLJIT instances and just have one instance with multiple JITDylibs? That will make your life much easier. Interdependencies between modules in different ExecutionSessions are dangerous: dependencies are not tracked, and it will be easy for everything to look as if it’s working, but fail with thread scheduling bugs. E.g. Thread 1 assigns an address to Symbol 1 but yields before setting the memory permissions. Thread 2 is scheduled and sees that Symbol 1 has been resolved, so it proceeds to link Symbol 2 against it, set Symbol 2’s memory permissions, then jump to it. The program will crash when execution of JIT’d code reaches Symbol 1.

To be clear: Cross JIT references are ok as long as the dependencies form a DAG. If there are any cycles then you’re in trouble.
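As a rough illustration of the DAG condition, the sketch below checks a hand-written dependency graph for cycles with a depth-first search. It is a standard-library toy, not ORC code: the `DepGraph` type, the instance names, and the `isDAG` helper are all hypothetical, and an edge A -> B is assumed to mean "code in instance A references symbols defined in instance B".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model (not part of the ORC API): each JIT instance is a name,
// and an edge A -> B means "code in A references symbols defined in B".
using DepGraph = std::map<std::string, std::vector<std::string>>;

enum class Mark { InProgress, Done };

// Depth-first search; reaching a node that is still in progress means we
// followed a back edge, i.e. the references form a cycle.
static bool hasCycleFrom(const DepGraph &G, const std::string &Node,
                         std::map<std::string, Mark> &Marks) {
  auto Seen = Marks.find(Node);
  if (Seen != Marks.end())
    return Seen->second == Mark::InProgress;

  Marks[Node] = Mark::InProgress;
  auto Deps = G.find(Node);
  if (Deps != G.end())
    for (const std::string &Dep : Deps->second)
      if (hasCycleFrom(G, Dep, Marks))
        return true;
  Marks[Node] = Mark::Done;
  return false;
}

// Returns true if the cross-instance references form a DAG.
bool isDAG(const DepGraph &G) {
  std::map<std::string, Mark> Marks;
  for (const auto &Entry : G)
    if (hasCycleFrom(G, Entry.first, Marks))
      return false;
  return true;
}
```

With two instances that reference each other, as in the setup described above, `isDAG` returns false, which is the case that gets you in trouble.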

I’d strongly recommend switching to one LLJIT instance and using removable JITDylibs to solve this.

If you want to continue with multiple LLJIT instances my comments/recommendations would be:
(1) Don’t use error handling to track the unresolved symbols – there are places in ORC where we only report the first missing symbol rather than all of them.
(2) You shouldn’t issue a call to ExecutionSession::lookup from inside tryToGenerate: The tryToGenerate method is called under the session lock, but ExecutionSession::lookup should only be called outside the lock. Instead of issuing the lookup in place you should write a custom MaterializationUnit and issue the lookup from its materialize method.

– Lang.

Hey Lang,

I would be really happy to have only one LLJIT instance and use multiple JITDylibs. However… it seems that I don’t know enough to use them. So I wonder…

Hi Bjoern,

  1. When I add Module A to JITDylib A and Module B to JITDylib B, where will those look for undefined symbols? Will Module A, for example, only search itself and the main JITDylib, or would it also search JITDylib B?

Searches follow the containing JITDylib’s Link Order. E.g.

auto &LibA = ExitOnErr(LLJIT.getExecutionSession().createJITDylib("LibA"));
auto &LibB = ExitOnErr(LLJIT.getExecutionSession().createJITDylib("LibB"));
LibA.addToLinkOrder(LibB); // Add LibB to LibA’s search order.

Code added to LibA will resolve external symbols by first looking in LibA, then in LibB.

In general external references are resolved by searching each element of the containing JITDylib’s link order. Each JITDylib appears at the start of its own search order by default when you create it.
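To make the search rule concrete, here is a small standard-library model of it. This is only a sketch: `ToyDylib` and `resolve` are hypothetical stand-ins rather than the ORC classes, addresses are plain integers, and the model assumes that only the libraries listed in the link order are searched, in the order they were added.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a JITDylib: a symbol table plus a link order.
struct ToyDylib {
  std::string Name;
  std::map<std::string, std::uint64_t> Symbols;
  std::vector<const ToyDylib *> LinkOrder;

  explicit ToyDylib(std::string N) : Name(std::move(N)) {
    LinkOrder.push_back(this); // each library starts its own link order
  }
  // The link order stores `this`, so copying would leave a stale pointer.
  ToyDylib(const ToyDylib &) = delete;
  ToyDylib &operator=(const ToyDylib &) = delete;

  void addToLinkOrder(const ToyDylib &Other) { LinkOrder.push_back(&Other); }
};

// Resolves an external reference made by code that lives in `From` by
// walking From's link order front to back. Returns false if nothing in the
// link order defines the symbol.
bool resolve(const ToyDylib &From, const std::string &Sym,
             std::uint64_t &Addr) {
  for (const ToyDylib *JD : From.LinkOrder) {
    auto It = JD->Symbols.find(Sym);
    if (It != JD->Symbols.end()) {
      Addr = It->second;
      return true;
    }
  }
  return false;
}
```

In this model, if both LibA and LibB define `foo`, a reference from code in LibA picks LibA's definition because LibA comes first in its own link order; a symbol defined only in LibB is found on the second step.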

  2. If my current approach of using tryToGenerate and ES.lookup is not correct, how would I do this instead? Our cross references don’t carry the exact symbol names they should be resolved to. Our modules are loaded into a tree structure, so you can use the symbol names to navigate that tree. This is why I don’t want LLJIT to resolve those symbols automatically: it can’t know about our structure. How could I achieve this, especially since it is not so straightforward to find the symbols that are not resolved yet?
    For example: Module A might reference Planschi_test. Module B was loaded with the name “Planschi” and has a variable called test, so our old loader would resolve the “Planschi_test” reference to that variable’s address.

I can think of a few ways to approach this:

A. Rename variables/functions at the IR level or higher. E.g. Rename ‘test’ to ‘Planschi_test’ in IR. This is probably the simplest scheme if your use-case allows.

B. Use a DefinitionGenerator and the re-exports utility: When you see an external reference to M_N you would create a re-export of N → (JITDylibForM, N). This is ok, but you’ll need to manage extra JITDylibs to ensure that no two JITDylibs end up containing a duplicate definition (e.g. ‘test’).

C. Rename variables/functions at the JITLink layer. This will only be available once there is a JITLink implementation for your platform. It’s like option A, but means that the mangling isn’t visible in objects dumped to disk. This could be useful if you want to re-use the objects in other contexts where the original names are relevant (e.g. linking on the command line).
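Whichever option is used, something has to map a reference like Planschi_test to (module Planschi, symbol test). A minimal sketch of that mapping follows, using only the standard library. The helper names and the `ModuleTable` type are hypothetical, and the sketch assumes the module name never contains an underscore, so the split happens at the first one; a real tree-structured loader would need its own rule.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical table: module name -> (symbol name -> address).
using ModuleTable =
    std::map<std::string, std::map<std::string, std::uint64_t>>;

// Splits "Module_symbol" at the first underscore (assumption: module names
// contain no underscore). Returns false if either part would be empty.
bool splitQualifiedName(const std::string &Qualified, std::string &Module,
                        std::string &Symbol) {
  const std::string::size_type Pos = Qualified.find('_');
  if (Pos == std::string::npos || Pos == 0 || Pos + 1 == Qualified.size())
    return false;
  Module = Qualified.substr(0, Pos);
  Symbol = Qualified.substr(Pos + 1);
  return true;
}

// Returns the address for a qualified reference, or 0 if the name cannot be
// split or the module/symbol is unknown.
std::uint64_t lookupQualified(const ModuleTable &Modules,
                              const std::string &Qualified) {
  std::string Module, Symbol;
  if (!splitQualifiedName(Qualified, Module, Symbol))
    return 0;
  auto ModIt = Modules.find(Module);
  if (ModIt == Modules.end())
    return 0;
  auto SymIt = ModIt->second.find(Symbol);
  return SymIt == ModIt->second.end() ? 0 : SymIt->second;
}
```

For option B, a definition generator could run this kind of split on each requested name and then create the corresponding re-export; that part is left out here because it depends on the ORC API.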

  3. Is there a way to look up the addresses in Module A while it still has undefined references, which I am allowed to provide at a later point?

Definition generators are the solution to this. You can get the address of symbols defined by Module A while A still contains undefined external references, however definitions for those undefined external references must be supplied by a definition generator or Module A will immediately fail linking.
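The idea of a generator as a fallback can be modelled without any LLVM types. The sketch below is a toy under stated assumptions, not the ORC `DefinitionGenerator` interface: `GenLibrary`, `ToyGenerator`, and `lookupAll` are hypothetical names, a generator is just a callback that returns a nonzero address when it can supply a symbol, and missing names are collected explicitly in a set instead of being recovered from an error report.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <vector>

using Address = std::uint64_t;

// Hypothetical generator: returns a nonzero address if it can provide the
// named symbol, or 0 if it cannot.
using ToyGenerator = std::function<Address(const std::string &)>;

// Toy library: already-defined symbols plus fallback generators.
struct GenLibrary {
  std::map<std::string, Address> Symbols;
  std::vector<ToyGenerator> Generators;
};

// Looks up every requested name. Names not yet defined are offered to each
// generator in turn; a successful generator result is added to the library.
// Found symbols go into `Found`, and the names nobody could supply are
// returned, so the caller sees all of them rather than only the first.
std::set<std::string> lookupAll(GenLibrary &JD,
                                const std::vector<std::string> &Names,
                                std::map<std::string, Address> &Found) {
  std::set<std::string> Missing;
  for (const std::string &Name : Names) {
    auto It = JD.Symbols.find(Name);
    if (It == JD.Symbols.end()) {
      for (const ToyGenerator &Gen : JD.Generators) {
        if (Address A = Gen(Name)) {
          It = JD.Symbols.emplace(Name, A).first;
          break;
        }
      }
    }
    if (It != JD.Symbols.end())
      Found[Name] = It->second;
    else
      Missing.insert(Name);
  }
  return Missing;
}
```

Here the caller gets both the resolved addresses and the full set of unresolved names from a single call, which is the bookkeeping that recommendation (1) above suggests doing outside of error handling.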

– Lang.

Hi Lang,

I think I’m starting to get an idea of how we could use LLVM in our system… so I’ll try repeating what I understood:

Searches follow the containing JITDylib’s Link Order. E.g.

auto &LibA = ExitOnErr(LLJIT.getExecutionSession().createJITDylib("LibA"));
auto &LibB = ExitOnErr(LLJIT.getExecutionSession().createJITDylib("LibB"));
LibA.addToLinkOrder(LibB); // Add LibB to LibA’s search order.

That means that if I don’t add LibB to LibA’s link order, LibB will not be searched, right?
However, if I add LibB to LibA and then add LibA to LibC, will LibC also search LibB?

LibA.addToLinkOrder(LibB);
LibC.addToLinkOrder(LibA);