How to do bitcode archive linking correctly?

Hi,

We're currently upgrading KLEE to work with LLVM >=3.3 and we've hit a problem.

It seems r172749 removed support for linking a bitcode archive into a
module. KLEE unfortunately depends on this to link in its runtime
(which, amongst other things, provides a C library [5]).

A first attempt at linking in a bitcode archive ourselves can be seen in [1].

This approach does not work correctly: it's incredibly slow, and I'm
not convinced it will resolve all undefined or weak symbols the way
linking with an archive normally should. This leads me to ask:

* How does Linker::LinkModules() [2] behave? Does it just blindly
concatenate modules (that might explain the slowdown), or does it
actually check the symbol tables of the modules and only link in `Src`
if `Dest` has unresolved or weak symbols that `Src` can provide?

* If Linker::LinkModules() does decide to merge `Src` into `Dest`, does
it merge the entire contents of the `Src` module or just the symbols
that are needed by `Dest`?

* IIRC, when linking archives a linker must link constituent
modules multiple times until a fixed point is reached (the set of
undefined symbols does not change). Does that apply here? I guess the
answer depends on the behaviour of Linker::LinkModules(). If it does
apply, that would make [1] incorrect, as it only links in the archive's
modules once.

* Why is the approach so slow compared to using the old
Linker::linkInFile() API in prior LLVM versions?

Whatever the answers are to the above, the approach in [1] isn't
right. So what is the right way to programmatically link in LLVM
bitcode archives into a bitcode module?

I noted that the r172749 commit says...

the "right" way to get this support is to use the
platform-specific linker-integrated LTO mechanisms, or the forthcoming
LLVM linker.

I'm not sure how this helps us:

- For "platform-specific linker-integrated LTO mechanisms", is that
referring to [3]? If so that's not particularly helpful as I'd like an
API I can use to do the archive linking rather than having to invoke
an external program (i.e. my system's linker).
- For "forthcoming LLVM linker", is that referring to lld [4]?

[1] klee/ModuleUtil.cpp at 51d4a2b34511a8d6b00b16a50f90b0bc1a793a69 · MartinNowack/klee · GitHub
[2] http://llvm.org/docs/doxygen/html/classllvm_1_1Linker.html#a244da8c7e9789b1b675b9741bd692c63
[3] http://llvm.org/docs/GoldPlugin.html
[4] http://lld.llvm.org/
[5] https://github.com/ccadar/klee-uclibc/tree/klee_0_9_29

Thanks,
Dan Liew.

What we are using in Mesa is pretty similar to what you are doing in
[1]. Take a look at this mailing list post:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-February/059448.html
and also the change we made in Mesa as a result of this commit:
http://cgit.freedesktop.org/mesa/mesa/commit/src/gallium/state_trackers/clover/llvm/invocation.cpp?id=aa1c734b3ca445b5af743b9bad6a48ca7ba21f3c
for some background on this linker change.

For the implementation details of Linker::LinkModules(), you will
probably need to look through the code to figure out what it is doing.

-Tom

Thanks Tom. Unfortunately we are trying to link in a bitcode archive rather than just a single module so your solution to the LLVM API change won't help here. Any other ideas?

Looking at the code, I think Linker::LinkModules() merges everything from `Src` into `Dest`.
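
If that reading is right, deciding which archive members to link is left to the caller, and the fixed-point question from the original post matters. Below is a minimal sketch of the classic archive-resolution loop as a toy model. It does not use any LLVM API: `Member` and `linkArchive` are hypothetical names invented for illustration, symbols are plain strings, and weak symbols are ignored. A member is pulled in only if it defines something currently undefined, and the scan repeats until a full pass adds nothing.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for one archive member (NOT an llvm::Module): only the
// symbols it defines and the symbols it references but does not define.
struct Member {
  std::string name;
  std::set<std::string> defined;
  std::set<std::string> undefined;
};

// Hypothetical helper: returns the names of the members that were linked,
// in link order. `defined` and `undefined` describe the destination module
// and are updated in place as members are merged.
std::vector<std::string> linkArchive(std::set<std::string> &defined,
                                     std::set<std::string> &undefined,
                                     const std::vector<Member> &archive) {
  std::vector<std::string> linked;
  std::vector<bool> used(archive.size(), false);
  bool changed = true;
  while (changed) { // repeat until a full pass links nothing new
    changed = false;
    for (std::size_t i = 0; i < archive.size(); ++i) {
      if (used[i])
        continue;
      const Member &m = archive[i];
      // A member is needed only if it defines a currently undefined symbol.
      bool needed = false;
      for (const std::string &s : m.defined) {
        if (undefined.count(s)) {
          needed = true;
          break;
        }
      }
      if (!needed)
        continue;
      // Merge the whole member, mirroring an "everything from Src" merge.
      used[i] = true;
      changed = true;
      linked.push_back(m.name);
      for (const std::string &s : m.defined) {
        defined.insert(s);
        undefined.erase(s);
      }
      // The member's own unresolved references may require further members.
      for (const std::string &s : m.undefined) {
        if (!defined.count(s))
          undefined.insert(s);
      }
    }
  }
  return linked;
}
```

Take an archive holding a.bc (defines `helper`), b.bc (defines `printf`, references `helper`) and c.bc (defines `unused`), with a destination module that references only `printf`. A single pass in archive order links just b.bc and leaves `helper` undefined. The second pass picks up a.bc, and c.bc is never linked. This is why linking each member exactly once, as in [1], can miss symbols.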

Hi,

Daniel & Eli, would you be able to comment on the right approach to the problem discussed in this thread, seeing as you both appear to have worked on the commits that removed the ability to link in bitcode archives?

I'd be especially keen to hear your opinion, Daniel: as the original author of KLEE, you probably have a good idea of what would be a suitable replacement (invoking gold with the LLVM LTO plug-in via exec() does not sound very appealing).