Orc JIT vs. STL

Greetings, LLVM wizards.

We are using Clang and Orc JIT (v1) to compile and execute C++ code on the fly. If a C++ module calls functions from external libraries, we add them via DynamicLibrary::LoadLibraryPermanently().

Recently we ran into a problem that occurs when a module calls a function from the STL, in particular this swap() function for input streams:

#include <fstream>

std::ifstream stream1, stream2;
stream1.swap(stream2);

When we run the constructors for the module, we get two undefined symbols. And explicitly adding libstdc++ doesn’t help. It turns out that the missing symbols are defined not in the runtime DSO but in an archive file:

/opt/rh/devtoolset-6/root/usr/lib/gcc/x86_64-redhat-linux/6.3.1/libstdc++.a

So my questions are:

  1. Is there a simple way to get access to all symbols defined in the STL? Intuitively, it seems like we should not need to know about such compiler magic.

  2. If there is no magical solution, is there a way to explicitly add symbols from an archive?

Thanks,

Geoff

Hi,
Did you run the static constructors and destructors? How did you make your process symbols visible to the ORC JIT?
Could you please share which symbols are reported as undefined?

Hi,
Did you run the static constructors and destructors? How did you make your process symbols visible to the ORC JIT?

Yes. It's the constructor that generates the undefined symbol error.
We use DynamicLibrary::LoadLibraryPermanently(nullptr) to add process
symbols.

Could you please share which symbols are reported as undefined?

Certainly! Mangled:

    _ZNSi4swapERSi
    _ZNSt13basic_filebufIcSt11char_traitsIcEE4swapERS2_

And unmangled:

    std::basic_istream<char, std::char_traits<char> >::swap(std::basic_istream<char, std::char_traits<char> >&)
    std::basic_filebuf<char, std::char_traits<char> >::swap(std::basic_filebuf<char, std::char_traits<char> >&)
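
For anyone who wants to map mangled names back to readable ones while debugging this kind of error, the c++filt tool does it on the command line. The same can be done in code. The following is a minimal sketch, assuming a GCC or Clang toolchain that provides <cxxabi.h>; the helper name demangle is made up for illustration, and the exact spelling of the result (for example "std::istream" versus the fully expanded "std::basic_istream<...>") depends on the C++ runtime in use.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (Itanium C++ ABI; GCC and Clang)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`, or the input
// unchanged if the runtime cannot demangle it.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so the caller has to release it with free.
    char* text = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);
    return result;
}
```

Calling demangle("_ZNSi4swapERSi") should yield the istream swap() signature listed above, modulo the spelling differences between runtimes.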

Incidentally, if I call that STL swap() function in the application,
to ensure it is in the process symbols, the second symbol is found,
but the first is still undefined.
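
One way to check which symbols the host process actually exposes, independent of the JIT, is to probe it with dlopen/dlsym. The sketch below assumes Linux, where, as far as I understand, DynamicLibrary::LoadLibraryPermanently(nullptr) is built on the same mechanism; visibleInProcess is a hypothetical helper name, not an LLVM API. On older glibc versions the program may need to be linked with -ldl.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose (POSIX)
#include <string>

// Hypothetical helper: returns true if `name` can be resolved through the
// handle of the running program, which also covers the shared libraries it
// was loaded with.
bool visibleInProcess(const std::string& name) {
    // A null filename yields a handle for the main program.
    void* self = dlopen(nullptr, RTLD_LAZY);
    if (self == nullptr) {
        return false;
    }
    // Treat a null address as "not found"; that is good enough for functions.
    const bool found = dlsym(self, name.c_str()) != nullptr;
    dlclose(self);
    return found;
}
```

Calling visibleInProcess("_ZNSi4swapERSi") from the host application shows whether that symbol is exported at all. Note that a symbol linked statically into the executable is only found this way if it is in the dynamic symbol table, for example when the executable is linked with -rdynamic.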

Hi Geoff,
I tried it, but I was not able to reproduce the problem.

Test Program:
#include <fstream>

int main()
{
    std::ifstream stream1, stream2;
    stream1.swap(stream2);
    return 0;
}

I didn't get an undefined-symbol error. I used the DynamicLibrarySearchGenerator::GetForCurrentProcess API to make the STL symbols visible to the ORC JIT.

You can add symbols from an archive via StaticLibrarySearchGenerator, though it was added only recently.

I don't have the DynamicLibrarySearchGenerator and
StaticLibrarySearchGenerator classes. I should mention that we are
using LLVM 7... Does
DynamicLibrarySearchGenerator::GetForCurrentProcess() do more than
what DynamicLibrary::LoadLibraryPermanently(nullptr) does (or did)?

I am also thinking our problems might be connected to the fact that
gcc 6.3.1 is in a software collection (devtoolset-6) and that there is
some sort of magic going on to mix its STL (the .a file) with the
libstdc++ DSO, which is in /usr/lib64, not in devtoolset-6.

AFAIK, the GetForCurrentProcess method uses getPermanentLibrary rather than LoadLibraryPermanently.

[I am reposting this with a different title and other changes, because I am fairly confident our problem is related to Red Hat’s developer toolset.]

Greetings, LLVM wizards.

We have an application that uses Clang and Orc JIT (v1) to compile and execute C++ code on the fly. If a C++ module calls functions from external libraries, we add them via DynamicLibrary::LoadLibraryPermanently().

Recently we moved to gcc 6.3.1 to build our application, using Red Hat’s devtoolset-6 on CentOS, and this seems to create problems when a module calls functions from the STL. For example, consider a module that calls this swap() function for input streams:

#include <fstream>
std::ifstream stream1, stream2;
stream1.swap(stream2);

When we run the constructors for the module, we get undefined STL symbols. It turns out that the missing symbols are defined not in the runtime DSO (in /usr/lib64) but in an archive file installed with the developer toolset:

/opt/rh/devtoolset-6/root/usr/lib/gcc/x86_64-redhat-linux/6.3.1/libstdc++.a

Apparently the linker performs some magic allowing an application compiled and linked with gcc 6.3.1 to run on a system with an older version of the STL (from gcc 4.8.5), augmenting the old DSO with stuff that is linked in statically.

So my question is: is there any way to reproduce that magical behavior in an Orc JIT compiler, so that code can link properly with the STL?

Thanks,
Geoff

The libstdc++.so file that comes with devtoolset-6 is actually not a DSO but a small text file containing this:

/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
INPUT ( /usr/lib64/libstdc++.so.6 -lstdc++_nonshared )

So the question, I think, is: what exactly does the GNU linker do with this information, and is there a way to replicate its behavior in Orc JIT?

This is a fairly simple linker script that tells the linker to explicitly link /usr/lib64/libstdc++.so.6 and do the equivalent of adding -lstdc++_nonshared to the command line. It sounds as if this is done to enable long-term ABI stability, so anything that is guaranteed to be safe over a long-term release cycle goes in /usr/lib64/libstdc++.so.6 and everything else goes in {some compiler-release specific path}/libstdc++_nonshared.a.

How to solve this depends a bit on whether you want to solve the general case of this or whether you want to special case libstdc++ on RedHat systems. If it's the latter, then the simplest thing to do is link libstdc++_nonshared.a with the --whole-archive flag, to pull all of the symbols from the .a into the main binary. This will let the JIT find them in the main binary and everything should just work.

If you want to solve the general case, then whatever you are using to load .so files needs to learn how to parse linker scripts. There's logic in LLD for doing this, but I don't believe anyone has implemented it in ORC. You'd then need to load all of the things that are referenced by the linker script. You'd probably only need to handle a fairly small subset of the things in linker scripts, because anything beyond 'link these other objects' is unlikely to appear in a .so linker script.

David
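
To illustrate the "small subset" mentioned above, here is a minimal sketch of a parser that extracts only the files and libraries named by INPUT(...) and GROUP(...) commands, including entries nested in AS_NEEDED(...). It is not how LLD or ORC do it; the names ScriptInput and parseLinkerScriptInputs are hypothetical, and everything else in the script (such as OUTPUT_FORMAT) is simply ignored.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One file or library named by an INPUT(...) or GROUP(...) command.
struct ScriptInput {
    std::string name;  // a path, or a library name with the "-l" prefix removed
    bool isLibrary;    // true for "-lfoo" entries, which need a search-path lookup
};

// Replaces every /* ... */ comment with a single space.
std::string stripComments(const std::string& text) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t open = text.find("/*", pos);
        if (open == std::string::npos) {
            out.append(text, pos, std::string::npos);
            break;
        }
        out.append(text, pos, open - pos);
        out.push_back(' ');
        const std::size_t close = text.find("*/", open + 2);
        if (close == std::string::npos) break;  // unterminated comment: drop the rest
        pos = close + 2;
    }
    return out;
}

// Splits on whitespace and commas; parentheses become tokens of their own.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    auto flush = [&] {
        if (!current.empty()) { tokens.push_back(current); current.clear(); }
    };
    for (char c : text) {
        if (c == '(' || c == ')') {
            flush();
            tokens.push_back(std::string(1, c));
        } else if (c == ',' || std::isspace(static_cast<unsigned char>(c))) {
            flush();
        } else {
            current.push_back(c);
        }
    }
    flush();
    return tokens;
}

// Collects the entries of all INPUT/GROUP commands, in script order.
std::vector<ScriptInput> parseLinkerScriptInputs(const std::string& script) {
    std::vector<ScriptInput> inputs;
    const std::vector<std::string> tokens = tokenize(stripComments(script));
    int depth = 0;  // > 0 while inside an INPUT/GROUP argument list
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        const std::string& tok = tokens[i];
        if (depth == 0) {
            const bool opens = i + 1 < tokens.size() && tokens[i + 1] == "(";
            if ((tok == "INPUT" || tok == "GROUP") && opens) {
                depth = 1;
                ++i;  // skip the opening parenthesis
            }
            continue;  // every other top-level token is ignored
        }
        if (tok == "(") {
            ++depth;  // opening parenthesis of a nested AS_NEEDED
        } else if (tok == ")") {
            --depth;
        } else if (tok != "AS_NEEDED") {
            const bool isLib = tok.compare(0, 2, "-l") == 0;
            inputs.push_back({isLib ? tok.substr(2) : tok, isLib});
        }
    }
    return inputs;
}
```

For the devtoolset-6 script quoted earlier, this yields the path /usr/lib64/libstdc++.so.6 and the library name stdc++_nonshared. The shared object could then be handed to whatever loads DSOs today, while the library name still has to be resolved against the toolchain's search directories and the resulting archive linked by something that understands archive members.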