Choosing the unwinder / C++ low-level libraries in Clang

Folks,

We have an option to choose the compiler library called --rtlib which
chooses between libgcc and compiler-rt.

The problem with that is: which unwinder / C++ library goes along with
the compiler library? The main options today are libgcc_s, libgcc_eh,
libunwind (2 versions) and libc++abi. There are probably more.

One could want to test any variation, and for that, one has to
manually include the libraries using the -l option, but there could be
a simpler way to select the basic combos. The main ones:

GCC: gcc_s and gcc_eh
LLVM: libunwind and libc++abi
Windows: ??
Savannah: libunwind(2) and ??

There may be more, but for now, let's start with the first two (GCC and LLVM).

I will de-couple compiler-rt from any of those combos by removing the
automatic inclusion of libgcc_* with compiler-rt (a poor choice of mine),
but I'm now stuck on two things:

1. Name. I want it to be short and to look like --rtlib. So I'm
looking for a short name matching --rt[a-z]{1,4}. "rtunw" or "rtcpp"
would mislead users into thinking it's either one or the other.
"rtextra" means very little and is too long. "rtbase", "rtlow",
"llrt"?

2. Defaults. I think it makes sense to have gcc_* as default when
--rtlib=libgcc, but I'm not convinced that the LLVM counterparts
should be default for compiler-rt, especially on GNU systems. Maybe
the sanest option would be to set the default for GCC and not for
LLVM or Windows.
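
A minimal sketch of the defaulting policy proposed above, assuming nothing about the real Clang driver: the enum and the function name are hypothetical, and the library lists simply mirror the combos named earlier in this message.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the two --rtlib choices discussed here.
enum class RtLib { LibGCC, CompilerRT };

// Returns the low-level libraries (unwinder / EH) linked by default.
// Assumption: libgcc keeps gcc_s and gcc_eh as its default companions,
// while compiler-rt gets no implicit companions at all.
std::vector<std::string> defaultLowLevelLibs(RtLib rt) {
  switch (rt) {
  case RtLib::LibGCC:
    return {"gcc_s", "gcc_eh"};
  case RtLib::CompilerRT:
    return {};
  }
  return {};
}
```

With this policy, picking compiler-rt leaves the unwinder and C++ ABI library for the user (or the toolchain) to name explicitly.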

Opinions?

cheers,
--renato

What about --rtimpl? Defining the implementation of the low-level
functions used in the compiler run-time library?

cheers,
--renato

I’m not sure what problem you’re trying to solve here. These libraries tend to be tightly coupled to the platform, or possibly toolchain, not something that I’d want users to modify (unless they knew what they were doing enough to pass -nostdlib and provide the full set themselves).

On FreeBSD, we don’t explicitly link the C++ ABI library, because libc++.so is a linker script. We link compiler-rt, but our build of compiler-rt is called libgcc, for compatibility. On OS X, the libraries that implement these things are also quite specific. On Windows, they’ll depend a lot on whether you’re using a Cygwin, MinGW, or VS toolchain.

David

> I’m not sure what problem you’re trying to solve here. These libraries tend to be tightly coupled to the platform, or possibly toolchain, not something that I’d want users to modify (unless they knew what they were doing enough to pass -nostdlib and provide the full set themselves).

The problem is that Compiler-RT doesn't include the low-level
implementation libraries, such as unwinders and C++ ABI, but libgcc
does via libgcc_s/eh. So, when using compiler-rt via --rtlib, that
functionality is missing, whereas it isn't if you use libgcc. This is
inconsistent, as the option claims to change the run-time library, but
doesn't change all of it, and users are left scratching their heads.

The way I "fixed" this in the past is to automatically add libgcc_*
with compiler-rt if on GNU systems. This is horrible and must be
killed with fire.

> On FreeBSD, we don’t explicitly link the C++ ABI library, because libc++.so is a linker script. We link compiler-rt, but our build of compiler-rt is called libgcc, for compatibility. On OS X, the libraries that implement these things are also quite specific. On Windows, they’ll depend a lot on whether you’re using a Cygwin, MinGW, or VS toolchain.

Exactly. This is why I'm changing each ToolChain to have its own
defaults, which the user can override, without needing to go all the
way down to -nostdlib. It's not a full solution to all problems, but
it's a convenient way to test (and later make the default) some of
the libraries that LLVM bundles together.

I imagine that for FreeBSD, even though you use libc++ and
compiler-rt, it will still be called libgcc*, so nothing changes for
you. The default is still GCC. But I don't think it's a good path for
everyone to keep symlinks, linker scripts and wrongly-named binaries
for all OSs.

The usage I expect to happen is:

1. Default behaviour, no rtlib no rtimpl. This would check the
environment, use ToolChain::GetDefaultRuntimeLibType and
ToolChain::GetDefaultRuntimeImplLibType to know what to include, and
do it. Most of the time, this will mean libgcc*.

2. Changing one, not the other. This would override the platform
default for either the rtlib or the rtimpl, while still using the
default for the other. You use this when you want to test/use only
compiler-rt, or only libunwind, but not the rest of what is the
default on your platform.

3. Change both. I need this to test RT + unw + c++abi to validate it
on buildbots. Alternative Clang-based front-ends could pick and choose
their defaults just by overriding
GetDefaultRuntimeLibType/GetDefaultRuntimeImplLibType.
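
The usage cases in the list above can be sketched roughly as follows. This is not the real clang::driver::ToolChain: the classes are stand-ins, GetDefaultRuntimeImplLibType is only the name proposed in this thread, and the string values ("gcc", "llvm") are placeholders picked for illustration.

```cpp
#include <string>

// Stand-in for a toolchain class; the defaults model a GNU platform.
struct ToolChainSketch {
  virtual ~ToolChainSketch() = default;
  virtual std::string GetDefaultRuntimeLibType() const { return "libgcc"; }
  virtual std::string GetDefaultRuntimeImplLibType() const { return "gcc"; }
};

// Hypothetical front-end that prefers the LLVM pieces by default.
struct LLVMToolChainSketch : ToolChainSketch {
  std::string GetDefaultRuntimeLibType() const override {
    return "compiler-rt";
  }
  std::string GetDefaultRuntimeImplLibType() const override { return "llvm"; }
};

struct Selection {
  std::string rtlib;
  std::string rtimpl;
};

// A null pointer means the flag was not given, so the toolchain default
// applies; a non-null pointer carries the user's override.
Selection resolve(const ToolChainSketch &tc, const std::string *rtlibFlag,
                  const std::string *rtimplFlag) {
  Selection s;
  s.rtlib = rtlibFlag ? *rtlibFlag : tc.GetDefaultRuntimeLibType();
  s.rtimpl = rtimplFlag ? *rtimplFlag : tc.GetDefaultRuntimeImplLibType();
  return s;
}
```

Passing neither flag gives case 1, passing one gives case 2, and passing both gives the last case; a derived toolchain changes the defaults without touching resolve().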

Advanced users who want to use -nostdlib still have the full
freedom to pick and choose whatever they want. But for the end user,
being able to change the runtime library and its implementation with
one or two high-level parameters is a nice thing.

cheers,
--renato

Minor clarification, but libgcc also doesn’t include the C++ ABI stuff - that’s in libsupc++, which is usually statically linked to libstdc++. I think FreeBSD 9 is the only system to ship a separate libsupc++ shared library, though some ship a libsupc++.a for static linking of C++ programs that don’t use the standard library.

The Compiler-RT repository does include an unwinder, so perhaps this should be built as libcompilerrt_eh.so for platforms that want to use -rtlib=compilerrt?

David

> Minor clarification, but libgcc also doesn’t include the C++ ABI stuff - that’s in libsupc++, which is usually statically linked to libstdc++. I think FreeBSD 9 is the only system to ship a separate libsupc++ shared library, though some ship a libsupc++.a for static linking of C++ programs that don’t use the standard library.

So, if I got it right:

           GCC         LLVM
SoftLib:   libgcc      compiler-rt
Unwind:    libgcc_s    libunwind
EH ABI:    libgcc_eh   libc++abi
C++ ABI:   libsupc++   libc++abi
C++ Lib:   libstdc++   libc++
Std LIBC:  glibc       --

I'm not seeing any mention of libsupc++ anywhere in Clang, which I
trust means it's not a division important enough to warrant worry.

LLVM's mode { libc++ / libc++abi / libunwind / compiler-rt } is
equivalent to GCC's { libstdc++ / libgcc* }. The only conflict that
could arise is when using libc++abi with libstdc++, but that depends
on how the functions are named. If memory serves me right, I had no
issue whatsoever every time I used it.

> The Compiler-RT repository does include an unwinder, so perhaps this should be built as libcompilerrt_eh.so for platforms that want to use -rtlib=compilerrt?

No, the unwinder is long gone to its own repository, because of a
cross-dependency with libc++abi.

cheers,
--renato

> LLVM's mode { libc++ / libc++abi / libunwind / compiler-rt } is
> equivalent to GCC's { libstdc++ / libgcc* }. The only conflict that
> could arise is when using libc++abi with libstdc++, but that depends
> on how the functions are named. If memory serves me right, I had no
> issue whatsoever every time I used it.

In the GNU world, libsupc++ is an implementation detail of libstdc++ for the most part. With libc++, it depends a lot on how it’s packaged - it will link to libsupc++, libc++abi, or libcxxrt either dynamically or statically. On FreeBSD, we ship libcxxrt.so.1 as a shared library (and libsupc++.so on FreeBSD 9) to make them pluggable, and make libc++.so a linker script that links both libc++.so.1 and libcxxrt.so.1. On other platforms, libc++.{so,dylib} is often a symlink to libc++.so.1 that statically links libsupc++ or libc++abi (or, in some cases, dynamically links libstdc++.so to get its embedded libsupc++).

Libc++abi, libsupc++ and libcxxrt all provide the same symbols, so you should not mix them in a single program, though it is safe to use libc++ and libstdc++ in different libraries (that don’t expose C++ standard library types across the library boundary) in the same program.
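
As a small illustration of the shared symbol set, the sketch below calls one Itanium C++ ABI entry point, abi::__cxa_current_exception_type from <cxxabi.h>, which is resolved by whichever C++ runtime library happens to be linked. The wrapper function name is made up for this example, and the sketch assumes an Itanium-ABI platform with GCC or Clang.

```cpp
#include <cxxabi.h>
#include <stdexcept>
#include <typeinfo>

// Throws a std::runtime_error, catches it with catch (...), and asks the
// C++ ABI runtime for the type of the exception currently being handled.
bool thrownTypeIsRuntimeError() {
  try {
    throw std::runtime_error("demo");
  } catch (...) {
    const std::type_info *ti = abi::__cxa_current_exception_type();
    return ti != nullptr && *ti == typeid(std::runtime_error);
  }
  return false; // Not reached; keeps every control path returning a value.
}
```

The program never names the runtime library itself, which is why two libraries defining this same symbol in one program would conflict.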

On some GNUish platforms, the unwinder is in libgcc_s.so or libgcc_eh.a (for static linking). Mixing generic unwind implementations is not supported by the Itanium ABI, so you need to pick one, typically the one that the platform defines. This also needs to be used for C code for __attribute__((cleanup)) to work (and for setjmp/longjmp to work on some platforms, though possibly only Itanium).
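
A rough sketch of the cleanup dependency, written in C++ only to keep the examples in one language (in C the same attribute relies on the code being built with -fexceptions). The function and variable names are invented for the example; the cleanup attribute itself is the GCC/Clang extension mentioned above.

```cpp
#include <stdexcept>

static int cleanupsRun = 0;

// Cleanup handler: receives a pointer to the variable leaving scope.
static void noteCleanup(int *) { ++cleanupsRun; }

// The exception propagates out of this frame, so the handler can only
// run if the generic unwinder visits the frame and runs its cleanups.
static void throwsThroughCleanup() {
  int guard __attribute__((cleanup(noteCleanup))) = 0;
  (void)guard;
  throw std::runtime_error("unwind me");
}

// Returns how many cleanup handlers ran while the exception unwound.
int countCleanupsDuringUnwind() {
  cleanupsRun = 0;
  try {
    throwsThroughCleanup();
  } catch (const std::runtime_error &) {
    // Swallow the exception; only the side effect matters here.
  }
  return cleanupsRun;
}
```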

I’m not sure what your distinction of Unwind vs EH ABI means. The Itanium ABI spec has two parts:

- The generic unwinder
- The language-specific EH personality functions (and throw / catch functions).

The generic unwinder is provided by libunwind (either the LLVM one or some derivative of the HP one), or by libgcc_s.so / libgcc_eh.a (depending on whether you're statically or dynamically linking). This library normally also provides the C personality function (libgcc_s and compiler-rt both do), which does very little other than run cleanups.
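
To make the "generic" part concrete, here is a minimal sketch that uses the language-independent _Unwind_Backtrace entry point from <unwind.h> to walk the stack. The helper names are invented; the sketch assumes a GCC or Clang toolchain whose default link line already provides an unwinder (libgcc_s or LLVM libunwind).

```cpp
#include <unwind.h>

// Callback invoked by the generic unwinder once per stack frame.
static _Unwind_Reason_Code countFrame(struct _Unwind_Context *, void *arg) {
  ++*static_cast<int *>(arg);
  return _URC_NO_REASON; // Keep walking to the next frame.
}

// Counts the frames the unwinder can see from this call site.
int countStackFrames() {
  int frames = 0;
  _Unwind_Backtrace(countFrame, &frames);
  return frames;
}
```

Nothing here involves C++ exceptions or a personality function, so the same code links against either unwinder implementation, but not against both at once.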

The C++-specific functionality is in the C++ runtime library (libsupc++, libc++abi, or libcxxrt). For dynamic linking, this is normally provided indirectly, via libc++ or libstdc++ (either by statically linking it into the standard library or providing a linker script). For static linking of programs that don’t need the C++ standard library, it’s usually linked explicitly (e.g. $(CC) $(OBJECTS) -lsupc++, instead of $(CXX) $(OBJECTS)).

The core issue seems to be that libcompiler-rt.so does not provide the generic unwinder, which libgcc_s.so does. This is why in FreeBSD our libgcc_s.so is currently a mix of compiler-rt and libgcc_* code, and we are currently looking at bringing in the LLVM libUnwind.

>> The Compiler-RT repository does include an unwinder, so perhaps this should be built as libcompilerrt_eh.so for platforms that want to use -rtlib=compilerrt?

> No, the unwinder is long gone to its own repository, because of a
> cross-dependency with libc++abi.

I thought the unwinder only used C++ features that didn’t depend on the C++ runtime library (no exceptions, no RTTI) and only depended on some header-only things from elsewhere?

David

(...)
> On some GNUish platforms, the unwinder is in libgcc_s.so or libgcc_eh.a (for static linking). Mixing generic unwind implementations is not supported by the Itanium ABI, so you need to pick one, typically the one that the platform defines. This also needs to be used for C code for __attribute__((cleanup)) to work (and for setjmp/longjmp to work on some platforms, though possibly only Itanium).

Great Scott! I got lost in the first paragraph... :(

> I’m not sure what your distinction of Unwind vs EH ABI means. The Itanium ABI spec has two parts:
>
> - The generic unwinder
> - The language-specific EH personality functions (and throw / catch functions).

I'm not an expert on this, but I thought libunwind had the generic
unwinder and libc++abi bundled the EH-specific parts, including the
personality routines, as well as the rest of the C++ ABI.

> I thought the unwinder only used C++ features that didn’t depend on the C++ runtime library (no exceptions, no RTTI) and only depended on some header-only things from elsewhere?

I'm lost... :(

So, the real bug here is what I introduced years ago: compiler-rt
should *not* automatically include libgcc_s nor _eh. This one I need
to revert, but I wanted to put something to replace it.

From the looks of it, it's going to be a lot more complicated than I
can possibly foresee. So, I think I should revert my change now
without a replacement. At least I can test libunwind and libc++abi
this way.

Fair enough?

--renato

That was my guess - these parts are not simple to disentangle.

David

>>> The Compiler-RT repository does include an unwinder, so perhaps this should be built as libcompilerrt_eh.so for platforms that want to use -rtlib=compilerrt?

>> No, the unwinder is long gone to its own repository, because of a
>> cross-dependency with libc++abi.

> I thought the unwinder only used C++ features that didn’t depend on the C++ runtime library (no exceptions, no RTTI) and only depended on some header-only things from elsewhere?

It does, but that wasn't the only layering violation. EHABI introduces a dependency in the wrong direction for some kinds of catch descriptors: https://github.com/llvm-mirror/libunwind/blob/master/src/Unwind-EHABI.cpp#L142

Jon

That looks commented out - presumably it could be made a weak symbol and only used if a C++ runtime is linked in?

David

>>> I thought the unwinder only used C++ features that didn’t depend on the C++ runtime library (no exceptions, no RTTI) and only depended on some header-only things from elsewhere?

>> It does, but that wasn't the only layering violation. EHABI introduces a dependency in the wrong direction for some kinds of catch descriptors: https://github.com/llvm-mirror/libunwind/blob/master/src/Unwind-EHABI.cpp#L142

> That looks commented out - presumably it could be made a weak symbol and only used if a C++ runtime is linked in?

It is commented out to break the backward dependency. To clarify, where I referred to EHABI above, I was speaking to the spec, not our implementation of it. We get away with not implementing all of it because Clang doesn't emit those kinds of descriptors.

A weak symbol would be a reasonable way to break the cycle if someone does want to implement it.
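
A minimal sketch of that weak-symbol approach, assuming an ELF platform with GCC or Clang. The hook name and its signature are invented purely for illustration; they are not the symbol the EHABI spec or libunwind actually uses.

```cpp
// Hypothetical hook that a C++ runtime could define. Declared weak, so
// its address is null when no library providing it has been linked in.
extern "C" bool hypothetical_cxx_type_match(const void *thrown,
                                            const void *catchType)
    __attribute__((weak));

// Unwinder-side helper: defer to the C++ runtime only if it is present,
// which removes the hard link-time dependency in the wrong direction.
bool catchDescriptorMatches(const void *thrown, const void *catchType) {
  if (hypothetical_cxx_type_match)
    return hypothetical_cxx_type_match(thrown, catchType);
  return false; // No C++ runtime linked: treat the descriptor as no match.
}
```

With no definition linked, the helper falls through to the conservative answer; linking a runtime that defines the hook changes the behaviour without rebuilding the unwinder.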

Jon