[RFC][libcxx] Fixing "-stdlib=libc++ -lc++abi" on Linux -- Exporting libc++abi through libc++.so.

Hi All,

As many of you know, libc++ doesn't build correctly on Linux by
default. In particular the resulting library cannot be used with
"-stdlib=libc++" alone. This forces the user to manually link the
correct ABI library, usually by adding '-lc++abi' whenever they link a
program or library.

We need to fix the default libc++ build configuration so that
libc++.so just works. This fix cannot be done in clang. libc++
supports using 3+ different ABI libraries and even more ways to link
them. It's not feasible for clang to know how libc++ was built and
trying to teach it would limit the flexibility libc++ already has.

I would like to propose a solution to fix the default build
configuration. For simplicity I will explain it using libc++abi as the
example but any supported library will work.

Solution: Libc++ should use a linker script for libc++.so as opposed
to a symlink.
(See http://reviews.llvm.org/D12508)

The linker script would contain "INPUT(libc++.so.1 -lc++abi)". This
instructs the linker to include libc++.so.1 as if it were this file
and add '-lc++abi' to the link command. Currently libc++.so is a
symlink to libc++.so.1.
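
As a rough illustration of the change described above, here is a minimal Python sketch that swaps the libc++.so symlink for a one-line linker script. It is not the code from the review; the helper name install_linker_script and the scratch-directory layout are assumptions made for the example, and the script text simply follows the INPUT(...) form given above with the library name libc++.so.1.

```python
import os
import tempfile


def install_linker_script(libdir, real_lib="libc++.so.1", abi_flag="-lc++abi"):
    """Replace the libc++.so symlink in libdir with a one-line linker script.

    Hypothetical helper for illustration only; not the code from the review.
    """
    link_path = os.path.join(libdir, "libc++.so")
    # Remove the existing symlink (or stale file) without touching its target.
    if os.path.lexists(link_path):
        os.remove(link_path)
    with open(link_path, "w") as f:
        f.write("INPUT(%s %s)\n" % (real_lib, abi_flag))
    return link_path


if __name__ == "__main__":
    # Work in a scratch directory so no real installation is modified.
    with tempfile.TemporaryDirectory() as libdir:
        # Empty stand-in for the real shared library.
        open(os.path.join(libdir, "libc++.so.1"), "wb").close()
        os.symlink("libc++.so.1", os.path.join(libdir, "libc++.so"))
        path = install_linker_script(libdir)
        print(open(path).read(), end="")
```

With a file like this in place, a plain "-stdlib=libc++" link that resolves -lc++ to libc++.so would pick up both the real library and the ABI library, which is the whole point of the proposal.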

The most important part of this approach is that it allows us to use a
shared libc++abi.
We want to avoid using a static library because libc++abi contains
symbols which have to be unique within a process. These symbols
include exception handling globals, RTTI support, and definitions for
standard library exceptions. By keeping these symbols out of libc++ we
can allow different libc++ versions and even different standard
library implementations to coexist peacefully in the same process as
long as they share the same libc++abi.

This approach is how FreeBSD ships its system libc++. OS X takes a
slightly different approach which also uses a shared ABI library.
It works by re-exporting libc++abi's symbols into libc++.dylib at
build time using the "-reexported_symbols_list" linker flag.

Questions:

1. What Linux linkers, if any, don't support linker scripts? Does lld?
2. Would installing libc++.so as a linker script cause problems with ldconfig?
3. Does LD provide any way to re-export symbol lists, comparable to
"-reexported_symbols_list" on OS X?

I would love any and all input on this plan. Please correct me if I've
gotten any details wrong.
If nobody objects I would like to quickly move forward with this.

/Eric

There cannot be a usable Linux linker which does not at least have minimal
support for linker scripts, since libc.so is a linker script. But features
beyond those used there might be less universally supported.

$ cat /usr/lib/x86_64-linux-gnu/libc.so

/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib/x86_64-linux-gnu/libc.so.6
/usr/lib/x86_64-linux-gnu/libc_nonshared.a AS_NEEDED (
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 ) )
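
Since a linker script is plain text while a real shared object starts with the ELF magic number, the two are easy to tell apart programmatically. The following is a small Python sketch of that check; the helper name is_elf and the stand-in files are assumptions for the example, and it says nothing about how ldconfig or any particular linker actually decides.

```python
import os
import tempfile

# Every ELF file begins with these four bytes.
ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """Return True if the file starts with the 4-byte ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # A text linker script, shaped like the libc.so shown above.
        script = os.path.join(d, "libc.so")
        with open(script, "w") as f:
            f.write("/* GNU ld script */\nGROUP ( libc.so.6 )\n")
        # A stand-in with only an ELF header prefix; not a loadable library.
        fake_elf = os.path.join(d, "libc.so.6")
        with open(fake_elf, "wb") as f:
            f.write(ELF_MAGIC + b"\x02\x01\x01")
        print(is_elf(script), is_elf(fake_elf))
```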

The most important part of this approach is that it allows us to use a
shared libc++abi.
We want to avoid using a static library because libc++abi contains
symbols which have to be unique within a process. These symbols
include exception handling globals, RTTI support, and definitions for
standard library exceptions. By keeping these symbols out of libc++ we
can allow different libc++ versions and even different standard
library implementations to coexist peacefully in the same process as
long as they share the same libc++abi.

I don't believe I understand why libc++abi cannot be statically linked into
libc++. (As an aside, this is how libstdc++ gets away with it).

Statically linking the ABI into the c++ library would ensure that the same
ABI is always used, it would be unique across the process still, and would
not impede using alternative ABI providers. Obviously, this comes at the
slight cost of increasing the size of libc++, though both libraries would
need to be loaded always. The other "downside" is that you cannot update
the ABI provider without rebuilding all of libc++. However, this is a
feature as well: you are certain that the ABI provider doesn't change (yes,
libc++abi is pretty stable, but hypothetically, we may fix UB -- like the
one with alignment of allocations done in libc++abi).

Doing this also has the added benefit of being much less Linux-centric and
would allow us to use the same mechanism even on Windows with link.exe
(although ld.bfd permits the use of linker scripts on Windows).

I think that if the same ABI provider were used across multiple versions,
they could continue to coexist even if statically linked.

This is how FreeBSD ships libc++ (except that we use libcxxrt, not libc++abi), but it will cause problems on Linux unless you *also* change the way that libstdc++ is shipped.

Libstdc++ statically links libsupc++ (the GNU equivalent of libcxxrt / libc++abi) and this has an impact on symbol versioning. If something links to both libstdc++ with libsupc++ and libc++ with libc++abi then the ABI symbols will be distinct. Throwing an exception in code using one and catching it in the other will break in difficult-to-diagnose ways.

We addressed this by linking libstdc++ as a filter library against libsupc++ then against libcxxrt. Libraries linking libstdc++ see the ABI symbols as having come from libstdc++, when in reality they are provided by libcxxrt. This allows interoperability.

If you can not modify the linkage of libstdc++ on your target platform, then the best solution is to use libstdc++ itself as the ABI library.

David

Hi Saleem,

Thanks for the response.

Statically linking the ABI into the c++ library would ensure that the same ABI is always used, it would be unique across the process still, and would not impede using alternative ABI providers.

I was originally under the impression that this would prevent multiple libc++ versions from existing in the same executable because they would each contain a different copy of the libc++abi symbols.
Am I incorrect in this assertion?

I also don’t see how using a linker script would impede using alternative ABI providers. The script could be tailored to support any of them.

The other “downside” is that you cannot update the ABI provider without rebuilding all of libc++. However, this is a feature as well: you are certain that the ABI provider doesn’t change (yes, libc++abi is pretty stable, but hypothetically, we may fix UB – like the one with alignment of allocations done in libc++abi).

This is a major downside for libc++abi developers if we make this the default behavior. I also don’t think people expect perfect ABI stability when building a ToT release of libc++. I agree with both your points but I don’t think they are reasons to use this configuration as the default configuration.

/Eric

Hi David,

Thanks for taking the time to respond.

If you can not modify the linkage of libstdc++ on your target platform, then the best solution is to use libstdc++ itself as the ABI library.

Obviously I have no power to modify the linkage of libstdc++ in general. However I don’t think using libstdc++ as the ABI library is the best default for libc++.
I understand it is needed in order to support inter-operability but for obvious reasons I think libc++abi is the correct default to use out of the box.

However the idea of using a linker script as libc++.so will also fix the case where libstdc++.so is actually used.
When the user builds against libstdc++ the linker script will contain “INPUT(libc++.so.1 -lstdc++)”.
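
To make the per-ABI-library tailoring concrete, here is a small Python sketch that picks the script contents from the configured ABI library. The table keys and the function name linker_script_contents are hypothetical labels for this example, not libc++ build options; only the two link flags come from this thread.

```python
# Link flags per ABI library. The keys are hypothetical labels for this
# sketch; the flags are the ones mentioned in this thread.
ABI_LINK_FLAGS = {
    "libcxxabi": ["-lc++abi"],
    "libstdc++": ["-lstdc++"],
}


def linker_script_contents(abi_library, soname="libc++.so.1"):
    """Return the INPUT(...) line for libc++.so given the ABI library in use."""
    flags = ABI_LINK_FLAGS[abi_library]  # KeyError for an unknown ABI library
    return "INPUT(%s %s)" % (soname, " ".join(flags))


if __name__ == "__main__":
    for abi in ("libcxxabi", "libstdc++"):
        print(abi, "->", linker_script_contents(abi))
```

Other ABI libraries would need their own entries; their flags are deliberately left out here rather than guessed.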

/Eric

If you can not modify the linkage of libstdc++ on your target platform, then the best solution is to use libstdc++ itself as the ABI library.

Do you mean only libsupc++ or the whole libstdc++?

Back to Eric's proposal: the clang driver once had similar logic for
compiler-rt, which I added and later removed given the solutions and
complications I found.

For compiler-rt on GNU systems, I added libgcc_s or libgcc_eh
automatically, so one could use "--rtlib=compiler-rt" and get
everything working out of the box. But that breaks when you want
to use libunwind, in the exact same way Eric's proposal will break if
it includes either libsupc++ or libc++abi.

Not controlling the base architecture is a problem, given that all
your dynamic objects already link and use whatever the platform gives
you. This takes us back to the drawing board.

What do we want from our high level libraries?

* Independent, good quality libraries? In which case testing
compatibility with other base libraries is unnecessary and all we test
are statically linked (or controlled dynamically linked) environments.

* Inter-operational, sometimes redundant libraries? In which case we
assume the target always has libgcc / libsupc++ and only ever replace
compiler-rt and libc++, not the other two.

I believe both solutions have huge problems, but I can't see the few
of us actually supporting *well* both use cases at the same time.

Ideas?

cheers,
--renato

Hi Renato,

Do you mean only libsupc++ or the whole libstdc++?

I think David meant that by default, libsupc++ is only provided by way of libstdc++.so and we can’t control that. For that reason it’s best to link against the whole libstdc++.so so we can get libsupc++.

Eric’s proposal will break if it includes either libsupc++ or libc++abi.

I’m a bit confused. My proposal can deal with both libsupc++ and libc++abi. Could you elaborate on the issues you see?

I think David meant that by default, libsupc++ is only provided by way of
libstdc++.so and we can't control that. For that reason it's best to link
against the whole libstdc++.so so we can get libsupc++.

Right. No breakages, but a lot of unused duplicated code.

I'm a bit confused. My proposal can deal with both libsupc++ and libc++abi.
Could you elaborate on the issues you see?

No issues. I assumed the wrong thing.

I need to police myself to use questions instead of statements when
I'm asking something. :)

cheers,
--renato

I’m going to move forward with this plan tonight assuming nobody objects. I have spoken to Saleem privately and he also agrees this is a good choice.

Thanks for all of the input.

/Eric

As of r250469 libc++ now installs a linker script by default. I'm off to
watch the bots in case I got any of the CMake wrong.

/Eric

Thanks Eric!