Current state of libcxx compatibility with libstdc++

[ This is probably a FAQ, however Google gives a lot of conflicting and outdated information… ]

We are trying to add support for compiling R packages with C++11 code on CentOS 6/7 using standard compilers from CentOS/EPEL repositories. Because gcc/libstdc++ on CentOS are too old (4.4.7), Tom Callaway has recently added libcxx and libcxxabi [1,2] to Fedora/EPEL. We can now build C++11 code natively on CentOS with clang:

clang++ -stdlib=libc++ -std=c++11 example.cpp

Awesome. However, for code which links to C++ system libraries (which are compiled with gcc) this sometimes results in linking errors. For example, take the test application geometry.cpp [3], which links against the "ImageMagick-c++-devel" package:

clang++ -stdlib=libc++ -std=c++11 geometry.cpp $(pkg-config --cflags --libs ImageMagick++)

This fails with:

geometry.cpp:(.text+0xd9): undefined reference to `Magick::Geometry::Geometry(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)'
geometry.cpp:(.text+0x2ed): undefined reference to `Magick::Geometry::operator std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >() const'

Is this expected? Is there any way we can work around this to link c++ programs compiled with libcxx against system libraries which are compiled with gcc? If not, might this be possible for future versions of libcxx or are the two simply incompatible on this level?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1332306
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1332307
[3] https://github.com/ImageMagick/ImageMagick/blob/master/Magick%2B%2B/tests/geometry.cpp

Is this expected?

Yes.

Is there any way we can work around this to link c++
programs compiled with libcxx against system libraries which are
compiled with gcc?

"compiled with gcc" isn't the issue. The issue is that you have one set of objects *built against libstdc++* and you're trying to link them against another set *built against libcxx*. This can only work if your interface between them does not contain things from the standard library.

Let me re-iterate: it is ok to link against both standard libraries (their symbols won't collide). What's not going to work is to use the headers from one library against the binaries from the other.
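To make the link failure concrete, here is a minimal sketch of why the two libraries' types never match. The namespaces lib_a and lib_b, the type string and the function has_spec are invented stand-ins; the only detail taken from the thread is that libc++ names carry an extra inline namespace (the std::__1 visible in the errors above).

```cpp
#include <type_traits>

// Hypothetical stand-in for a library that wraps its types in a versioned
// inline namespace, as the std::__1 in the linker errors suggests.
namespace lib_a {
inline namespace v1 {
struct string { const char* chars; };
}
}

// Hypothetical stand-in for a library that declares the same name directly.
namespace lib_b {
struct string { const char* chars; };
}

// A function "built against lib_b": this is the only overload that exists.
bool has_spec(const lib_b::string& s) {
    return s.chars != nullptr && s.chars[0] != '\0';
}

// Same spelling at the call site, same layout, but two unrelated types.
constexpr bool same_type = std::is_same<lib_a::string, lib_b::string>::value;
constexpr bool convertible = std::is_convertible<lib_a::string, lib_b::string>::value;
constexpr bool same_size = sizeof(lib_a::string) == sizeof(lib_b::string);
```

A caller that only knows lib_a::string asks for has_spec(const lib_a::v1::string&), which nobody defines. Within one translation unit that is a compile error; across a header/binary split it surfaces as an undefined reference like the ones quoted above.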

If not, might this be possible for future versions of
libcxx or are the two simply incompatible on this level?

The compatibility you're looking for here is never going to happen.

Jon

More specifically:

- It is not likely to be okay to mix libsupc++ (typically statically linked with libstdc++) and libc++abi in the same process. They declare the same symbols and stuff is likely to break in exciting ways due to the conflicts. We solved this in FreeBSD by linking both libstdc++ and libc++ against libcxxrt. Some Linux distros have solved it by linking libc++ to libstdc++.

- It is fine to mix libstdc++ and libc++ in the same program, but it is *not* fine to pass standard library objects across library boundaries between parts of the code that use libstdc++ and parts that use libc++. You cannot, for example, pass a std::string from a program using libc++ to a library using libstdc++. Most of these will result in link failures, but there are some cases that will just cause exciting run-time failures when you either bypass the type system, expect dynamic_cast (or any_cast) to work, or throw standard library objects other than std::exception in between different parts of the code.
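One common way to avoid the run-time failure cases is to keep exceptions and standard library objects on one side of the boundary and pass only plain data and status codes across it. The following is a sketch under that assumption; the function parse_width and its status codes are invented for illustration and are not part of any library mentioned here.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical boundary function: plain C types in, status code out.
// Returns 0 on success, 1 for bad input, 2 for overflow, 3 for anything else.
extern "C" int parse_width(const char* spec, int* out_width) noexcept {
    if (spec == nullptr || out_width == nullptr) return 1;
    try {
        // std::string and std::stoi are used only inside this side of the
        // boundary; neither the string nor any exception escapes.
        *out_width = std::stoi(std::string(spec));
        return 0;
    } catch (const std::invalid_argument&) {
        return 1;
    } catch (const std::out_of_range&) {
        return 2;
    } catch (...) {
        return 3;
    }
}
```

The caller may be built against a different standard library, because it only ever sees a const char*, an int* and an int.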

David

It is usually not even possible to pass Standard C++ objects between libraries that have been built with different versions of the "same" implementation of Standard C++ libraries, as it is not at all unusual for the library implementer to refactor the implementation of the Standard types, with a substantial amount of the implementation residing in the headers.

The compilers themselves generally go to considerable lengths to ensure that the same code compiled with different versions of the compiler is binary compatible, but the C++ Standard Library headers usually break object compatibility because the code is "not" the same code.

I haven't checked recently (v3.8 to v3.9), but this is usually the case from one release of CLang's own LibC++ to another.

  MartinO

The better option for using newer C++ standards on RHEL machines is the
newer versions of GCC that come with the Developer Toolset:

They are ABI compatible with the system version of GCC, and the binaries
they generate have the new features statically linked in, so they can
be distributed to machines without needing to install the Developer Toolset.

Binary (backwards) compatibility is most definitely a goal for libc++. Please file bugs if you find any breakage - any ABI-breaking changes should be behind #ifdefs so that downstream users can enable them on their own timescales. I believe libstdc++ also tries to provide these guarantees, though they did need some ABI-breaking changes for C++11 compatibility.
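As a rough illustration of what "behind #ifdefs" can look like, here is a sketch of a library that keeps its old layout by default and only switches to a new one when a macro is set. The macro MYLIB_ABI_VERSION, the namespace mylib and the type int_span are hypothetical and are not taken from libc++ or libstdc++.

```cpp
#include <cstddef>

// Hypothetical opt-in switch: downstream users define this to 2 when they
// are ready to rebuild everything against the new layout.
#ifndef MYLIB_ABI_VERSION
#define MYLIB_ABI_VERSION 1
#endif

namespace mylib {
#if MYLIB_ABI_VERSION >= 2
// New layout. The inline namespace changes the mangled names, so objects
// built against the old layout fail to link instead of misbehaving.
inline namespace abi2 {
struct int_span { int* first; std::size_t count; };
inline int_span make_span(int* data, std::size_t n) { return int_span{data, n}; }
inline std::size_t size(const int_span& s) { return s.count; }
}
#else
// Old layout, kept as the default so existing binaries stay compatible.
inline namespace abi1 {
struct int_span { int* first; int* last; };
inline int_span make_span(int* data, std::size_t n) { return int_span{data, data + n}; }
inline std::size_t size(const int_span& s) {
    return static_cast<std::size_t>(s.last - s.first);
}
}
#endif
}  // namespace mylib
```

Source code written against mylib::int_span compiles unchanged under either setting; only the binary representation and the symbol names differ.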

David

Thanks David, this is very interesting to me.

For many years I had given training on using C++ effectively, and sadly I
have had to teach my classes "not" to use Standard C++ types when passing
information between a function caller and the callee for libraries because
of versionability.

For instance, it seems to me (ideally) that we should be able to write
something like:

   extern std::list<int> foo(std::list<int>);

and provide this as a portable interface between a 3rd party library
(provided as a '.a', '.so', '.lib' or '.dll'), but I have learned through
painful experience that this is usually not the case. It may be "source"
compatible, but the real world requires "binary" compatible.

So while the compiler writers go to extreme measures to ensure that "source
code in" yields a binary compatible "object code out", the people who
maintain the C++ libraries do not generally ensure the same constraints.

Now this is not to say that Version N+1 is worse than Version N, because in
general the implementers ensure that Version N+1 is faster/smaller/better
than its predecessor; but unfortunately this often means that internal
implementation details such as the introduction of a new clever helper
template class, results in radically different "actual" implementations,
even though the "black box" appears to be the same from the perspective of
the compilee (invented term?).

Many times I have found that this breaks my natural interface expression
which I find really sad, because for many years I had tried to teach people
to use the Standard containers and other Standard abstractions, because they
are better tested, more performant, and generally better than crafting your
own. But the sad bit is that the "roll your own" is generally more
portable, and this means I have 2 solutions:

  o Use a C compatible interface for my libraries
  o Use my own "home cooked" alternatives for Standard containers

I don't like either solution. The first means that I cannot use C++ to
express natural version portable interfaces between my library
implementations and my library consumers, and after 30 years of advocating
the benefits of C++ I find this truly painful. The second means that
I have deliberately had to do exactly what I have been teaching is "wrong"
and introduce my own (well-intended) alternative implementations to the
Standard types that are possibly (and probably) wrong in some unintended
way.
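For what it is worth, the first option can be softened a little: the binary boundary stays C compatible, and a thin header-only wrapper, compiled by the consumer with the consumer's own standard library, restores the std::list interface. This is only a sketch; foo_c and its "double every element" behaviour are invented for illustration.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Library side (shipped as a binary): C ABI only, no standard library types
// in the signature. Writes at most cap results and returns how many.
extern "C" std::size_t foo_c(const int* in, std::size_t n,
                             int* out, std::size_t cap) {
    std::size_t written = 0;
    for (; written < n && written < cap; ++written) {
        out[written] = in[written] * 2;  // placeholder for the real work
    }
    return written;
}

// Consumer side (shipped as a header): compiled with whatever standard
// library the consumer uses, so std::list never crosses the binary boundary.
inline std::list<int> foo(const std::list<int>& in) {
    std::vector<int> src(in.begin(), in.end());  // flatten to contiguous ints
    std::vector<int> dst(src.size());
    const std::size_t n = foo_c(src.data(), src.size(), dst.data(), dst.size());
    return std::list<int>(dst.begin(),
                          dst.begin() + static_cast<std::ptrdiff_t>(n));
}
```

The cost is an extra copy in each direction, which is exactly the kind of overhead the natural interface was supposed to avoid.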

This has been a big problem for me in the past, and while I have not yet
checked LibC++ v3.8 to v3.9, the transition from v3.6 to v3.7 had a very big
impact, almost trebling the in-memory image for applications as simple as:

  #include <iostream>

  int main() {
    std::cout << "Hello World" << std::endl;
  }

And while the interface as expressed in the headers for the Standard C++
interface was 100% correct, the choices behind the scenes had terrible
consequences for space requirements, which for an embedded system is
critical. The changes from v3.7 to v3.8 had a profound negative impact on
legacy C code that was being compiled using the new set of C++ headers (e.g.
'<math.h>') so that I had to deliberately break from LibC++ in this regard
and implement a more truly "C compatible" variant. A very large amount of
embedded system code is trivially "ported" from C to C++ expecting that the
term "C compatible" means what it says on the can - but it doesn't.

When using VC++, I have found that almost every revision introduces altered
implementation strategies for the Standard C++ interfaces that break
binary compatibility between versions, and require that the library and its
consumer are built with the same version. In the Open Source world this is
not a truly breaking issue, but it is a big issue when 3rd parties are
making a living by providing libraries for others to use. And if you have
no access to the source code, you are bound to the compiler version that the
3rd party library provider used to build their libraries.

I am not picking on VC++, it is my favourite workday compiler, and it is the
compiler that I most often use (having a long legacy with that compiler).
But the same issues arise between distributions of GCC too, and indeed other
C++ implementations including CLang/LLVM/LibC++.

So this has gone off on a tangent from whether or not LibC++ should be
binary compatible with 'libstdc++' (sorry for hijacking that discussion) -
and for the record I don't think that they "should" be, and I don't think
that they "can" be - but it does raise the broader issue regarding whether
successive versions of a C++ library from the same vendor (LLVM/CLang in
this case) should be binary compatible. And this is a trade-off - either we
have 100% implementation hiding (i.e. a true black-box interface) which will
incur a runtime penalty (space and/or time) - or we have a compromise which
allows the implementer to provide more efficient (space/time)
implementations with successive revisions, but at the expense of binary
portability.
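The "true black-box" end of that trade-off is roughly what the pimpl idiom gives you. Below is a minimal sketch; the class Counter and its members are invented for illustration. The public class holds a single raw pointer, so its size and layout do not change when the hidden state does, at the price of a heap allocation and an indirection on every call.

```cpp
// Public header part: the only data member is an opaque pointer.
class Counter {
public:
    Counter();
    ~Counter();
    Counter(const Counter&) = delete;             // keep the sketch simple
    Counter& operator=(const Counter&) = delete;
    void add(int n);
    int total() const;
private:
    struct Impl;   // defined only inside the library
    Impl* impl_;   // raw pointer, so no standard library type in the layout
};

// Library-internal part: free to change between versions.
struct Counter::Impl {
    int total;
    int calls;     // added "later" without touching sizeof(Counter)
};

Counter::Counter() : impl_(new Impl{0, 0}) {}
Counter::~Counter() { delete impl_; }
void Counter::add(int n) { impl_->total += n; ++impl_->calls; }
int Counter::total() const { return impl_->total; }
```

Every member function has to be an out-of-line call into the library, which is where the run-time penalty comes from: nothing can be inlined into the consumer without exposing the layout again.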

There is no perfect answer.

  MartinO

You should look at this paper:

I haven't heard much activity on it since it was released, so maybe it was discussed and dropped.

Thanks Craig.

Yes, I had read this paper when Herb submitted it, and it really does say the same things; and that must have been more than 2 years ago.

Even earlier, others and I wrote the TR on embedded systems (Technical Report on C++ Performance [ISO/IEC TR 18015:2006(E)]) and that was 10 years ago!

None of these problems are new unfortunately, but they still are not solved.

  MartinO