RFC: Clang and libstdc++ versions, string.h and cstring conflicts, and glibc patches.

Hi all,

So I have this local Clang-related glibc patch that I've been trying
to push upstream, and it's opened a bit of a can of worms that I would
like some advice on....

tl;dr: If string.h in glibc 2.19 checks for __clang__ and breaks
compatibility with libstdc++ 4.3 if it's found, is that bad? What if
it checks for a recent version of Clang?

The context is straightforward: In C++, some of the functions in
string.h have different declarations than they do in C. You can see
this easily in libc++'s cstring, where there's a block of inline
functions that are guarded by an ifdef -- those are there to account
for C++ inclusions of string.h that are not C++-aware.
http://llvm.org/svn/llvm-project/libcxx/trunk/include/cstring
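
As a rough illustration of the difference, here is a minimal sketch using
strchr as the example. The `sketch` namespace and both function names are
hypothetical and are not the actual glibc or libc++ code: the C shape is a
single declaration that takes a const pointer and returns a mutable one,
while the C++ shape is a pair of overloads that preserve constness.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {  // hypothetical names, for illustration only

// C-style shape: one declaration that takes a const pointer but hands
// back a mutable one, silently discarding const.
inline char* c_style_find(const char* s, int c) {
    const char target = static_cast<char>(c);
    for (;; ++s) {
        if (*s == target) return const_cast<char*>(s);
        if (*s == '\0') return nullptr;
    }
}

// C++-style shape: a pair of overloads that preserve constness.
inline const char* cxx_style_find(const char* s, int c) {
    return c_style_find(s, c);
}
inline char* cxx_style_find(char* s, int c) {
    return c_style_find(s, c);
}

// The return types differ exactly where the two languages disagree.
static_assert(
    std::is_same<decltype(c_style_find("x", 'x')), char*>::value,
    "C shape drops const");
static_assert(
    std::is_same<decltype(cxx_style_find("x", 'x')), const char*>::value,
    "C++ shape keeps const");

inline void demo() {
    char buf[] = "hello";
    char* p = cxx_style_find(buf, 'l');  // mutable in, mutable out
    assert(p == buf + 2);
    *p = 'L';                            // allowed: buf is not const
    assert(cxx_style_find("hello", 'z') == nullptr);
}

}  // namespace sketch
```

If both shapes end up declared under the same name in one translation unit,
the const char* versions differ only in return type, which is why only one
of the two libraries can be allowed to provide them.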

(There's a closely-related issue with wcschr, which was a subject of
another recent email:
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-August/031618.html)

Where this gets complicated is in coordinating the C and C++ libraries
to make sure there is only one version of these definitions.
Originally, glibc's headers didn't define these, and so they were
defined in libstdc++ and libc++ headers. All is good, except for the
fact that glibc's headers contain wrong-for-C++ definitions instead.

In 2009, pre-GCC 4.4, a coordinated pair of patches went in to define
these in glibc and to avoid defining them in libstdc++ if they were
defined in glibc. But those aren't synchronized projects -- so in
glibc's string.h, these are only defined if the GCC version is 4.4 or
higher, and string.h defines special macros that the libstdc++ headers
use to determine what to do.

This, of course, broke libc++ when compiling with GCC 4.4, which led
to bug 7983; the solution to that was to simply not include the
definitions in cstring when using any version of glibc. (This breaks
libc++ with older glibc or GCC versions; I presume that was considered
irrelevant.)

And that brings us to the problem we have today: compiling C++ code
with Clang. Clang reports itself as having compatibility with GCC 4.2
-- and thus glibc's string.h assumes it's dealing with a version-4.2
libstdc++ and exports the old wrong-for-C++ C declarations instead of
the new C++ ones.
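
To make the failure concrete, here is a minimal sketch of a "GCC 4.4 or
later" test, written in the spirit of the usual packed major/minor
comparison. The helper name `gnuc_at_least` is hypothetical, and the version
pairs are passed in as plain integers rather than read from the compiler's
predefined macros.

```cpp
#include <cassert>

namespace sketch {  // hypothetical helper, for illustration only

// True when the reported (major, minor) version is at least
// (need_major, need_minor). Each pair is packed into one comparable integer.
constexpr bool gnuc_at_least(int major, int minor,
                             int need_major, int need_minor) {
    return ((major << 16) + minor) >= ((need_major << 16) + need_minor);
}

inline void demo() {
    // A compiler reporting itself as GCC 4.2 fails a ">= 4.4" test,
    // whatever its real capabilities are.
    assert(!gnuc_at_least(4, 2, 4, 4));
    // Real GCC 4.4 and later pass it.
    assert(gnuc_at_least(4, 4, 4, 4));
    assert(gnuc_at_least(4, 8, 4, 4));
}

}  // namespace sketch
```

On a GNU-compatible compiler the same helper could be fed the real values,
as in `sketch::gnuc_at_least(__GNUC__, __GNUC_MINOR__, 4, 4)`.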

This leads to some subtle flavors of sadness, and so we have a
Google-local patch that changes glibc's string.h check to check for
__clang__ as well as for GCC version of 4.4 or later. I proposed that
upstream:
http://sourceware.org/ml/libc-alpha/2013-08/msg00556.html

The glibc maintainers pointed out that this will cause problems if
someone is using Clang and glibc 2.19 or later but also using a
libstdc++ that's older than 4.4, or a libc++ before the fix for bug
7983 from late 2010.

So, question 1: Are these use-cases that actually need to be
supported, or are those sufficiently mismatched versions that we can
just say "don't do that"?

One of the glibc maintainers also pointed out that checking for
__clang__ is specific to one non-GCC compiler when there are many that
might be affected, and suggested using __cplusplus >= 199711L instead.
Because GCC and Clang had defined this with a non-standard value of 1
until recently (May 2012 for Clang), this would apply broadly but only
catch new versions.
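
A minimal sketch of how the two candidate guards would sort compilers,
assuming each one is OR-ed with the existing GCC 4.4 test. The `Reported`
struct, the function names, and the sample compilers are all hypothetical;
199711L is the C++98 value of __cplusplus.

```cpp
#include <cassert>

namespace sketch {  // hypothetical model of the guard conditions

// The predefined values a compiler reports, reduced to what the guards test.
struct Reported {
    int gnuc_major;
    int gnuc_minor;
    bool defines_clang;  // whether __clang__ is defined
    long cplusplus;      // value of __cplusplus
};

inline bool gcc_4_4_or_later(const Reported& r) {
    return r.gnuc_major > 4 || (r.gnuc_major == 4 && r.gnuc_minor >= 4);
}

// Guard in the style of the local patch: GCC >= 4.4, or any Clang.
inline bool guard_clang(const Reported& r) {
    return gcc_4_4_or_later(r) || r.defines_clang;
}

// Guard in the style suggested in review: GCC >= 4.4, or a standard
// value of __cplusplus.
inline bool guard_cplusplus(const Reported& r) {
    return gcc_4_4_or_later(r) || r.cplusplus >= 199711L;
}

inline void demo() {
    const Reported old_clang = {4, 2, true, 1L};        // __cplusplus == 1
    const Reported new_clang = {4, 2, true, 199711L};
    const Reported other     = {0, 0, false, 199711L};  // some other compiler

    assert(guard_clang(old_clang) && guard_clang(new_clang));
    assert(!guard_clang(other));          // the __clang__ test misses it
    assert(!guard_cplusplus(old_clang));  // old Clang keeps the C declarations
    assert(guard_cplusplus(new_clang) && guard_cplusplus(other));
}

}  // namespace sketch
```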

Question 2: Would it be objectionable to accept this solution? It
limits the version combinations that break for question 1 (in that you
now need a new glibc _and_ a new Clang along with the old libstdc++ to
get breakage), but it also means not-new versions of Clang don't get
the fix either.

I'd appreciate any advice and opinions on these. My inclination is to
reply to the glibc review with a patch that checks __cplusplus, and
call that "good enough".

- Brooks

> The glibc maintainers pointed out that this will cause problems if
> someone is using Clang and glibc 2.19 or later but also using a
> libstdc++ that's older than 4.4, or a libc++ before the fix for bug
> 7983 from late 2010.

Anyone still consuming a libstdc++ older than 4.4... those people get what
they deserve ;)

> So, question 1: Are these use-cases that actually need to be
> supported, or are those sufficiently mismatched versions that we can
> just say "don't do that"?
>
> One of the glibc maintainers also pointed out that checking for
> __clang__ is specific to one non-GCC compiler when there are many that
> might be affected, and suggested using __cplusplus >= 199711L instead.
> Because GCC and Clang had defined this with a non-standard value of 1
> until recently (May 2012 for Clang), this would apply broadly but only
> catch new versions.

Sorry for the peanut gallery comment, but this isn't exactly true...

Anyone using a clang based front-end will probably be forced to define __clang__. This isn't common, but there's at least 1 and possibly 2 or more non-llvm based compilers doing this that I'm aware of. In the cases that I'm aware of - those modified versions of clang are post-may-2012 though.

please don't depend on non-standard values though...

(PathScale compiler is recent modified clang + recent libc++ || STDCXX and while we could in theory play nice with the system libstdc++, we don't ever use it beyond some internal testing.)

> Question 2: Would it be objectionable to accept this solution? It
> limits the version combinations that break for question 1 (in that you
> now need a new glibc _and_ a new Clang along with the old libstdc++ to
> get breakage), but it also means not-new versions of Clang don't get
> the fix either.

Your solution and limitations seem reasonable to me (for whatever that's worth)

I think if someone *really* cared about that older version of clang they could - a) backport the fix or b) submit a patch after yours. Unlike GCC which has some caveats between releases - clang's evolution is quite "stable" and I don't know of any solid arguments which force people not to update.

> I'd appreciate any advice and opinions on these. My inclination is to
> reply to the glibc review with a patch that checks __cplusplus, and
> call that "good enough".

please no non-standard or standard breaking things (I prefer either option 1 or 2 below)

1) introduce yet another -D define
2) go with a __clang__ version check
3) if you explicitly want this to only apply to clang+llvm - then add a check for llvm's define as well

> The glibc maintainers pointed out that this will cause problems if
> someone is using Clang and glibc 2.19 or later but also using a
> libstdc++ that's older than 4.4, or a libc++ before the fix for bug
> 7983 from late 2010.
>
> So, question 1: Are these use-cases that actually need to be
> supported, or are those sufficiently mismatched versions that we can
> just say "don't do that"?

The case of an old libc++ doesn't matter; nobody was using libc++ in late
2010.

The case of an old libstdc++ is more interesting... we might care? In
practice, any of the solutions is probably fine: glibc gets upgraded along
with the system, which implies a newer gcc, which implies clang will find a
newer libstdc++.

> One of the glibc maintainers also pointed out that checking for
> __clang__ is specific to one non-GCC compiler when there are many that
> might be affected, and suggested using __cplusplus >= 199711L instead.
> Because GCC and Clang had defined this with a non-standard value of 1
> until recently (May 2012 for Clang), this would apply broadly but only
> catch new versions.
>
> Question 2: Would it be objectionable to accept this solution? It
> limits the version combinations that break for question 1 (in that you
> now need a new glibc _and_ a new Clang along with the old libstdc++ to
> get breakage), but it also means not-new versions of Clang don't get
> the fix either.

The "right" solution is really something like "#if _LIBCPP_VERSION ||
__GLIBCXX__ > 20070719" (replace 20070719 with the right date), but I don't
know if there's any reliable way to get the right definition of
__GLIBCXX__. (You should be able to get the definition of _LIBCPP_VERSION
without causing circular dependencies by including ciso646.)

Barring that, there isn't any solution that doesn't break the "new clang
with old libstdc++" case, which is the only case which might possibly
matter, so it's hard to say one solution is better than the other.

-Eli

[Replying to both replies in one email]

> Anyone using a clang based front-end will probably be forced to define
> __clang__. This isn't common, but there's at least 1 and possibly 2 or more
> non-llvm based compilers doing this that I'm aware of. In the cases that I'm
> aware of - those modified versions of clang are post-may-2012 though.

Thanks for the clarification!

> (PathScale compiler is recent modified clang + recent libc++ || STDCXX and
> while we could in theory play nice with the system libstdc++, we don't ever
> use it beyond some internal testing.)

Ok. As full disclosure, this may break STDCXX -- but if it does, the
fix is straightforward. Glibc's string.h defines a
__CORRECT_ISO_CPP_STRING_H_PROTO macro when it provides the correct
C++ prototypes, and STDCXX can check that to turn them off. As a
bonus, this will also fix STDCXX's equivalent of libc++ bug 7983 for
compilation with GCC 4.4 and later.
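
A minimal sketch of what such a check might look like on the C++ library
side. The namespace `sketch_cxxlib` and the flag
`libc_provides_cxx_overloads` are hypothetical, and this is not STDCXX's or
libc++'s actual code; only the macro name comes from glibc.

```cpp
#include <cassert>
#include <string.h>

namespace sketch_cxxlib {  // hypothetical C++ library namespace

#ifdef __CORRECT_ISO_CPP_STRING_H_PROTO
// The C library already declares the const-correct C++ overloads,
// so the C++ library only re-exports them.
using ::strchr;
const bool libc_provides_cxx_overloads = true;
#else
// Otherwise, supply the overloads on top of whatever string.h declares.
inline const char* strchr(const char* s, int c) { return ::strchr(s, c); }
inline char* strchr(char* s, int c) { return ::strchr(s, c); }
const bool libc_provides_cxx_overloads = false;
#endif

inline void demo() {
    char buf[] = "abc";
    char* hit = sketch_cxxlib::strchr(buf, 'b');  // mutable overload
    assert(hit == buf + 1);
}

}  // namespace sketch_cxxlib
```

Either branch gives callers the same pair of overloads; the macro only
decides which library is responsible for declaring them.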

> please no non-standard or standard breaking things (I prefer either option 1
> or 2 below)

I'm not sure I understand your argument that the __cplusplus proposal
is either non-standard or standard-breaking. Would you mind
explaining that reasoning?

The value of 199711L for __cplusplus is of course completely standard.
What checking for that would mean is that glibc's string.h would
export the correct C++-standard function definitions when it's
compiled with any compiler that has the correct C++-standard value for
__cplusplus -- which I'd say is precisely standard-conforming
behavior. (This then implies that it's relying on that compiler to
use a library with a cstring header that correctly deals with a
string.h that follows the standard.)

The old __cplusplus value of 1 also happens to have meaning in the C++
standard; the standard states that non-conforming implementations
should use values with 5 or fewer decimal digits. It seems to me that
it's reasonable for glibc to maintain the current (non-standard)
string.h definitions in that case.

> 2) go with a __clang__ version check

What version check would you recommend, if you would recommend
something other than just checking for whether __clang__ is defined?

> The "right" solution is really something like "#if _LIBCPP_VERSION ||
> __GLIBCXX__ > 20070719" (replace 20070719 with the right date), but I don't
> know if there's any reliable way to get the right definition of __GLIBCXX__.
> (You should be able to get the definition of _LIBCPP_VERSION without causing
> circular dependencies by including ciso646.)

Unfortunately, it's even messier than that -- there is no right date,
because the __GLIBCXX__ values are purely release dates, and (for
example) GCC 4.3.6 was released after 4.6.0. We'd need to check for
about a half-dozen different specific values in the overlap. (Also,
including ciso646 doesn't define __GLIBCXX__ in libstdc++, more's the
pity.)
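
A minimal sketch of why no single cutoff date works. The names are
hypothetical, and the two __GLIBCXX__ values are assumptions taken from the
release dates of the two versions named above, in YYYYMMDD form.

```cpp
#include <cassert>

namespace sketch {  // hypothetical names; date values are assumptions

// __GLIBCXX__ is a release date in YYYYMMDD form.
const long kGlibcxx_4_3_6 = 20110627L;  // old branch, released later
const long kGlibcxx_4_6_0 = 20110325L;  // new branch, released earlier

// A naive "new enough" test of the form __GLIBCXX__ > cutoff.
inline bool date_says_new_enough(long glibcxx, long cutoff) {
    return glibcxx > cutoff;
}

// True if some cutoff in [lo, hi] accepts 4.6.0 while rejecting 4.3.6.
inline bool some_cutoff_separates(long lo, long hi) {
    for (long cutoff = lo; cutoff <= hi; ++cutoff) {
        if (date_says_new_enough(kGlibcxx_4_6_0, cutoff) &&
            !date_says_new_enough(kGlibcxx_4_3_6, cutoff)) {
            return true;
        }
    }
    return false;
}

inline void demo() {
    // Any cutoff that lets 4.6.0 through also lets 4.3.6 through.
    assert(date_says_new_enough(kGlibcxx_4_6_0, 20110101L));
    assert(date_says_new_enough(kGlibcxx_4_3_6, 20110101L));
}

}  // namespace sketch
```

Scanning a range of cutoffs with `some_cutoff_separates` finds none that
separates the two, which is why the check would have to enumerate specific
values instead.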

Thanks for the feedback!
- Brooks