Keyword warnings in libc++'s type_traits and other headers

Hi,

I ran into a situation where a C++ program was compiled with -Wsystem-headers. When I did this with clang 3.4 or trunk, I got the following keyword warnings:

In file included from include/algorithm:624:
include/type_traits:288:29: warning: keyword '__is_void' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
template <class _Tp> struct __is_void : public false_type {};
                           ^
include/type_traits:309:29: warning: keyword '__is_integral' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
template <class _Tp> struct __is_integral : public false_type {};
                           ^
include/type_traits:333:29: warning: keyword '__is_floating_point' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
template <class _Tp> struct __is_floating_point : public false_type {};
                           ^
include/type_traits:352:29: warning: keyword '__is_pointer' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
template <class _Tp> struct __is_pointer : public false_type {};
                           ^
include/type_traits:432:8: warning: keyword '__is_function' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
struct __is_function
      ^
include/type_traits:442:40: warning: keyword '__is_member_function_pointer' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
template <class _Tp> struct __is_member_function_pointer : public false_type {};
                                      ^
include/type_traits:450:40: warning: keyword '__is_member_pointer' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
template <class _Tp> struct __is_member_pointer : public false_type {};
                                      ^
include/type_traits:653:8: warning: keyword '__is_signed' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
struct __is_signed : public ___is_signed<_Tp> {};
      ^
include/type_traits:668:8: warning: keyword '__is_unsigned' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
struct __is_unsigned : public ___is_unsigned<_Tp> {};
      ^
9 warnings generated.

This seems to have been introduced with r196212 in clang by Alp Toker, but it is unfortunate the warning hits libc++. :slight_smile: The cause is a bunch of Embarcadero keywords defined in clang's lib/Parse/ParseExpr.cpp, which are exactly the same as these libc++-internal identifers.

Is the attached patch acceptable as a workaround?

-Dimitry

avoid-keywords-1.diff (8.37 KB)

Hi,

I ran into a situation where a C++ program was compiled with -Wsystem-headers. When I did this with clang 3.4 or trunk, I got the following keyword warnings:

[snipped]

include/type_traits:668:8: warning: keyword '__is_unsigned' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
struct __is_unsigned : public ___is_unsigned<_Tp> {};
       ^
9 warnings generated.

This seems to have been introduced with r196212 in clang by Alp Toker, but it is unfortunate the warning hits libc++. :slight_smile: The cause is a bunch of Embarcadero keywords defined in clang's lib/Parse/ParseExpr.cpp, which are exactly the same as these libc++-internal identifers.

Is the attached patch acceptable as a workaround?

Hi Dimitry,

Thanks for looking into this. Your analysis of the situation is correct and the patch for libc++ is headed in the right direction.

Beyond names we raise warnings for right now, all names starting __is_* and to a lesser extent __has_* are best considered reserved prefixes for maximum compatibility going forward.

This is suggested out of practicality given that new type trait intrinsics are proposed all the time and will be handled as straight keywords, not just by clang but all mainstream compilers.

The existing keywords were only accepted as type names by clang due to an old compatibility hack to support GNU stdc++ headers. It's invasive to the parser and changes the way tokens are handled so we're warning on this starting 3.4 to get system headers fixed. New code should not be relying on the fallback.

The quad-underscore prefix in your patch is a matter of taste. Another option may be to use a different naming convention or just a single underscore, putting the structs in a private namespace so they aren't visible to user code.

Whichever approach you take, I recommend clearing all of the __is_* and __has_* occurrences, not just ones that are keywords today, in order to avoid a new conflict each time an intrinsic is added to the compiler or a different language standard is selected. Keyword names are predictable and highly likely to follow the existing convention.

I further recommend that libc++ switches on system header warnings in the development build. We're adding new diagnostics in clang intended to assist standard library developers and it's a shame to suppress them during the LLVM development and testing cycle.

Alp.

I am strongly against this patch on several levels.

  1. I would like to understand why (and how we came to) have two different sets of type traits implementations with the same name; one in libc++ and the other in the Embarcadero version of clang. Or maybe it’s not just in the Embarcadero version - though they are listed in TokenKinds.def as "Embarcadero Type Traits”

2). I would like to have a discussion about the desirability of having these in the compiler; clearly most of these type traits are implementable purely as a library.

  1. I hate the idea of using “quad underscores” to differentiate names; I spent two hours a couple months ago trying to figure out why my code was failing, only to determine that I was calling __xxx instead of ___xxx. (triple underscore vs. double) If we decide to change libc++ for this, I would recommend renaming __is_void to something like __libcpp_is_void - something visually different.

— Marshall

Hi,

I ran into a situation where a C++ program was compiled with -Wsystem-headers. When I did this with clang 3.4 or trunk, I got the following keyword warnings:

[snipped]

include/type_traits:668:8: warning: keyword '__is_unsigned' will be made available as an identifier for the remainder of the translation unit [-Wkeyword-compat]
struct __is_unsigned : public ___is_unsigned<_Tp> {};
      ^
9 warnings generated.

This seems to have been introduced with r196212 in clang by Alp Toker, but it is unfortunate the warning hits libc++. :slight_smile: The cause is a bunch of Embarcadero keywords defined in clang's lib/Parse/ParseExpr.cpp, which are exactly the same as these libc++-internal identifers.

Is the attached patch acceptable as a workaround?

Hi Dimitry,

Thanks for looking into this. Your analysis of the situation is correct and the patch for libc++ is headed in the right direction.

I am strongly against this patch on several levels.

1) I would like to understand why (and how we came to) have two different sets of type traits implementations with the same name; one in libc++ and the other in the Embarcadero version of clang. Or maybe it’s not just in the Embarcadero version - though they are listed in TokenKinds.def as "Embarcadero Type Traits”

I suspect those trait intrinsics were added to support another C++ standard library, the same way certain intrinsics have been added to clang to support libc++.

clang is used by many vendors and I'm aware of various commercial and free C++ standard libraries we support, both in C++ compliance mode and in compatibility mode.

Thankfully most of those vendors are active or responsive contributors to LLVM, or as in the case of MSVC there's a strong ongoing push to achieve drop-in compatibility from the core clang team.

None of the trait intrinsics are inherently better or worse than others, and all are reserved as an implementation detail intended to help clang provide full-spectrum C++ support.

2). I would like to have a discussion about the *desirability* of having these in the compiler; clearly most of these type traits are implementable purely as a library.

It's a perfectly legitimate topic to discuss the merits of library vs. intrinsic implementations but that conversation has to be between you and the vendors of those other C++ standard library developers.

If you reach a consensus and they ship your proposed changes, we can begin a cycle of deprecation until the intrinsics in question are no longer in use.

As a maintainer of this code I can state that the intrinsics are not a burden at present and have low cost to keep in tree, but if the time comes some years down the line that they are no longer in use we'll be sure to remove them.

Keep in mind however that some of these type traits are not implementable in library form, which may have been the reason intrinsics were selected in the first place for any particular C++ type trait.

Furthermore, even when a library version is possible, a thin header structure wrapping a quality compiler intrinsic can yield significantly efficiency gains. Consider that clang will be able to evaluate certain type traits instantly without requiring deserialization of AST nodes from pre-compiled headers or modules, sparing costly operations to materialize long chains of dependent declarations.

It's my hope that we can coordinate with libc++ on ground-breaking features like the one I described instead of nit-picking over reserved namespaces.

3) I *hate* the idea of using “quad underscores” to differentiate names; I spent *two hours* a couple months ago trying to figure out why my code was failing, only to determine that I was calling __xxx instead of ___xxx. (triple underscore vs. double) If we decide to change libc++ for this, I would recommend renaming __is_void to something like __libcpp_is_void - something visually different.

Underscores are a matter of style and it's your baby. Full credit to Dimitry for proposing the initial fix and I'm sure he'll be responsive to your review comments.

I agree that quad-underscores are less than ideal. Something like __libcpp_* is easier on the eye and will help you avoid conflicts with any compiler's intrinsics.

Keep in mind that where clang warns today, any other compiler already issues a hard error so we're making the effort to provide due notice. That said, we're starting to look to remove GNU header compatibility hacks from the parser and it was never the plan that libc++ would begin to use them.

tl;dr: There's no linkage or visibility so no negative impact in doing a search-and-replace to make your headers parsable without hacks?

Alp.

I’m still confused.
I tried the following tests using today’s clang (and libc++):

int main () { const int i = __is_void::value; }
fails to compile.

int main () { const int i = std::__is_void::value; }
fails to compile.

#include <type_traits>
int main () { const int i = __is_void::value; }
fails to compile.

#include <type_traits>
int main () { const int i = std::__is_void::value; }
compiles successfully.

All were compiled with "-std=c++11 -stdlib=libc++”
If I add -Wsystem-headers to the final one, I get warnings like what Dimitry reported.

— Marshall

Two further questions:

1. Do these keywords come with __has_feature tests (like __is_class does)? If not, they should.

2. If they do come with __has_feature tests, how about libc++ doing the same thing it does for is_class et al.?

#if !__has_feature(is_void)

template <class _Tp> struct __is_void : public false_type {};
template <> struct __is_void<void> : public true_type {};

#endif // !__has_feature(is_void)

template <class _Tp> struct _LIBCPP_TYPE_VIS_ONLY is_void
    : public __is_void<typename remove_cv<_Tp>::type> {};

That way we don't need major surgery. We avoid the conflict if it exists. And we take advantage of the compiler intrinsic if it exists.

Is there a conflict that can not be handled in this way?

Howard

Hi Marshall, I’ll step you through each of the cases you posted. In your first example: __is_void is handled as an intrinsic keyword which the parser expects to be written as a functional call. Try the following to get it to compile: In your second example: std:: is parsed as the beginning of a nested name specifier. When the __is_void keyword is encountered, the scope specifier is no longer valid and the error diagnostic is emitted. The error message in your third example is changed after inclusion of headers because <type_traits> header activated the fallback mode and downgraded __is_void to an identifier. You can get it to compile as in your final example if you prefix __is_void with std:: In the last two examples you’ve activated the GNU header compatibility mode by attempting to introduce keywords as a type name in system headers. The appropriate extension warning will have been issued: There are any number of _is* compiler built-ins in clang and other compilers, and there will be more added in the new year so it’s a good plan to move away from using these intrinsic names as type names while we still have the keyword hack in the parser. Alp.

This sounds good to me, but I think it’ll be a bit more involved than that:

#if !__has_feature(is_void)

template struct __is_void : public false_type {};
template <> struct __is_void : public true_type {};

template struct _LIBCPP_TYPE_VIS_ONLY is_void
: public __is_void<typename remove_cv<_Tp>::type> {};

#else

template struct _LIBCPP_TYPE_VIS_ONLY is_void

: public integral_constant<bool, __is_void(typename remove_cv<_Tp>::type)>;

#endif // !__has_feature(is_void)

— Marshall

Hi Marshall, A feature definition was never added for these trait intrinsics unfortunately. I’ll look into it. This would still conflict with the keyword in current clang versions and other compilers that don’t have a quality __has_feature() implementation. To reiterate, you’ll do well to avoid the _is* prefix entirely regardless of whether you can feature-test for individual availability of the keywords. I’ll try to illustrate this in a different way, taking clang out of the picture for a minute. None of these structures extracted from libc++ will compile with MSVC 2013: It’s a fact of life that there many compiler-defined keywords littering the _is* namespace and most of them are undocumented. Trying to use them as identifiers or type names is a parse error. You /could/ try to continue using the _is* names that are known not to conflict today, but it’s a quixotic solution that’ll break every time a new compiler point-release adds new built-in trait primitives. This part will work. I’d actually recommend keeping the production-quality library implementations in libc++ instead of switching wholesale to the clang intrinsics. Many of our intrinsics are there for compatibility and haven’t been fully tested in a production environment. If you just s/_is/_libcpp_is/ to avoid the keywords you’ll gain compatibility with other modern compilers, silence the extension warnings and help us set a schedule for removing the GNU compatibility hack in the parser. After that we can start to look at offering optimized type traits by coordinating more closely, but it’s not the immediate issue at hand. Alp.

Hi Alp. libc++ was using __is_void first. clang needs to avoid stepping on existing identifiers too. Get a __has_feature going and we're happy to meet you half way. Neither clang nor libc++ has full ownership of the __namespace. We have to share it. It would be just as silly for us to be telling you to avoid this space because we're likely to step on you in the future.

Howard

And in the future please coordinate and test with libc++ prior to introducing breaking changes. We need to work together. Not reacting in this uncoordinated fashion.

Howard

This would still conflict with the keyword in current clang versions and other compilers that don't have a quality __has_feature() implementation.

To reiterate, you'll do well to avoid the __is_* prefix entirely regardless of whether you can feature-test for individual availability of the keywords.

Hi Alp. libc++ was using __is_void first. clang needs to avoid stepping on existing identifiers too. Get a __has_feature going and we're happy to meet you half way. Neither clang nor libc++ has full ownership of the __namespace. We have to share it. It would be just as silly for us to be telling you to avoid this space because we're likely to step on you in the future.

And in the future please coordinate and test with libc++ prior to introducing breaking changes. We need to work together. Not reacting in this uncoordinated fashion.

Hi Howard

What's the breaking change you're seeing?

Alp.

If there is no breakage I see no reason for libc++ to change.

Howard

How can we coordinate and test with you prior to introducing changes if you only see reason for libc++ to change after the breakage happens? Like you say, we need to work together.

This is a heads-up that your code may be relying on a non-standard clang C++ language extension which is a strict error in other compilers.

If you're asking others to to maintain that old parser workaround a little longer I think it's fair that you help set the schedule for its removal.

Alternatively, if you think a formal disambiguation rule might be worth investigating to place compiler type trait primitives in a separate namespace, we can work together to propose that too.

The one thing that doesn't make sense is to wait and do nothing until any proposed change goes into effect a year or two down the line.

Alp.

If there is no breakage I see no reason for libc++ to change.

The issue is partly that the C++ standard (17.6.4.3.2 [global.names]) says:

Certain sets of names and function signatures are always reserved to the implementation:
  — Each name that contains a double underscore _ _ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.

If you see libc++ as part of "the implementation" (i.e., there is one single LLVM system that libc++ is part of), then your use of double underscores is absolutely fine (at least technically, if not practically).

If, however, you want libc++ to be usable with other compilers (e.g., GCC), the double underscores don't really belong to you. A conforming C++ compiler could reject all identifiers that begin with double underscores in code that isn't part of the compiler's own implementation. (Of course, arguably a conforming C++ compiler might not let you swap in a new standard library at all.)

But in practical terms, there is a bigger problem -- across several system-level projects, there seems to be a belief in a kind of double underscore magical thinking that goes like this: “Worried about a name clash? Here's the fix! Prefix your identifier with a double underscore and it'll be all good!!”

Various libraries do this. Glibc thinks it's a system library and has the right to all names starting with double underscores. Hence problems stemming from their using identifiers reserved by non-GCC compilers, like __block which got taken by the blocks extension from Objective-C. And it's not just Glibc, other libraries like cygwin that feel "system"ish do it too.

If you want to reduce the risk, instead of using __x and __y, you should instead use _llvm_libcxx_x and _llvm_libcxx_y. It may feel like this is a mouthful, but it's much less risky.

__You __may __think __that __such __a __change __would __detract __from __readability, __but __that __is __probably __because __you __have __become __used __to __all __the __noise __of __double __underscores __and __stopped __noticing __it. __To __other __people, __it __is __already __pretty __noisy.

Given all the tooling the LLVM project has now, it would be simple enough to have a source-to-source transformation on the libc++ headers, with their being written without underscores (or maybe even leaving them unchanged) but transforming them into a form with a consistent prefix that you can feel like you have better ownership rights over.

But probably even if you make the change, the complexities in Clang's lexer/parser will have to remain. Even if you fix libc++, there will still be plenty of other libraries out there applying double-underscore magical thinking.

    M.E.O.

Hi Howard,

The change that actually 'broke' this was r130342, which was 2 years and 7 months ago, and introduced the Embarcadero __is_xxx keywords. From that point, those keywords were special (unfortunately, I might say, but I don't think much can be done about them). It was r196212 by Alp which implemented the new warning, which is apparently on by default.

That said, I think it is just a fact of life some compiler vendors will continue choosing their special keywords so they clash with internal identifiers; it is best to just attempt to work around those clashes. :slight_smile:

-Dimitry

The change that introduced the __is_void, etc into libc++ was r103490, which was 3 years and 7 months ago.
(if you want to play silly “this happened a long time ago” games)

— Marshall

This would still conflict with the keyword in current clang versions and other compilers that don't have a quality __has_feature() implementation.

To reiterate, you'll do well to avoid the __is_* prefix entirely regardless of whether you can feature-test for individual availability of the keywords.

Hi Alp. libc++ was using __is_void first. clang needs to avoid stepping on existing identifiers too. Get a __has_feature going and we're happy to meet you half way. Neither clang nor libc++ has full ownership of the __namespace. We have to share it. It would be just as silly for us to be telling you to avoid this space because we're likely to step on you in the future.

And in the future please coordinate and test with libc++ prior to introducing breaking changes. We need to work together. Not reacting in this uncoordinated fashion.

Hi Howard,

The change that actually 'broke' this was r130342, which was 2 years and 7 months ago, and introduced the Embarcadero __is_xxx keywords.

The change that introduced the __is_void, etc into libc++ was r103490, which was 3 years and 7 months ago.
(if you want to play silly “this happened a long time ago” games)

Marshall, Howard and Dimitry,

Let's all take a step back.

Regardless of decisions that were made some time ago the fact is we control the standard library and the compiler today so there's no mistake we're stuck with and nothing we can't fix together if we set our minds to it :slight_smile:

Considering libc++ to be part of the implementation is one way to look at it there will be no unilateral removal of the hack until we find a way forward. I will however nudge to get a resolution because we don't want to keep such hacks in the parser indefinitely lest third party code begins to rely on broken behaviour.

Please be mindful that that compiler has very little choice as to which names are used for type trait primitives. Double underscores are used in order to stay out of the way and so far we've only ever used names which that are already recognised as keywords in other compilers.

libc++ on the other has the freedom to use any identifier in the global namespace, or indeed to use a C++ private namespace where any name goes, potentially letting you use short private names without any prefix or underscores at all.

Alp.

If there is no breakage I see no reason for libc++ to change.

The issue is partly that the C++ standard (17.6.4.3.2 [global.names]) says:

Certain sets of names and function signatures are always reserved to the implementation:
  — Each name that contains a double underscore _ _ or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.

If you see libc++ as part of "the implementation" (i.e., there is one single LLVM system that libc++ is part of), then your use of double underscores is absolutely fine (at least technically, if not practically).

If, however, you want libc++ to be usable with other compilers (e.g., GCC), the double underscores don't really belong to you. A conforming C++ compiler could reject all identifiers that begin with double underscores in code that isn't part of the compiler's own implementation. (Of course, arguably a conforming C++ compiler might not let you swap in a new standard library at all.)

But in practical terms, there is a bigger problem -- across several system-level projects, there seems to be a belief in a kind of double underscore magical thinking that goes like this: “Worried about a name clash? Here's the fix! Prefix your identifier with a double underscore and it'll be all good!!”

Various libraries do this. Glibc thinks it's a system library and has the right to all names starting with double underscores. Hence problems stemming from their using identifiers reserved by non-GCC compilers, like __block which got taken by the blocks extension from Objective-C. And it's not just Glibc, other libraries like cygwin that feel "system"ish do it too.

If you want to reduce the risk, instead of using __x and __y, you should instead use _llvm_libcxx_x and _llvm_libcxx_y. It may feel like this is a mouthful, but it's much less risky.

It’s also non-conforming.

The user is perfectly within his rights to use _llvm_libcxx_x as an identifier or macro name (something like)

  #define _llvm_libcxx_x(x) do { if (x) {}} while(false);
  #include <type_traits>

  int main () { _llvm_libcxx_x(true); }

This would fail to compile, and a user who wrote this could file a defect report against libc++ (and they would be correct!)

As a standard library implementation, libc++ *has* to use reserved identifiers (single underscore followed by capital, double underscore) to avoid collisions with identifiers (and macros) that are defined by the user.

If the user wrote this instead:

  #define __is_x(x) do { if (x) {}} while(false);
  #include <type_traits>

  int main () { __is_x(true); }

I could correctly reject a bug report with “all identifiers beginning with __ are reserved. Don’t use them”.

Alp wrote:

libc++ on the other has the freedom to use any identifier in the global namespace, or indeed to use a C++ private namespace where any name goes, potentially letting you use short private names without any prefix or underscores at all.

This is an incorrect understanding of the situation.

— Marshall

My point was just that using a prefix was safer than using a common word or phrase without a prefix and somehow imagining that using a double underscore will insulate you from name clashes. I don't have a problem with using __llvm_cxx_blah, _LLVM_CXX_blah, llvm_cxx__blah or just _Lblah (the last of which takes no more space than __blah if that's really a concern).

What I'd like to see is a way to avoid everyone writing a "system library" thinking they're free to use _Thread, __block, __is_pod or similar. Ideally, I'd like a way that is somewhat robust in the face of future C/C++/Objective-C language extensions.

    M.E.O.

P.S. FWIW, this stack overflow post has good links to the rules about reserved identifiers in various standards (C++, C and POSIX): naming conventions - What are the rules about using an underscore in a C++ identifier? - Stack Overflow