How to control C++0x adoption in a large codebase?

Hi clang folks,

I'm working on migrating a certain large codebase to C++0x, and we're
looking for a way to turn on C++0x mode but make sure people don't use
C++0x features before we're ready for them.

* We need to turn on -std=gnu++0x for the whole codebase at once
because C++98 and C++0x have different ABIs.
* Some of our code needs to be portable to a bunch of compilers we
don't control, so it needs to be as C++98-compatible as we can make
it.
* Other parts of the codebase can adopt C++0x features as we make sure
they work with our compilers and don't have hidden pitfalls.

I've attached a patch as one possible way to do this, but I'm
certainly not wedded to this particular strategy. The patch adds a
-Wc++98-compat flag, and demonstrates a plan to add individual
-Wc++98-compat-some-feature flags for each discrete C++0x feature or
change.

There are pros and cons to this strategy:
* This makes it easy to control the use of C++0x at a per-file
granularity. Files that need to be completely C++98-compatible can
pass -Wc++98-compat, while files that just need to use tested features
can turn individual ones back on with
-Wno-c++98-compat-initializer-lists.
* If every single mention of LangOpts.CPlusPlus0x in the codebase
triggers a need for an additional warning, that's about 200 extra
warnings (`git grep CPlusPlus0x|wc`). It'll probably be somewhat less
given that the attached patch covers 3 uses of CPlusPlus0x with one
new warning, but there are still features to implement that may
increase the number.
* For features that are also extensions to C++98, it's either a
duplicate warning, or with some extra work, maybe comes along for the
ride.
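
To make the per-file idea concrete, here is a minimal sketch of the same control applied at a finer, per-region granularity inside one file. It assumes the -Wc++98-compat flag proposed in this thread exists under that name and can be toggled through Clang's `#pragma clang diagnostic` mechanism; the function names are hypothetical and only serve as an illustration.

```cpp
#include <cassert>
#include <vector>

// Assumption: -Wc++98-compat is the flag proposed in this thread. The pragma
// is guarded so that other compilers simply skip it.
#ifdef __clang__
#pragma clang diagnostic push
#pragma clang diagnostic warning "-Wc++98-compat"
#endif

// Portable region: written so that it also compiles as C++98.
inline int sum_portable(const std::vector<int>& v) {
    int total = 0;
    for (std::vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}

#ifdef __clang__
#pragma clang diagnostic pop
#endif

// Region that has been cleared to use C++0x features such as `auto`.
inline int sum_modern(const std::vector<int>& v) {
    int total = 0;
    for (auto it = v.begin(); it != v.end(); ++it)
        total += *it;
    return total;
}
```

If `auto` were used inside the first region, a compiler that implements the flag would be expected to warn there, while the second region stays quiet.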

Clearly we'll never be able to do this perfectly, but if we can get
close, that'd dramatically reduce the build breaks that someone has to
track down and fix.

Thoughts? Suggestions? Alternatives?

Jeffrey

cxx98compat.patch (4.2 KB)

As far as I'm aware, this is not really true, and I actually put a lot of
effort into fighting a few committee proposals that would have forced
ABI changes. There are a few things like name mangling which have been
changed/clarified in the Itanium ABI, and obviously the standard library
has to export more symbols in '0x, but I don't know of well-formed '03
programs that have changed.

Now, obviously you can break binary compatibility with macro
metaprogramming, but that's quite different.

Anyway, I'm not opposed to an extension warning for using '0x features.

John.

I'm working on migrating a certain large codebase to C++0x, and we're
looking for a way to turn on C++0x mode but make sure people don't use
C++0x features before we're ready for them.

* We need to turn on -std=gnu++0x for the whole codebase at once
because C++98 and C++0x have different ABIs.

As far as I'm aware, this is not really true, and I actually put a lot of
effort into fighting a few committee proposals that would have forced
ABI changes. There are a few things like name mangling which have been
changed/clarified in the Itanium ABI, and obviously the standard library
has to export more symbols in '0x, but I don't know of well-formed '03
programs that have changed.

Bug 45093 ("Different definitions of _Rb_tree::{erase,_M_destroy_node} between C++98 and C++0x") concludes that the libstdc++ maintainers
don't intend to maintain binary compatibility between the two language
versions. It certainly may be that it would have been possible to
maintain compatibility, but they didn't, so that's what we have to
live with. libc++ probably does this better, but we don't really want
to block constrained C++0x support on switching to that either.

Now, obviously you can break binary compatibility with macro
metaprogramming, but that's quite different.

Anyway, I'm not opposed to an extension warning for using '0x features.

Thanks. :)

Okay. Just wanted to make sure you weren't talking about some
core-language thing I'd missed. I believe Howard has, in fact, designed
libc++ with cross-dialect binary compatibility in mind, although obviously
he's been assisted in that by being able to design with foreknowledge
of C++0x's requirements.

John.

libc++ uses C++11's inline namespaces to version its ABI separately from gcc's C++03 std::lib. This produces a /noisy/ ABI incompatibility: gcc's std::vector and libc++'s std::vector have different manglings, so you get your errors at link time instead of at run time. Some ABIs have been kept compatible, though: the std-defined exception classes, operator new/delete, and the handler functions.
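
As a rough illustration of the inline-namespace technique described above, the sketch below versions a small library. The names `mylib`, `v2`, `buffer`, and `size_of` are hypothetical and are not taken from libc++; the point is only that the short name and the versioned name denote the same entity in source code, while the version component still takes part in the mangled symbol names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

namespace mylib {
// Everything is declared inside the versioned namespace v2. Because v2 is
// an inline namespace, its members are also reachable as mylib::buffer,
// mylib::size_of, and so on.
inline namespace v2 {

struct buffer {
    std::string data;
};

inline std::size_t size_of(const buffer& b) { return b.data.size(); }

}  // namespace v2
}  // namespace mylib

// Returns true when the unversioned and versioned names are the same type.
inline bool names_same_type() {
    return std::is_same<mylib::buffer, mylib::v2::buffer>::value;
}
```

Code built against a different version namespace (say, a hypothetical `v1`) would refer to differently mangled symbols, so mixing the two would tend to fail at link time rather than misbehave at run time.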

Howard

By "cross-dialect" I meant C++03 vs. C++11. Assuming that the libc++ library itself is compiled in C++11 mode, C++03 code using libc++ should interoperate with C++11 code using libc++, correct?

John.

Yes, that is correct.

Howard

Hi clang folks,

I'm working on migrating a certain large codebase to C++0x, and we're
looking for a way to turn on C++0x mode but make sure people don't use
C++0x features before we're ready for them.

* We need to turn on -std=gnu++0x for the whole codebase at once
because C++98 and C++0x have different ABIs.

Summarizing side discussion: libstdc++ has a different ABI in C++98 vs. in C++11, libc++ does not.

* Some of our code needs to be portable to a bunch of compilers we
don't control, so it needs to be as C++98-compatible as we can make
it.

And you can't tell libstdc++ to use the C++0x ABI even in C++98 mode?

* Other parts of the codebase can adopt C++0x features as we make sure
they work with our compilers and don't have hidden pitfalls.

I've attached a patch as one possible way to do this, but I'm
certainly not wedded to this particular strategy. The patch adds a
-Wc++98-compat flag, and demonstrates a plan to add individual
-Wc++98-compat-some-feature flags for each discrete C++0x feature or
change.

There are pros and cons to this strategy:
* This makes it easy to control the use of C++0x at a per-file
granularity. Files that need to be completely C++98-compatible can
pass -Wc++98-compat, while files that just need to use tested features
can turn individual ones back on with
-Wno-c++98-compat-initializer-lists.
* If every single mention of LangOpts.CPlusPlus0x in the codebase
triggers a need for an additional warning, that's about 200 extra
warnings (`git grep CPlusPlus0x|wc`). It'll probably be somewhat less
given that the attached patch covers 3 uses of CPlusPlus0x with one
new warning, but there are still features to implement that may
increase the number.
* For features that are also extensions to C++98, it's either a
duplicate warning, or with some extra work, maybe comes along for the
ride.

Clearly we'll never be able to do this perfectly, but if we can get
close, that'd dramatically reduce the build breaks that someone has to
track down and fix.

Thoughts? Suggestions? Alternatives?

This seems more like you're enforcing a coding style ("use only the C++11 features that we like") rather than providing a generally useful feature. I don't think we should be using the warning mechanism to enforce a coding style, for a number of reasons:

  - There are many coding styles in the world. Which ones do we pick? Just those that we as a community like (how do we decide that?), or all of those for which people submit patches? Is either of these sustainable in the long term?

  - Experience has shown that warnings that don't default to "on" tend to get broken over time. Yes, these warnings are particularly easy to implement, but that doesn't mean that they won't get broken or missed.

Style checking belongs in a plug-in. That way, different organizations can provide their own style checkers that run along with their builds without forcing the union of all styles into the mainline Clang front-end. Better yet, maybe someone will build a configurable style checker as a plugin, to save the effort of everyone having to implement their own plugin separately.

I'd like to hear more opinions on whether others consider the proposed warnings to be coding style enforcement, or whether they are generally useful. It's clearly not as obvious as if we were proposing -Wbraces-on-new-line or -Wmethods-names-start-with-a-verb.

  - Doug

Style checking belongs in a plug-in. That way, different organizations can provide their own style checkers that run along with their builds without forcing the union of all styles into the mainline Clang front-end. Better yet, maybe someone will build a configurable style checker as a plugin, to save the effort of everyone having to implement their own plugin separately.

This is certainly something I’d like to get going if/when I get the time/familiarity (or if someone beats me to it, I’d be more than willing to contribute). Style checker & auto-fixer, of course (astyle with teeth?).

I’d like to hear more opinions on whether others consider the proposed warnings to be coding style enforcement, or whether they are generally useful.

I tend to agree. I think the right feature for the compiler here is to make it easy to ensure that, while you continue to write C++03, you don’t write it in a way that won’t be sensible C++11. I believe Clang already has some such warnings in C++03 mode about identifiers or features that might cause problems if the code were compiled as C++11 (correct me if I’m wrong; if Clang doesn’t have such behavior, I think it would be a reasonable thing to add). That seems better than supporting a strange (and, as you point out, off-by-default) mode of compiling C++11 without any C++11 features. Supporting warnings in C++03 mode also makes it less likely that they will be construed as complete or accurate support: the warnings are a best effort to ease your portability up to C++11, but nothing short of compiling (and testing) the code in both modes is going to get you all the way there.

  - David

Hi clang folks,

I’m working on migrating a certain large codebase to C++0x, and we’re
looking for a way to turn on C++0x mode but make sure people don’t use
C++0x features before we’re ready for them.

  • We need to turn on -std=gnu++0x for the whole codebase at once
    because C++98 and C++0x have different ABIs.

Summarizing side discussion: libstdc++ has a different ABI in C++98 vs. in C++11, libc++ does not.

Just to clarify why I think this is relevant even though libc++ avoided the problem: I would like to preserve the ability to use Clang productively in C++11 mode with libstdc++; doing so appears to require a codebase-wide switch due to ABIs.

  • Some of our code needs to be portable to a bunch of compilers we
    don’t control, so it needs to be as C++98-compatible as we can make
    it.

And you can’t tell libstdc++ to use the C++0x ABI even in C++98 mode?

I’ll let Jeffrey comment here, but I believe we looked into this and it was very non-trivial… I’d be interested in his thoughts here.

This seems more like you’re enforcing a coding style (“use only the C++11 features that we like”) rather than providing a generally useful feature. I don’t think we should be using the warning mechanism to enforce a coding style, for a number of reasons:

I really really don’t want to restart this discussion. I actually think we’re in agreement here. If it comes down to style, the compiler has no business doing this in a warning. However, I’m not yet convinced this comes down to style…

I’d like to hear more opinions on whether others consider the proposed warnings to be coding style enforcement, or whether they are generally useful. It’s clearly not as obvious as if we were proposing -Wbraces-on-new-line or -Wmethods-names-start-with-a-verb.

Emphatically agree on the last sentence here. If that’s where this ends up, it doesn’t belong.

Ok, so why don’t I think this is a style issue? First, let’s separate two aspects of the discussion which may simplify things.

Consider just a single warning flag: “-Wc++98-compat”, designed for use in “-std=c++11” mode. The goal is to warn about obvious code patterns that break backwards compatibility. Why is this flag useful from a strictly functional perspective? Because developers may need to have non-trivial amounts of code compiled in both C++98 and C++11 contexts. Imagine some libraries are expected to work with projects whose compilers do not yet support C++11, and projects which actively use C++11. Because of libstdc++ issues, all of the code may need to be compiled as C++11 in the latter case. Even without this, headers will leak across the TU boundary and thus be compiled in both modes.

I think getting into this situation is inevitable for both open source and corporate codebases. That’s one of the primary reasons developers will use “-Wc++11-compat” when in “-std=c++98” mode. Not offering the complementary set of warnings means that those developers or users working with C++11 builds must (if they modify the library) constantly build twice to ensure they haven’t introduced an incompatibility. Sure, “-Wc++98-compat” may never be perfect (any more than “-Wc++11-compat” is perfect), but it simplifies the developer experience when there is some external need or pressure to have a body of code which is valid in both modes.
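
For a sense of what "valid in both modes" can look like in practice, here is a small sketch of a header-style snippet that selects between C++98 and C++11 spellings with the standard `__cplusplus` macro. The `MYLIB_NULLPTR` macro, the `node` struct, and the `length` function are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical compatibility macro: use nullptr where the language has it,
// and fall back to the C++98 null pointer constant otherwise.
#if __cplusplus >= 201103L
#  define MYLIB_NULLPTR nullptr
#else
#  define MYLIB_NULLPTR 0
#endif

struct node {
    int value;
    node* next;
};

// Counts the nodes in a singly linked list. Nothing here depends on a
// C++11-only feature, so the function compiles in either mode.
inline std::size_t length(const node* head) {
    std::size_t n = 0;
    for (const node* p = head; p != MYLIB_NULLPTR; p = p->next)
        ++n;
    return n;
}
```

A compatibility warning in C++11 mode would aim to catch the places where such code accidentally uses the newer spelling directly, without requiring a second C++98 build to find out.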

This argument may be invalid, there may be reasons why it’s not worth doing (although unless other problems / options arise, it seems to be worth our time at Google), but I don’t think this is style related.

Now, a somewhat separate issue is the detailed warnings for individual features of C++11. I have some vague ideas as to why these may be functionally necessary for codebases migrating from C++98 → C++11, or ways that they might ease such a migration. But I’ll be the first (ok, the second?) to say that they both introduce a lot more concerning questions (does C++11 make sense sans features X, Y, and Z? I don’t know…) and are much closer to style issues, so maybe we can start by discussing nothing more than a single “-Wc++98-compat” flag. No selection, no preferences, just a compatibility warning.

Chandler Carruth <chandlerc@google.com>
writes:

Summarizing side discussion: libstdc++ has a different ABI in C++98 vs. in
C++11, libc++ does not.

Just to clarify why I think this is relevant even though libc++
avoided the problem: I would like to preserve the ability to use
Clang productively in C++11 mode with libstdc++; doing so appears to
require a codebase-wide switch due to ABIs.

...

And you can't tell libstdc++ to use the C++0x ABI even in C++98 mode?

I'll let Jeffrey comment here, but I believe we looked into this and it was
very non-trivial... I'd be interested in his thoughts here.

FWIW, there's been some discussion of this ABI incompatibility
recently on the gcc development mailing list, although AFAIK, it's more a
libstdc++ issue.

See: James Y Knight, "Long-term plan for C++98/C++11 incompatibility"

-Miles

Consider just a single warning flag: “-Wc++98-compat”, designed for use in “-std=c++11” mode. The goal is to warn about obvious code patterns that break backwards compatibility. Why is this flag useful from a strictly functional perspective? Because developers may need to have non-trivial amounts of code compiled in both C++98 and C++11 contexts. Imagine some libraries are expected to work with projects whose compilers do not yet support C++11, and projects which actively use C++11. Because of libstdc++ issues, all of the code may need to be compiled as C++11 in the latter case. Even without this, headers will leak across the TU boundary and thus be compiled in both modes.

In this case, the library authors would maintain & develop in C++98, presumably - relying on forward compatibility warnings (-Wc++11-compat) to help guide them. Then for release they’d build both 98 and 11 versions of their binaries. Targeting your LCD makes a fair bit of sense to me.

Is there a particular case where one would be compelled to work in C++11 when developing and maintaining a library designed for backwards compatibility? I suppose the use case you’re describing is where the library author is actively consuming the library in a C++11 context & tweaking the C++98 library day-to-day as they work on their C++11 consumer of that library. Possible, if somewhat inefficient, perhaps (that their build system is actively building both libs all the time if the C++98/11 lib is such a commonly used/general tool - I’d expect it to be built authoritatively, perhaps)

Not to say that I don’t see your point - certainly that scenario is possible & the developer working on a C++11 library that depended on a backwards compatible C++98 library that was compiling as C++11 in his/her project could benefit from such warnings being turned on for those files.

This argument may be invalid, there may be reasons why it’s not worth doing (although unless other problems / options arise, it seems to be worth our time at Google), but I don’t think this is style related.

Not exactly ‘style’, no. Perhaps, let’s say, “situational”, and I think that’s what Doug’s getting at - if it’s not on-by-default, clang developers aren’t really paying much attention to it (I’m not sure I necessarily agree with this method - but it’s possibly a reality I can’t change). The idea being that people who have particular interest in particular situational warnings would be better off maintaining a tool designed for them so they don’t get drowned out by the larger issues of just getting the clang compiler to do its main job. (not to mention all that extra situational stuff adds a burden to clang developers trying to do the hard job of writing a C++ compiler)

Now, a somewhat separate issue is the detailed warnings for individual features of C++11. I have some vague ideas as to why these may be functionally necessary for codebases migrating from C++98 → C++11, or ways that they might ease such a migration. But I’ll be the first (ok, the second?) to say that they both introduce a lot more concerning questions (does C++11 make sense sans features X, Y, and Z? I don’t know…) and are much closer to style issues

I imagine these are real issues for cross platform code. For example: I wouldn’t mind if LLVM could adopt the use of ‘nullptr’, possibly the simplest of C++11 features. This would maintain portability across all practical compilers that I know of without getting into the realm of some of the less consistently implemented C++11 features at this time. (I don’t cite this as a totally practical example, but at least one use case)

But that’s a different scenario from the previous one, yes.

Consider just a single warning flag: “-Wc++98-compat”, designed for use in “-std=c++11” mode. The goal is to warn about obvious code patterns that break backwards compatibility. Why is this flag useful from a strictly functional perspective? Because developers may need to have non-trivial amounts of code compiled in both C++98 and C++11 contexts. Imagine some libraries are expected to work with projects whose compilers do not yet support C++11, and projects which actively use C++11. Because of libstdc++ issues, all of the code may need to be compiled as C++11 in the latter case. Even without this, headers will leak across the TU boundary and thus be compiled in both modes.

In this case, the library authors would maintain & develop in C++98, presumably - relying on forward compatibility warnings (-Wc++11-compat) to help guide them. Then for release they’d build both 98 and 11 versions of their binaries. Targeting your LCD makes a fair bit of sense to me.

Is there a particular case where one would be compelled to work in C++11 when developing and maintaining a library designed for backwards compatibility? I suppose the use case you’re describing is where the library author is actively consuming the library in a C++11 context & tweaking the C++98 library day-to-day as they work on their C++11 consumer of that library. Possible, if somewhat inefficient, perhaps (that their build system is actively building both libs all the time if the C++98/11 lib is such a commonly used/general tool - I’d expect it to be built authoritatively, perhaps)

I don’t think your conclusion follows from your statement of my use case… Let me try to more concretely describe the practices and patterns I’m seeing, and would like to support:

Project X develops some utility code. Eventually it becomes useful, and so it gets factored into a library. I’ve seen lots of OSS compression, protocol, and toolkit libraries start this way here at Google (Snappy, Protobuf, Omaha, …). The project that initially used it is still the primary maintainer and consumer, but they grow additional users. Now if that project starts using C++11 actively, they face (in my view) an unfortunate choice of maintenance burdens:

a) Separate development of features in the library from the development of consuming use cases s.t. the library development activity can use a different build / test / release model, or
b) Develop the library in C++11, but for each release, run it through C++98 builds to make sure it still works for other consumers of the library.

In case (a), essentially they have to double their builds. I think that’s really unfortunate and an undue burden to impose, but maybe others disagree. In case (b), they have a really slow feedback cycle in learning about issues in the code they’ve written. I think that’s also bad.

How do we solve it? My proposal is the same way we have solved it for projects in the reversed position, with some consumers using C++11 but not the primary maintainers: we give them a compatibility warning flag.

Perhaps this is just an edge use case, but I’d like to point out that I foresee Clang itself getting into this situation. Imagine if the static analyzer becomes a nicely separated project, and begins using C++11 features heavily. Many Clang developers, actively contributing to both the core and the analyzer, would likely choose to build in C++11 mode. How frustrating would it be to catch in the build bots when we accidentally used ‘auto’ in the wrong file?

This argument may be invalid, there may be reasons why it’s not worth doing (although unless other problems / options arise, it seems to be worth our time at Google), but I don’t think this is style related.

Not exactly ‘style’, no. Perhaps, let’s say, “situational”, and I think that’s what Doug’s getting at - if it’s not on-by-default, clang developers aren’t really paying much attention to it (I’m not sure I necessarily agree with this method - but it’s possibly a reality I can’t change).

I think that this argument is much weaker, and indeed we have many “situational” warnings in Clang today. Some key differences to me:

  • Style is elected by groups and subject to change rapidly with that group. As such, introducing it into the compiler couples an external set of requirements with the internals of the compiler
  • Style is more highly variable among users than the situation buckets we’re looking at, as we only address situations impacting multiple important consumers of Clang
  • Style has lower consequences of a violation, and so providing style checks in the compiler is low bang-for-buck overall (you don’t crash when you violate style)

In contrast, let’s look at a clearly situational warning: -Wthread-safety

  • Many systems necessitate concurrent, multithreaded programming
  • The concerns of thread-safety addressed by the warning are not domain specific, but theoretical/inherent in any multithreaded application
  • Concurrency is something many users deal with, and they all deal with the same situation: multiple threads sharing memory
  • Concurrency bugs yield incorrect programs, crashes, data loss, you name it

The other interesting aspect of warnings which fit into these well-defined situations is that we can reach out to experts in that particular domain to ensure what Clang does is sound and applicable across users. That clearly isn’t the case for style checking, but I think it also highlights why language compatibility warnings should be in the compiler or nowhere: who else can possibly get this correct? The compiler and its authors are the experts here.

We already have this today, though on a small scale. People using MSVC
(I use it and clang/linux) have breaks in both directions (llvm::next
vs std::next, which breaks MSVC and someone checking in auto would
break c++98). I basically end up building both and testing both. So, I
suppose I'm already voting with my feet, even if my case isn't
typical.

In any case, the warning won't solve library issues, like using
next(std::vector) in the clang namespace without qualifying llvm::,
because it's found using adl. There are likely to be lots of issues
that the warnings couldn't hope to cover. I'm not against the
warnings, but ultimately, if you aren't building with C++98, you can't
be sure that it works there.
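
A small sketch of the library-level issue described above, using a hypothetical `mylib::next` as a stand-in for llvm::next. The qualified call compiles the same way in either mode; the comment describes what an unqualified call would be expected to run into once the standard library also declares std::next.

```cpp
#include <cassert>
#include <vector>

namespace mylib {

// A C++98-era helper, similar in spirit to the llvm::next mentioned above.
template <typename It>
It next(It it) {
    return ++it;
}

inline int second_element(const std::vector<int>& v) {
    // Qualified call: only mylib::next is considered.
    //
    // An unqualified `next(v.begin())` would also consider std::next in
    // C++11, because argument-dependent lookup searches namespace std for
    // arguments whose types are associated with it. With two equally good
    // candidates the call would likely be rejected as ambiguous, and a
    // language-feature warning has no way to notice that.
    return *mylib::next(v.begin());
}

}  // namespace mylib
```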

Agreed. This seems like a lot of code in Clang to work around
what's obviously a major libstdc++ issue. I do understand their
dilemma, but it's going to be a huge problem for a lot of people,
and "if you care about the old ABI, you can never enable C++11"
is not a reasonable answer.

It is possible to design a library which upgrades gracefully — they
can't reasonably make std::list::size() O(1) while maintaining the
ABI, but they can still implement emplace() if C++11 is enabled.
Alternatively, for clients like Google that don't care about breaking
ABI, they could make the C++11 implementation downgrade
gracefully like libc++ does. But this should be solved by libstdc++
instead of hacking compilers around a situation that wouldn't
happen if libstdc++ had an appropriate solution.
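
As a loose sketch of what "upgrading gracefully" could mean, the hypothetical container below keeps the same data members in both modes and only adds a member function when C++11 is enabled. `string_stack` and `demo` are invented for illustration and say nothing about how libstdc++ or libc++ are actually written; the sketch also assumes the members it is built from have the same layout in both modes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#if __cplusplus >= 201103L
#include <utility>
#endif

class string_stack {
public:
    void push(const std::string& s) { items_.push_back(s); }

#if __cplusplus >= 201103L
    // C++11-only addition. It is a new member function, not a new data
    // member, so the object layout of string_stack is unchanged.
    template <typename... Args>
    void emplace(Args&&... args) {
        items_.push_back(std::string(std::forward<Args>(args)...));
    }
#endif

    std::size_t size() const { return items_.size(); }
    const std::string& top() const { return items_.back(); }

private:
    std::vector<std::string> items_;  // identical declaration in both modes
};

// Pushes two strings, using emplace only where it is available.
inline std::size_t demo() {
    string_stack s;
    s.push("a");
#if __cplusplus >= 201103L
    s.emplace(3, 'b');  // constructs the string "bbb" in the C++11 path
#else
    s.push(std::string(3, 'b'));
#endif
    return s.size();
}
```

Changing the complexity of an operation, as with std::list::size(), typically needs a new data member, which is the kind of change this pattern cannot absorb.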

John.

Hi clang folks,

I’m working on migrating a certain large codebase to C++0x, and we’re
looking for a way to turn on C++0x mode but make sure people don’t use
C++0x features before we’re ready for them.

  • We need to turn on -std=gnu++0x for the whole codebase at once
    because C++98 and C++0x have different ABIs.

Summarizing side discussion: libstdc++ has a different ABI in C++98 vs. in C++11, libc++ does not.

Just to clarify why I think this is relevant even though libc++ avoided the problem: I would like to preserve the ability to use Clang productively in C++11 mode with libstdc++; doing so appears to require a codebase-wide switch due to ABIs.

I fully understand them wanting to use C++11 as a reason to break the libstdc++ ABI (and switch to more efficient data structures), but it’s rather unfortunate that they didn’t make it easy to use the new ABI with C++98/03 code. Oh, well; it’s this way now and we’ll have to deal with it.

I’d like to hear more opinions on whether others consider the proposed warnings to be coding style enforcement, or whether they are generally useful. It’s clearly not as obvious as if we were proposing -Wbraces-on-new-line or -Wmethods-names-start-with-a-verb.

Emphatically agree on the last sentence here. If that’s where this ends up, it doesn’t belong.

Ok, so why don’t I think this is a style issue? First, let’s separate two aspects of the discussion which may simplify things.

Consider just a single warning flag: “-Wc++98-compat”, designed for use in “-std=c++11” mode. The goal is to warn about obvious code patterns that break backwards compatibility. Why is this flag useful from a strictly functional perspective? Because developers may need to have non-trivial amounts of code compiled in both C++98 and C++11 contexts. Imagine some libraries are expected to work with projects whose compilers do not yet support C++11, and projects which actively use C++11. Because of libstdc++ issues, all of the code may need to be compiled as C++11 in the latter case. Even without this, headers will leak across the TU boundary and thus be compiled in both modes.

I think getting into this situation is inevitable for both open source and corporate codebases. That’s one of the primary reasons developers will use “-Wc++11-compat” when in “-std=c++98” mode. Not offering the complementary set of warnings means that those developers or users working with C++11 builds must (if they modify the library) constantly build twice to ensure they haven’t introduced an incompatibility. Sure, “-Wc++98-compat” may never be perfect (any more than “-Wc++11-compat” is perfect), but it simplifies the developer experience when there is some external need or pressure to have a body of code which is valid in both modes.

I guess I have always viewed building as C++98/03 with -Wc++11-compat as the “proper” way to develop dual C++98/11 code bases, because you’re certain that it compiles as C++98/03 and you get a rap on the knuckles if you stray into C++11. It says, “I know I’m targeting C++98/03, but I want to be ready for the future.”

Having things the other way—build as C++11 but don’t use its features at all—seems strange to me. It’s saying “I’m targeting C++11 but I can’t use any of the features of C++11”. But, I understand better now why this comes up with libstdc++… and I can see how others will hit it.

Now, a somewhat separate issue is the detailed warnings for individual features of C++11. I have some vague ideas as to why these may be functionally necessary for codebases migrating from C++98 → C++11, or ways that they might ease such a migration. But I’ll be the first (ok, the second?) to say that they both introduce a lot more concerning questions (does C++11 make sense sans features X, Y, and Z? I don’t know…) and are much closer to style issues, so maybe we can start by discussing nothing more than a single “-Wc++98-compat” flag. No selection, no preferences, just a compatibility warning.

You’ve convinced me that -Wc++98-compat is necessary and makes sense; that it’s not an issue of style, but is an important use case. Thanks for clarifying.

I’m much more dubious of the detailed warning flags for individual features, because that’s leaning toward letting users pick their own subset of the language, and (to me) that’s far more of a coding convention issue.

  - Doug