Chrome/mac is all-clang, all-the-time

Hi,

I think I never sent an official announcement to this list, so here
goes: Starting with chrome 15, which was released this week, the
official chrome/mac binary is built with clang. We also use clang on
all our mac buildbots, and stopped supporting gcc 4.2. Chrome's
performance stayed the same after the switch (but many of the
performance numbers we measure are in v8-jitted code), and the
uncompressed binary size went down 10%.

The official chrome/linux is still built with gcc, and we don't intend
to change that, but we have a clang builder on linux as well, for
clang's diagnostics.

We do a build of trunk clang every Friday, and use this to build the
development versions of chrome/mac. When we branch for a release, the
current plan is to create a branch in the clang repo as well and merge
critical clang fixes there if the need arises. For chrome 15, it looks
like we picked a revision that didn't need any fixes, so we haven't
created any branches so far.

Nico

ps: The binaries we use to build chrome are available at
http://commondatastorage.googleapis.com/chromium-browser-clang/index.html

It’s rather disappointing to read that you don’t intend to change the Linux build to Clang.

One would assume that once the concurrency work in Clang is done you’d make the switch. The more projects move to LLVM/Clang, the fewer dependencies arise, unlike the behemoth that the GCC Collection has become. The number of packages alone in Debian, Ubuntu, Red Hat and others is just absurd.

If you offered Clang-built Linux versions for Debian, along with their FreeBSD project, I’m sure they would be much appreciated.

I’ll be pushing for GNOME to move more of its infrastructure to LLVM/Clang as well.

If Qt made LLVM/Clang a first-class citizen for building its frameworks, it would go a long way toward the KDE Project making their infrastructure LLVM/Clang friendly.

Yet holding back from officially building Chrome with Clang on Linux, and even Windows, seems more a political decision than a technical one.

- Marc

It's rather disappointing to read that you don't intend to change the Linux build to Clang.

Only very recently has clang managed to parse all of the recent libstdc++ headers, and libc++ support on linux isn't perfect yet, as far as I know (sorry if I am slightly out of date). The projects I work on at work have a similar policy (clang++ on mac, g++ on linux). Perhaps when 3.0 comes out and is available on a few distributions, we will re-evaluate.

One would assume that once the concurrency work in Clang is done you'd make the switch. The more projects move to LLVM/Clang, the fewer dependencies arise, unlike the behemoth that the GCC Collection has become. The number of packages alone in Debian, Ubuntu, Red Hat and others is just absurd.

I don't really understand what you mean here. If you are getting binaries, then the number of dependencies doesn't matter. Also, I would hope that systems on linux would simply move to a world where they use whichever compiler you specify, whether it be g++ or clang++. I certainly hope you aren't going to try to persuade individual open source projects to drop support for g++.

If you offered Clang-built Linux versions for Debian, along with their FreeBSD project, I'm sure they would be much appreciated.

Why? Debian builds software from scratch themselves. They can just set the compiler variable to g++ or clang++ as they see fit, and will.

Sorry for the lengthy reply; I just worried slightly about your tone. I want to write, and use, software which works equally well with all C++ compilers. Software which compiles with only one, whether it be clang or g++, should be fixed to work in both, if possible.

Chris

"Marc J. Driftmeyer" <mjd@reanimality.com>
writes:

I'll be pushing for GNOME to move more of its infrastructure to
LLVM/Clang as well.

But why would they? What's in it for them?

Seriously, on the mac, it makes sense, as gcc is hamstrung by Apple's
policies there, and systems like FreeBSD where it serves a political
purpose are happy to do it -- but that isn't the case elsewhere, and
clang currently really doesn't offer much practical advantage for the
user.

-Miles

But it surely does for the developer.

Here "user" (of the compiler) == "developer"

-miles

Don't you like fast compilation and relevant diagnostic messages?

Of course I do... :)

But I use both, and my impression is that clang's advantages in these
areas are a bit overstated -- in my typical usage, clang seems "a
little" faster, but it's typically on the order of 10% or so (on C++
code with "medium" template usage), and in general, I've found nothing
particularly wrong with gcc's error messages (there are surely extreme
cases where they're confusing, but really that's true for any
compiler).

[I'm not dissing clang, I think it's a cool project. I also think
it's very healthy to have some real competition in the FOSS compiler
world!]

-miles

I just want to provide a different perspective. I’m not trying to start a deep argument about the virtues of different compilers; I only want to share my observations of Clang as it is used in practice by a reasonably large group of developers.

When we (Google) started experimenting with Clang, there were many with a perspective similar to yours. However, with over 6 months of having its diagnostics in front of thousands of developers, we’ve come to a different conclusion.

The overwhelming feedback from C++ developers has been that the diagnostics from Clang are significantly more clear and helpful. Clang explains how macros and templates were involved when a particular error is hit, and this is cited repeatedly as being a fundamental shift in the utility of diagnostics for C++ code.
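
To make that concrete, here is a tiny made-up example (nothing from a real codebase) of the sort of situation where the macro backtrace matters; the exact diagnostic wording of course varies between Clang versions:

// This snippet intentionally does not compile: the error is triggered deep
// inside nested macros. Clang points at the offending line and then walks
// the expansion chain with "note: expanded from macro ..." notes, which is
// the behaviour described above.
#include <vector>

#define SQUARE(x) ((x) * (x))
#define AREA_OF_SQUARE(side) SQUARE(side)

int main() {
    std::vector<int> side;          // oops: a vector, not a number
    return AREA_OF_SQUARE(side);    // invalid operands to '*' inside the macros
}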

Moreover, the way Clang structures its warnings (not errors) allows for greater precision and lower false positive rates. It doesn’t use the optimizer to produce analysis-based warnings, instead providing a more stable and source-based analysis framework for them. It has changed how warnings are perceived in our development community, and we now rely heavily on Clang to warn about dangerous programming constructs.
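
As a small, purely illustrative example (again, hypothetical code) of the kind of analysis-based warning I mean:

// 'factor' is only assigned on one path, so the return statement may read it
// uninitialized. Clang computes this warning from the source-level control
// flow graph, so it is reported the same way at -O0 and -O2; GCC's comparable
// -Wmaybe-uninitialized has historically piggybacked on the optimizer and can
// appear or disappear with the optimization level. Exact behaviour naturally
// depends on the compiler version.
int scale(bool use_default) {
    int factor;
    if (use_default)
        factor = 10;
    return factor * 2;    // possible uninitialized read
}

int main() {
    return scale(false);
}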

That said, while the feedback was overwhelmingly positive, there were definitely some who were less enthusiastic. Many of these people had worked with GCC for so long that they exhibited strong change aversion. The messages from GCC are very familiar, and map to an existing set of problem descriptions for these people. They faced a learning curve when the messages changed at all, and that was costly. Interestingly, for a surprising number of people in this bucket, after a few months of using Clang, they were reluctant to switch back. They had slowly noticed and started using several common elements of Clang’s diagnostics (such as typo-correction and macro backtraces) without even realizing it. When they looked at GCC’s messages, they didn’t have the information they wanted to understand the problem.
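
For anyone who hasn’t seen the typo-correction in action, a trivial made-up example; the exact suggestion text differs between versions, but the point is that Clang proposes the closest known spelling instead of just rejecting the identifier:

// This snippet intentionally does not compile. Clang flags the undeclared
// identifier below and typically suggests the nearest declared name
// (roughly: "did you mean 'kMaxRetries'?"), usually with a fix-it hint.
static const int kMaxRetries = 3;

int retries_left(int used) {
    return kMaxRetrys - used;   // misspelled identifier
}

int main() {
    return retries_left(1);
}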

Naturally, YMMV, but this has been our experience with Clang’s diagnostics. As a consequence, we take a very active role in maintaining and improving them.

Yeesh, could you get any more condescending?

-miles

I’m really sorry that came off as condescending; it wasn’t meant to be. The cost imposed by switching to Clang’s error messages was one we took very seriously. Working with GCC for a long time isn’t a negative statement about the developer; it’s a simple reality given the long history GCC has in the open source community as essentially the only compiler option available.

When evaluating Clang’s error messages, the threshold wasn’t just that they be better, on average, than GCC’s messages; the improvement had to be sufficiently large to justify paying for the learning curve. There was a lot of debate, and a lot of things had to be fixed in Clang’s diagnostics before it was clear that the cost/benefit tradeoff supported transitioning to Clang’s diagnostics across the board.

I don't think that's condescending; I think it's a pretty accurate description, and something that developers familiar with one compiler will always find when they move to another. Many years ago, I used Microsoft's (approximation of a) C++ compiler, and when I switched to GCC I found it took a while to get used to the different error messages. Not because they were more or less expressive, but because I subconsciously mapped specific error messages to specific patterns of changes in my code. When I saw one error, I immediately fixed it by making a change, without even reading the error all the way through. After a few months of using GCC, I developed the same habits.

Some years later, when I switched to clang, I found the same thing. GCC would say 'unexpected something after whatever' and I knew that this meant 'idiot, you forgot a semicolon'. When I started using clang, the error message actually told me that I had missed a semicolon, but I had to actually read the entire message because it wasn't the pattern of text that I was expecting.
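
(For anyone who hasn't hit that particular case, a minimal made-up version of it is below; the exact messages vary with compiler and version, but the shape is as I describe.)

// This snippet intentionally does not compile: the ';' after the struct
// definition is missing. Clang reports something along the lines of
// "expected ';' after struct", with a fix-it at the closing brace, whereas
// the g++ versions I was using tended to complain about the declaration
// that follows instead.
struct Point {
    int x;
    int y;
}                           // <-- the missing ';' belongs here

int main() {
    Point p = {1, 2};
    return p.x + p.y;
}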

After using clang for a while, I found that I was starting to rely on its error messages. Back when clang's C++ codegen was largely nonfunctional, I would run g++, get incomprehensible error messages, run clang++, fix the errors it showed, and then compile with g++ (which, in spite of the relatively poor quality of its errors, had the advantage that it did generate a working binary - something that is generally considered beneficial in a compiler).

Whatever the compiler, after you've been using it for a while, you don't read the common error messages in detail, you look at the vague shape and fix the problem automatically, without thinking, because you've seen that error message a lot before and you know what kind of typo causes it. This isn't condescending, it's the kind of automatic behaviour that any HCI textbook will tell you about in chapter 1.

David

Er, well you basically divided your user base into two categories:
those who eagerly embraced clang, and those who were too
dim/rigid/inexperienced to appreciate it -- with what seems to be the
implication that anybody not in the former group _must_ be in the
latter. That's veering pretty close to No True Scotsman territory...

While I'm sure both types of user are present, I suspect that you're
omitting another group: those who don't really care so much either
way -- especially amongst those whose compiler usage doesn't usually
tickle the particularly egregious cases (e.g. C developers -- like
Gnome! -- and C++ devs who aren't pushing boundaries with templates),
and developers who are experienced enough (as I imagine most google
devs to be!) that they aren't particularly bothered by the specific
wording of error messages...

Again, I don't want to seem like I'm speaking _against_ clang -- I'm
not, I use it, and I like it -- and clang certainly does generally
have pretty clear error messages. But there seems to be this idea
floating around that gcc's error messages are unusably bad, and that
clang's are in a completely different class, and _that_ I just don't
see.

[I should note, btw, that gcc is developed too -- and that includes
improving the error messages, and compilation speed...]

-miles

Hello,

I think you misunderstood Chandler’s statement here. From the way I read it (but then, I am a French speaker first and foremost…) it seemed to me that some people were rightly concerned about the loss of productivity that would result from having to learn the “clang” way. I think it’s a valid concern.

Personally, I have been using Clang at home for a while and gcc at work (company policy), and I will fully admit that despite my real interest in Clang and its definitely better diagnostics, I am simply more used to gcc’s diagnostics and thus more adept at deciphering them, no matter how arcane they may appear to a beginner…

Another interesting point for the switch, though, is the performance of the resulting binary. Development is one thing, and we have a lot of tools at our disposal to help out: compiling with both Clang and gcc with warnings on certainly helps catch a lot of errors, static analysis is quite useful as well, debug builds, etc… However, when pushing software to a server, we still want to eke out as much speed and use as little memory as we can.

As far as I know, gcc still has the lead here (but then the only serious benchmarks I saw were from Phoronix, and that was a while ago). I seem to remember that LLVM was better suited to numerical computation, but that is of little interest to me (and my company). If someone had accurate figures for Clang 3.0 vs gcc 4.7, it would be interesting to see how it falls out now.
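
Of course, the figures that matter most are for one’s own workload. Even something as simple as the sketch below (entirely illustrative; a real evaluation needs representative code and inputs), compiled once with each toolchain, e.g. “g++ -O2 bench.cpp -o bench_gcc” and “clang++ -O2 bench.cpp -o bench_clang”, and then timed, gives a first data point:

// Toy micro-benchmark: scales and sums a large vector many times and reports
// CPU time, so the binaries produced by the two compilers can be compared.
// Printing the sum keeps the work from being optimized away entirely.
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <vector>

int main() {
    std::vector<double> v(1 << 22, 1.0);
    std::clock_t start = std::clock();
    double sum = 0.0;
    for (int pass = 0; pass < 100; ++pass) {
        for (std::size_t i = 0; i < v.size(); ++i) {
            sum += v[i] * 1.0000001;
        }
    }
    std::clock_t stop = std::clock();
    std::printf("sum=%f  cpu=%.3fs\n", sum,
                double(stop - start) / CLOCKS_PER_SEC);
    return 0;
}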

–Matthieu

2011/11/1 Matthieu Monrocq <matthieu.monrocq@gmail.com>

Another interesting point for the switch, though, is the performance of the resulting binary. Development is one thing, and we have a lot of tools at our disposal to help out: compiling with both Clang and gcc with warnings on certainly helps catch a lot of errors, static analysis is quite useful as well, debug builds, etc… However, when pushing software to a server, we still want to eke out as much speed and use as little memory as we can.

As far as I know, gcc still has the lead here (but then the only serious benchmarks I saw were from Phoronix, and that was a while ago). I seem to remember that LLVM was better suited to numerical computation, but that is of little interest to me (and my company). If someone had accurate figures for Clang 3.0 vs gcc 4.7, it would be interesting to see how it falls out now.

+1: The only “up-to-date” comparisons one can find are Mac GCC 4.2.1 vs current LLVM/Clang, but never really trustworthy comparisons against more recent GCC versions. It would also be very informative, and perhaps most valuable as a “reality check”, if the by-now-dated comparisons at http://clang.llvm.org/features.html#performance could be updated to compare current GCC vs current Clang. I’m expecting the difference to be reduced (GCC has improved a lot since 4.2(!)… duh!). Running the exact same benchmarks and tracking Clang’s and GCC’s progress would be very good to know, and might point out weaknesses in Clang’s performance, if there are any.

Ruben

Matthieu Monrocq wrote:

Another interesting point for the switch, though, is the performance of the
resulting binary. Development is one thing, and we have a lot of tools at
our disposal to help out: compiling with both Clang and gcc with warnings
on certainly helps catch a lot of errors, static analysis is quite useful as
well, debug builds, etc... However, when pushing software to a server, we
still want to eke out as much speed and use as little memory as we can.

As far as I know, gcc still has the lead here (but then the only serious
benchmarks I saw were from Phoronix, and that was a while ago). I seem to
remember that LLVM was better suited to numerical computation, but that is of
little interest to me (and my company). If someone had accurate figures for
Clang 3.0 vs gcc 4.7, it would be interesting to see how it falls out now.

Not an exhaustive test, I know, but a month or two ago I did a comparison between clang ToT and gcc 4.5.2 on my home machine (AMD Athlon 7750 dual-core, Linux) on an application where I'm keen to get the best possible performance (a Verilog simulator). Compile times were almost the same (clang had a very slight edge), but for run times, the clang-generated binary was approximately 10% slower.

Martin