Results of the FreeBSD Ports build with CC=clang

Hello all,

Because we at FreeBSD have a source package manager called Ports with
over 20000 packages, we thought it would be nice to waste some CPU
cycles on the build cluster to see how well Clang performs.

Unfortunately it was only capable of building 7030 packages
successfully. This sounds very bad, but when you keep in mind that it
wasn't able to build libiconv (on which 9600 other ports depend directly
or indirectly, mainly because of GNU make), that's still quite
impressive.

The build cluster only attempted to build 7241 ports, of which 211
failed. The error logs can be viewed here:

  http://pointyhat.freebsd.org/errorlogs/i386-8-exp-latest/

After some filtering and sorting, I've come up with this list, where
I've categorized all error logs:

  http://80386.nl/pub/clang-portsbuild.txt

I'll discuss some of the categories here:

GNU89:
  A lot of ports currently don't build, because they depend on
  C89, or at least GCC's `-std=gnu89' semantics. Most of them have
  missing symbols, because they use GNU-style inlining, while
  Clang uses ISO C99 style inlining by default (because it uses
  -std=gnu99).

  We also have one breakage here, because the code in question
  uses a variable called `restrict'.

  This is something we'll probably change in the Ports framework,
  where we can mark ports that require a different -std= to be
  added to the CFLAGS.
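
To illustrate the inlining difference, here is a minimal sketch (the function names are made up, not taken from any port). In gnu89 mode a plain `inline` definition also provides the external symbol. Under C99 rules it is an inline definition only, so the link fails if the compiler decides not to inline a call and no other translation unit supplies the symbol. Declaring the helper `static inline` means the same thing under either -std= setting:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 *
 * A plain `inline int max_int(...) { ... }` would mean:
 *   -std=gnu89: the definition also provides the external symbol.
 *   -std=gnu99: inline definition only; an external definition must
 *               come from some other translation unit.
 * `static inline` behaves the same way in both modes. */
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Ordinary external function that uses the helper. */
int largest(const int *v, int n)
{
    int i, best;

    assert(v != NULL && n > 0);
    best = v[0];
    for (i = 1; i < n; i++)
        best = max_int(best, v[i]);
    return best;
}
```

For a port that cannot be patched this way, building it with -std=gnu89 as suggested above remains the simpler option.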

libm:
  On the build cluster we also had a problem with autoconf's way
  of detecting math functions in libm. We've not been able to
  reproduce it outside of the build cluster yet... It may be
  similar to PR4290, where Clang dislikes autoconf's way of
  building conftests for functions in <stdio.h>. Maybe it has
  already been fixed in the meantime.

Once these issues have been sorted out, I'm sure we can get better
coverage of the Ports tree and hopefully be able to try to build at
least 15000+ packages.

So Clang is doing quite a good job already, but we're not there yet.
We'll keep submitting bug reports until we're satisfied. ;)

I'd like to thank Erwin Lansing for performing the build!

Some of the categories seem off:

iaxmodem-1.2.0.log is actually failing because we don't implicitly
define sin/cos/etc.

cdecl-2.5.log looks like a gnu89 issue.

ipsec-tools-0.7.2.log and physfs-2.0.0.log are -Werror issues.

For libbgpdump-1.4.99.8.log, it would be more accurate to say that the
constant-folder doesn't know how to fold strlen.

"Label pointers" should really be "__label__" support.

Thanks for all the information!

-Eli

Yeah, I was just testing you people. Thanks! ;)

Okay... so, going through the other categories:
Boehm GC: This doesn't look like a clang issue, although I could be wrong.

Nested functions: We aren't going to have these anytime soon. We
don't have a bug on file; go ahead and file it if you feel like it,
though.

Register binding: The uses here look idiotic; I'd suggest patching the
source. We probably will support it at some point, though. This is
PR3933.

__label__ support: This shouldn't be too hard to fix. This is PR3429.

alloca(): I just fixed this.

Implicit definitions of sin/cos: I just fixed this.

-Werror: The errors involving %m are PR4142. The others look like
legitimate warnings; they're asking for it by using -Werror :)
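
For ports that would rather avoid %m altogether: it is not an ISO C printf conversion (syslog() and some C libraries accept it as an extension), so a printf-style call can spell it out with strerror(). A minimal sketch, assuming the %m in question sits in a printf-style format string; `format_errno` is a hypothetical helper, not code from any of the failing ports:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<what>: <error text>" into buf using
 * only ISO C conversions. Roughly what
 *     snprintf(buf, len, "%s: %m", what);
 * does on C libraries that support the %m extension, except that the
 * error number is passed explicitly instead of being read from errno. */
int format_errno(char *buf, size_t len, const char *what, int err)
{
    assert(buf != NULL && len > 0 && what != NULL);
    return snprintf(buf, len, "%s: %s", what, strerror(err));
}
```

A caller would typically save errno right after the failing call and pass that saved value in.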

--signed-char: The ability to use a signed character type is available
to the AST, but it isn't exposed in a way that can be easily added as
a command-line option. I'm not entirely sure how we want to implement
this. Please file a bug to keep track of this.

--shared: I just fixed this.

-rpath/--no-undefined: These appear to be platform-specific options;
do we have a policy about those at the moment?

-I-: We're not intending to support this; please patch the packages.

-O4: You could patch the packages, although I'm slightly hesitant to
suggest that... anyone have any better suggestions?

Constant folder doesn't support strlen(): This is kind of nasty... I'm
strongly tempted to classify it into the category of "things we will
not support". Fixing it is really simple: change the relevant line to
'static char asn_str[sizeof("65535.65535")];'.
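
A small sketch of why the sizeof form works, assuming the original line sized the buffer from strlen() of the same literal: sizeof applied to a string literal is an integer constant expression and already counts the terminating NUL, so no constant folding of a library call is needed. `format_asn` below is a hypothetical function for illustration, not the libbgpdump code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Room for the longest value "65535.65535" (11 characters) plus the
 * terminating NUL, computed at compile time. */
static char asn_str[sizeof("65535.65535")];

/* Hypothetical formatter: writes the low 32 bits of an AS number in
 * "high.low" notation into the static buffer and returns it. */
const char *format_asn(unsigned long asn)
{
    unsigned hi = (unsigned)((asn >> 16) & 0xFFFFu);
    unsigned lo = (unsigned)(asn & 0xFFFFu);
    int n = snprintf(asn_str, sizeof asn_str, "%u.%u", hi, lo);

    /* The buffer is sized for the worst case, so this cannot truncate. */
    assert(n > 0 && (size_t)n < sizeof asn_str);
    return asn_str;
}
```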

Variable array size in structure: We're not intending to support this;
please patch the package.
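
One possible way to patch such code, sketched under the assumption that the structure only needs a run-time-sized trailing array: replace the variable-sized member with a C99 flexible array member and allocate the structure on the heap. The type and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* GNU C accepts, inside a function, something like
 *     struct { size_t count; int data[count]; } s;
 * A flexible array member is the ISO C99 alternative. */
struct int_vec {
    size_t count;
    int data[];          /* flexible array member, sized at allocation */
};

/* Allocates a vector with `count` zeroed elements. Returns NULL on
 * allocation failure or if the size computation would overflow. */
struct int_vec *int_vec_new(size_t count)
{
    struct int_vec *v;

    if (count > (SIZE_MAX - sizeof *v) / sizeof v->data[0])
        return NULL;
    v = calloc(1, sizeof *v + count * sizeof v->data[0]);
    if (v != NULL)
        v->count = count;
    return v;
}

/* Bounds-checked element setter. */
void int_vec_set(struct int_vec *v, size_t i, int value)
{
    assert(v != NULL && i < v->count);
    v->data[i] = value;
}
```

The caller owns the result and releases it with free(). Unlike the original construct, the storage no longer lives on the stack.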

Preprocessor varargs macros without ##: I'm not sure what the issue is
here... could you reduce it and file a bug?

noreturn: I think this got fixed recently.

PR4225: I have no clue how hard this is to fix.

PR4290: Not sure how to go about fixing this; I'll discuss it in the bug.

I also tried looking at a few of the Unknown logs, but was only able to figure out one:
argp-standalone-1.3.log: Looks like a gnu89 issue, although we don't
actually support the construct in question currently.

-Eli

Hi Eli,

I still have to look at the others. Little bit busy today.

Ah, that would be clang's fault. Fixed in r72727/r72728; you'll have
to rebuild Boehm for it to have an effect.

-Eli

Nested functions: We aren't going to have these anytime soon. We
don't have a bug on file; go ahead and file it if you feel like it,
though.

It's unclear to me that we really want to support nested functions. If implemented, they should be defaulted to off. I really don't like nested functions because they make error recovery for "missing } at the end of a function" completely broken (subsequent functions are parsed as nested functions, hilarity ensues).
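
For ports that need patching in the meantime, a nested function can usually be hoisted to file scope, with the locals it captured passed in explicitly. A minimal sketch with hypothetical names, not taken from any port:

```c
#include <assert.h>
#include <stddef.h>

/* With GCC's nested-function extension one might write:
 *
 *     int count_above(const int *v, size_t n, int limit) {
 *         int above(int x) { return x > limit; }   // captures `limit`
 *         ...
 *     }
 *
 * The portable version passes the captured state as a parameter. */
struct above_ctx {
    int limit;
};

static int above(const struct above_ctx *ctx, int x)
{
    return x > ctx->limit;
}

/* Counts the elements of v[0..n) that are greater than `limit`. */
int count_above(const int *v, size_t n, int limit)
{
    struct above_ctx ctx = { limit };
    size_t i;
    int count = 0;

    assert(v != NULL || n == 0);
    for (i = 0; i < n; i++)
        count += above(&ctx, v[i]);
    return count;
}
```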

__label__ support: This shouldn't be too hard to fix. This is PR3429.

Right.

-Werror: The errors involving %m are PR4142. The others look like
legitimate warnings; they're asking for it by using -Werror :)

Warnings should be convertible back into warnings with -Werror -Wno-error=foo

--signed-char: The ability to use a signed character type is available
to the AST, but it isn't exposed in a way that can be easily added as
a command-line option. I'm not entirely sure how we want to implement
this. Please file a bug to keep track of this.

Is this hard to implement? We should just support the same interface that GCC does. If you don't specify a signed-char flag, it defaults to what the target wants, if you do, the command line overrides it.

-rpath/--no-undefined: These appear to be platform-specific options;
do we have a policy about those at the moment?

No idea, Daniel?

-O4: You could patch the packages, although I'm slightly hesitant to
suggest that... anyone have any better suggestions?

This is tricky. I wonder if there is some magic that the clang driver can do for -O4 when the linker doesn't support LTO?

Constant folder doesn't support strlen(): This is kind of nasty... I'm
strongly tempted to classify it into the category of "things we will
not support". Fixing it is really simple: change the relevant line to
'static char asn_str[sizeof("65535.65535")];'.

Definitely shouldn't fix this. This is a great example of a slippery slope: if we implement this, next step is to implement all of "fold". :)

Thanks Eli!

-Chris

Yup. It builds now, but I suspect something in w3m miscompiles. It only
shows blank pages. I'll see if I can debug that one of these days.

Thanks!

-Werror: The errors involving %m are PR4142. The others look like
legitimate warnings; they're asking for it by using -Werror :)

Warnings should be convertible back into warnings with -Werror
-Wno-error=foo

Then I suppose it's a bug that the warnings in question ("if statement
has empty body" and "overflow converting case value to switch
condition type") don't have corresponding command-line options.

--signed-char: The ability to use a signed character type is available
to the AST, but it isn't exposed in a way that can be easily added as
a command-line option. I'm not entirely sure how we want to implement
this. Please file a bug to keep track of this.

Is this hard to implement? We should just support the same interface that
GCC does. If you don't specify a signed-char flag, it defaults to what the
target wants, if you do, the command line overrides it.

It's not that hard, but I'm not sure what the best way to do it is.
Should it be in LangOptions as a tri-state option which overrides the
target?
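
To make the tri-state idea concrete, here is a rough sketch in plain C. All names are hypothetical and this is not Clang's actual LangOptions. The flags mentioned in the comments are GCC's -fsigned-char and -funsigned-char:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state option. */
enum char_sign_option {
    CHAR_SIGN_DEFAULT,   /* no flag given: follow the target */
    CHAR_SIGN_SIGNED,    /* e.g. -fsigned-char */
    CHAR_SIGN_UNSIGNED   /* e.g. -funsigned-char */
};

/* Returns true if plain `char` should be treated as signed. An
 * explicit option overrides the target; otherwise the target decides. */
bool char_is_signed(enum char_sign_option opt, bool target_default_signed)
{
    switch (opt) {
    case CHAR_SIGN_SIGNED:
        return true;
    case CHAR_SIGN_UNSIGNED:
        return false;
    case CHAR_SIGN_DEFAULT:
        break;
    }
    /* Any value other than the three above is a programming error. */
    assert(opt == CHAR_SIGN_DEFAULT);
    return target_default_signed;
}
```

The point of the third state is that "no flag given" stays distinguishable from an explicit choice that happens to match the target's default.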

-rpath/--no-undefined: These appear to be platform-specific options;
do we have a policy about those at the moment?

No idea, Daniel?

-O4: You could patch the packages, although I'm slightly hesitant to
suggest that... anyone have any better suggestions?

This is tricky. I wonder if there is some magic that the clang driver can
do for -O4 when the linker doesn't support LTO?

Possible, but non-trivial: we'd have to be able to detect .bc files
(both archived and unarchived), extract and compile them separately,
then link them with all the other files. It's certainly doable,
though, if we properly control the toolchain (i.e. we're not using a
gcc-based toolchain).

-Eli