Small patches to allow fully independent clang/llvm/compiler-rt/libc++

I've sent these patches to some people privately to get their feedback
and now feel it's time to get a broader review.

#1 I really apologize if this stirs things up like the other libc++
thread (somewhat related)

clang-04.diff (2.66 KB)

clang-03.diff (1.24 KB)

clang-02.diff (1.75 KB)

clang-01.diff (2.06 KB)

I don't know if this approach is acceptable to upstream, but I'd like
to build "pure" clang packages without any system GNU dependency.

Hi Chris,

If I get it right, you're setting the defaults at CMake time, so that
you don't have to specify it in the command line for that particular
build.

If that's so, I like where this is going, but I'm just one voice.

These patches allow that by making it a compile time option, but
shouldn't impact any current defaults.

Some people may disagree on the principle, even if it doesn't affect
current builds. But I'll let them voice their concerns. I may be
biased towards build time parametrisation, so even my opinion is
weakly held.

cheers,
--renato

PS: Are you missing libc++abi intentionally? If not, may be good to add, too.

In regards to your comment at the top - exactly. I'm hoping others
don't mind allowing this to be an optional build option.

libc++abi - Not intentionally, but I'm unable to test it at the
moment. It should be trivial for someone else to add it.

Ok.

Some people may disagree on the principle, even if it doesn't affect
current builds. But I'll let them voice their concerns.

IMHO, the "correct" approach for a GNU-free/independent clang package requires
a new ToolChain class for LLVM-based toolchains. Maybe, this class could
inherit from the Linux ToolChain and use most of the existing infrastructure
for paths, options & tools invocations. Otherwise, we could use these
configuration-time options for most of the GNU-dependencies that we want to
replace. However, I think that we would be duplicating effort/functionality
at two different places (cmake & clang source code). Additionally, other than
compiler-rt, libcxx, libcxxabi, etc., we'd need a non-GLIBC/GNU C library
which might have a different set of CRT files with its own naming scheme for the
dynamic linker/loader.

Having said all that, I don't explicitly disagree with these set of patches.
They could offer a reasonable trade-off for the time being.

- Vasileios

Some people may disagree on the principle, even if it doesn't affect
current builds. But I'll let them voice their concerns.

IMHO, the "correct" approach for a GNU-free/independent clang package requires
a new ToolChain class for LLVM-based toolchains. Maybe, this class could
inherit from the Linux ToolChain and use most of the existing infrastructure
for paths, options & tools invocations. Otherwise, we could use these
configuration-time options for most of the GNU-dependencies that we want to
replace. However, I think that we would be duplicating effort/functionality
at two different places (cmake & clang source code). Additionally, other than
compiler-rt, libcxx, libcxxabi, etc., we'd need a non-GLIBC/GNU C library
which might have a different set of CRT files with its own naming scheme for the
dynamic linker/loader.

Something along these ^ lines is the "correct" way to do it. You'd have to add your own vendor to Triple.h, and then specialize these defaults based on seeing that particular vendor. Baking these decisions in at compile-time is an anti-pattern driver-wise.

If you do it this way, then you'll be able to write testcases for your changes, and those tests will get run by all the buildbots (not just one configured to have one particular set of defaults). This has the added benefit of not requiring more builders to test more configurations.

Please don't commit these patches as-is.

Jon

Actually..... I agree with you and I'm not sure if it's exactly what
you're referring to, but could be along those lines
https://github.com/pathscale/llvm-suite/

The above is mostly a convenience cmake "wrapper" or parent project
which helps glue everything together so that with a single cmake you
can build and configure the whole thing at once. Internally we handle
the CRT problem, but I'm not sure if the above does. (I'd have to
double check, but it would also probably require more clang driver
changes as well)

One of the things I personally like about the above is that it allows
the "just built" clang to then build the runtimes. That way if you
statically link clang - there's no dependency on the host system.
(Barring libc, but that's typically future compatible. I've tested
this all the way back to SLES10)

If the patches get pushed upstream I plan to start nightly builds of
GNU free packages across many platforms (Solaris, Win, OSX, Linux and
various targets) My goal is to help make it easier for people in the
community to test nightly packages instead of having to build it. (At
all users can or want to build the whole thing)

Lastly - with a bit of modification the above also allows you to do
native and cross compiler builds at the same time/same package. (Lets
call that phase 2)

Can you give precise feedback about what's wrong with the patches
(as-is)? I'd like to "fix" them. Is it philosophical or something
wrong?

Something along these ^ lines is the "correct" way to do it. You'd have to
add your own vendor to Triple.h, and then specialize these defaults based on
seeing that particular vendor. Baking these decisions in at compile-time is
an anti-pattern driver-wise.

Hum, I like this better!

Not that I like Triples, but I think this is the right way to go.
We'll probably find a lot of problems when integrating it with
distributions and systems, but it's a decision that makes sense from
this premature point of view.

If you do it this way, then you'll be able to write testcases for your
changes, and those tests will get run by all the buildbots (not just one
configured to have one particular set of defaults). This has the added
benefit of not requiring more builders to test more configurations.

Good point. We suffer from that problem in GCC validation.

--renato

Some people may disagree on the principle, even if it doesn't affect
current builds. But I'll let them voice their concerns.

IMHO, the "correct" approach for a GNU-free/independent clang package
requires
a new ToolChain class for LLVM-based toolchains. Maybe, this class could
inherit from the Linux ToolChain and use most of the existing
infrastructure
for paths, options & tools invocations. Otherwise, we could use these
configuration-time options for most of the GNU-dependencies that we want
to
replace. However, I think that we would be duplicating
effort/functionality
at two different places (cmake & clang source code). Additionally, other
than
compiler-rt, libcxx, libcxxabi, etc., we'd need a non-GLIBC/GNU C library
which might have a different set of CRT files with its own naming scheme
for the
dynamic linker/loader.

Something along these ^ lines is the "correct" way to do it. You'd have to
add your own vendor to Triple.h, and then specialize these defaults based on
seeing that particular vendor. Baking these decisions in at compile-time is
an anti-pattern driver-wise.

If you do it this way, then you'll be able to write testcases for your
changes, and those tests will get run by all the buildbots (not just one
configured to have one particular set of defaults). This has the added
benefit of not requiring more builders to test more configurations.

Please don't commit these patches as-is.

Can you give precise feedback about what's wrong with the patches
(as-is)? I'd like to "fix" them. Is it philosophical or something
wrong?

To re-phrase, I think the "spirit" behind clang-01.diff, clang-02.diff and clang-03.diff are contrary to the design goals of the driver. I think you ought to re-work them so that they don't touch CMakeLists.txt at all, that way all of these defaults are set at compiler runtime via the vendor part of the triple (or some flag(s)), not at compiler compiletime.

I'm not sure I'm the best person to review clang-04.diff, but at the very least it needs testcases.

Jon

The whole point of the patches is not to touch the default so "driver
and flag people" can have it their way, but non-flag people can have a
default. This doesn't hurt the driver and anyone using it would be
opt-in.

For 04 - can you give advice on how to make a testcase for it? I'm not
100% clear on how to test for a regression on this.

I've been told Musl (http://www.musl-libc.org/) works well as a glibc
replacement for OpenEmbedded on AArch64, and it even interoperates
better than expected with libc++.

Not sure about the other targets, but may be worth considering.

cheers,
--renato

I've been told Musl (http://www.musl-libc.org/) works well as a glibc
replacement for OpenEmbedded on AArch64, and it even interoperates
better than expected with libc++.

Not sure about the other targets, but may be worth considering.

I'm able to build a GNU-free toolchain with Musl for the new mips-mti-linux
triple (http://reviews.llvm.org/D13340). We are passing the LLVM test-suite
for static builds with MIPS32 O32 LE/BE. It should be working fine on other
targets too.

- Vasileios

If others help integrate that into the work as an optional component
it works for me. I don't have the bandwidth to QA a libc change in
addition to the other things.

I'm not there yet, either. I'm focusing on libc++ for now.

That's cool, but it depends on a triple flag. I really would like and
need something which doesn't depend on the user messing with compiler
flags.

Is the consensus around here really so extremely strong against a
build time option or just a few people who speak up?

Some people may disagree on the principle, even if it doesn't affect
current builds. But I'll let them voice their concerns.

IMHO, the "correct" approach for a GNU-free/independent clang package
requires
a new ToolChain class for LLVM-based toolchains. Maybe, this class could
inherit from the Linux ToolChain and use most of the existing
infrastructure
for paths, options & tools invocations. Otherwise, we could use these
configuration-time options for most of the GNU-dependencies that we want
to
replace. However, I think that we would be duplicating
effort/functionality
at two different places (cmake & clang source code). Additionally, other
than
compiler-rt, libcxx, libcxxabi, etc., we'd need a non-GLIBC/GNU C
library
which might have a different set of CRT files with its own naming scheme
for the
dynamic linker/loader.

Something along these ^ lines is the "correct" way to do it. You'd have
to
add your own vendor to Triple.h, and then specialize these defaults based
on
seeing that particular vendor. Baking these decisions in at compile-time
is
an anti-pattern driver-wise.

If you do it this way, then you'll be able to write testcases for your
changes, and those tests will get run by all the buildbots (not just one
configured to have one particular set of defaults). This has the added
benefit of not requiring more builders to test more configurations.

Please don't commit these patches as-is.

Can you give precise feedback about what's wrong with the patches
(as-is)? I'd like to "fix" them. Is it philosophical or something
wrong?

To re-phrase, I think the "spirit" behind clang-01.diff, clang-02.diff and
clang-03.diff are contrary to the design goals of the driver. I think you
ought to re-work them so that they don't touch CMakeLists.txt at all, that
way all of these defaults are set at compiler runtime via the vendor part of
the triple (or some flag(s)), not at compiler compiletime.

I'm not sure I'm the best person to review clang-04.diff, but at the very
least it needs testcases.

The whole point of the patches is not to touch the default so "driver
and flag people" can have it their way, but non-flag people can have a
default. This doesn't hurt the driver and anyone using it would be
opt-in.

I understand that's what you _want_ to do, but I still don't think that's the _right_ thing to do.

To name a specific example of undesirable effects of this series of patches, consider someone filing a bug report: in order to diagnose what's going on, now we'd need more than just the svn revision as printed in the version string... we'd need to know how the particular build was configured, which isn't something that the binary is able to tell us.

I'm pretty sure Eric (cc'd) would veto this patch for the same reason, but I'll let him state his own opinion on it.

For 04 - can you give advice on how to make a testcase for it? I'm not
100% clear on how to test for a regression on this.

Since this requires that the driver be in the sysroot, you'll need to create a mock sysroot in `clang/test/Driver/Inputs`, and then have the testcase copy that to %T, set up a symlink from %T/bin/clang to the driver binary itself, then invoke the driver via that symlink, perform the checks on the -cc1 string, and finally clean everything up. Something like `clang/test/Driver/mips-fsf.cpp` would be a good start for comparison on the checks themselves.

On another note, I'm a little surprised there isn't existing testsuite coverage for these particular lines.

Jon

There's lots of things which can impact the compiler beyond just the
flags. I can add flags (optimizations) to gcc and build a version of
clang which won't produce correct code. How would you diagnosis that?
(Similar scenario?) What happens if some rare bug slips in and or some
optimization is turned on and then that compiler builds a buggy libc++
(same?) I get what you're saying, but I don't think you can perfectly
guard against compile time decisions by adding more flags.

I don't know why gcc does it, but gcc -v will give you the configure
line which was used to build the compiler. Doing something similar
when these options are enabled would give the information someone
would need. I'm happy to add that change if it makes these patches
palatable.

Thanks for the tip on the test case. If the other 3 patches get
approval I'll add a test case for the 4th.

GCC does it because the back-ends were never meant to work on the same
code. That was a decision taken many years ago and the make flags is
just a consequence of that.

For the last 2 years Linaro was trying to make the GCC ARM and AArch64
back-ends work together, but so far, haven't succeeded. I don't want
to us to get into that situation.

I know that's not what you're proposing, but Jon's concerns about
testing and overall design are indeed compelling. And this opens up to
changes that will get us in a place that is not exactly like GCC, nor
current LLVM, which is probably worse. Moreover, the proposal of
having an LLVM toolchain actually solves most of the problems and is
indeed on par with the rest of the driver's design. I agree it's the
best solution.

Other bugs can happen, yes, but we don't need to add more uncertainty.
We have plenty already.

cheers,
--renato

*some* default is already being decided. you must use a flag to
override that... the flag to overwrite that default can still exist.
(I don't think the patch removes that??) The driver doesn't decide
*everything*. These arguments aren't logical. Adding an optional
compiler time flag can be benign and certainly doesn't cause a
negative impact on test or QA inherently.