Small patches to allow fully independent clang/llvm/compiler-rt/libc++

I understand that's what you _want_ to do, but I still don't think that's
the _right_ thing to do.

To name a specific example of undesirable effects of this series of patches,
consider someone filing a bug report: in order to diagnose what's going on,
now we'd need more than just the svn revision as printed in the version
string... we'd need to know how the particular build was configured, which
isn't something that the binary is able to tell us.

There's lots of things which can impact the compiler beyond just the
flags. I can add flags (optimizations) to gcc and build a version of
clang which won't produce correct code. How would you diagnosis that?

Passing the testsuite, and LNT is a fairly good indication that you've got a working compiler.... that's what they're for.

(Similar scenario?) What happens if some rare bug slips in and or some
optimization is turned on and then that compiler builds a buggy libc++

libc++ has a testsuite, and passing that testsuite should give some confidence that both the library and the compiler used to build it are doing the right thing.

(same?) I get what you're saying, but I don't think you can perfectly
guard against compile time decisions by adding more flags.

That being said, both those examples are red herrings; hard-to-detect bugs in the bootstrap compiler aren't justification for making such design decisions in Clang, especially when a viable alternative has been proposed.

I don't know why gcc does it, but gcc -v will give you the configure
line which was used to build the compiler. Doing something similar
when these options are enabled would give the information someone
would need. I'm happy to add that change if it makes these patches
palatable.

Renato gives a good explanation for why gcc was architected this way. Clang, on the other hand, was designed to be a universal driver. Not having these kinds of configuration flags baked in at compile time is part of Clang's design philosophy, and I've already explained why.

Thanks for the tip on the test case.

You're welcome.

If the other 3 patches get approval I'll add a test case for the 4th.

In case it wasn't clear, I'm actively objecting to the first three patches, and deferring the 4th for someone else to review.

Jon

Adding an optional compiler time flag can be benign and certainly
doesn't cause a negative impact on test or QA inherently.

This is plainly not true. Every configure-time flag increases the
testing burden by a factor of 2, whereas runtime flags and/or
vendor-specialized triples take on the order of miliseconds to test.

Yes, that's true. But here's the problem: In itself, it's true that
you can do it that way. GCC is proof that is not only doable, but
works "well" for their use case.

But Clang's design, and LLVM overall is that the back-ends are not
special, nor should have a direct relationship with any specific
front-end. Another core design is that LLVM is a library, and as such,
should be used where possible to avoid code duplication and wrong
decision making.

In *many* cases, this is broken in Clang's driver. The fact that Clang
has a different mechanism to infer about the targets is the worse of
them, IMO. This is what changes to the Triple, parsers, Tuple are
trying to solve, and they are orthogonal to this issue. The second
worse (in my book), is that the mechanism for choosing libraries is
completely broken. Choosing the default doesn't fix that, just covers
it up, but lets have it as a poor man's fix for a complex problem.

There are two proposals, and the cut is not perfectly clear, but good
enough for me. I explain:

1. Make Time Selection: Stick some CMake magic like GCC and simplify
the default-selecting mechanism.
  - Pros: very simple to implement, very efficient in LOCs vs.
functionality, low impact on existing builds
  - Cons: need to duplicate the testing matrix (I don't want to break
your stuff), steers away from one of Clang's core design decisions
(see above)

2. Run Time Selection: Create new toolchains, select it with a new triple:
  - Pros: follows one of Clang's core design decisions (see above),
allows us to have multiple toolchain decisions based on run-time
arguments (triple), additive testing
  - Cons: it's not simple nor cheap to implement, adds non-existing
triples which may play badly with some environments, breaks
compatibility with GCC

The strongest factor here is the core design of Clang, which gives a
huge start for option 2. The other smaller points may have different
importances, but all of them flatten next to the design argument.
Being them of similar number on both sides, I think they're not strong
enough to move us strongly towards either direction, thus, the
solution 2 is a clear winner.

Makes sense?

cheers,
--renato

PS: To answer the specific question about test duplication: if we add
a triple, we only need to duplicate the specific tests. If you have to
build different compilers, you'll end up running the whole check-all,
test-suite, benchmarks, most of what will be identical.

* Renato Golin via cfe-dev <cfe-dev@lists.llvm.org> [2015-10-14 17:20:54 +0100]:

This looks like a bug in the driver? Can you create a bug with the
behaviour you see and what would be the expected behaviour?

cheers,
--renato

This seems to be a very similar discussion as was just had about triples.

Before, we discussed that Debian has setup their GCC to default to -march=i586 (and is now planning to switch to i686) for the “i386-linux-gnu” triple. And even more complicated examples for ARM, and other systems.

Here, we have the same exact thing, only a distributor wants to have clang default to -rtlib=compiler-rt.

IMO, the argument that doing this sort of thing is evil and shouldn’t ever be supported, and that there cannot be any distributor customization, is a dead-end. Distributors need the ability to customize this stuff in existing triples – somehow. The same triples REALLY DO have different defaults on different systems…and even on different variants of the “same” system. We don’t want distributors to have to patch the source code to achieve that – that’s worst of all options.

So to me, the real question is not whether this is needed, but how do we let people do it in a supportable/testable way?

I certainly agree with all the statements that compile-time configuration of this should be very much discouraged.

How about, instead, having the clang driver read a config file? The clang driver could default to reading a location within its install directory (overridable via command-line option for testing or other such purposes). The config file might contain something like a list of match criteria for the triple, and the commandline args to add. Of course, clang can easily report the arguments that have been added from the config file in any diagnostic dump, to let anyone else reproduce someone’s compilation.

E.g. lib/clang/$VERSION/driver-config.txt might contain something like this:

DefaultTriple: i386-linux-gnu
AddFlags: i386–linux-* -march=i686
AddFlags: –linux- --rtlib=compiler-rt --stdlib=libc++

This does add a gotcha for anyone building their own clang from source on an existing system: they will need to copy the config file over or pass its path on the command-line to get the same behavior as the system compiler. That is certainly a downside, but to me it seems a worthwhile idea anyways.

Of course, this sort of thing starts to feel like a GCC spec file, and those are totally craaaaaazy! But, since we don’t need to make this be a super-general system for configuring the behavior of EVERYTHING including the way internal bits and pieces work together, I’d hope it’s possible to get away with a much simpler config system like the above, versus the much greater complexity (and power) provided for in a gcc spec file.

This seems to be a very similar discussion as was just had about triples.

sigh :slight_smile:

Before, we discussed that Debian has setup their GCC to default to -march=i586 (and is now planning to switch to i686) for the “i386-linux-gnu” triple. And even more complicated examples for ARM, and other systems.

Here, we have the same exact thing, only a distributor wants to have clang default to -rtlib=compiler-rt.

Which is done by code changes at the moment, see darwin. (This is not an endorsement, just a statement.)

IMO, the argument that doing this sort of thing is evil and shouldn’t ever be supported, and that there cannot be any distributor customization, is a dead-end. Distributors need the ability to customize this stuff in existing triples – somehow. The same triples REALLY DO have different defaults on different systems…and even on different variants of the “same” system. We don’t want distributors to have to patch the source code to achieve that – that’s worst of all options.

I hate saying this, but I could use some examples here. This is one of the places that Daniel and I disagreed I think, or at least things like “I have command line options” here makes sense.

So to me, the real question is not whether this is needed, but how do we let people do it in a supportable/testable way?

Possibly. I can still see people wanting to set defaults that are more than the triple, but not different than the triple.

Example (using gcc build parlance for now):

configure --target=i386-pc-linux-gnu

ok, so this is pretty lame. It doesn’t even have cmov instructions right? So let’s add a -march= flag to all of our compiles.

That said, it’s pretty simple just use configure --target=i586-pc-linux-gnu to solve this, or (for gcc) to use --with-arch which, effectively, does the same thing.

For clang though we don’t have the idea that the compiler is only targeting one thing and so it’s both easier and a little more complicated.

We don’t have to worry about compile time configuration, but making sure the compile time options are correct for the platform and handled well is pretty specific. A specific triple for the baseline stuff like arch and environment will work like it does above, however, additional command line options don’t so much.

More below.

I certainly agree with all the statements that compile-time configuration of this should be very much discouraged.

Yes. I am absolutely and unequivocally against this.

How about, instead, having the clang driver read a config file? The clang driver could default to reading a location within its install directory (overridable via command-line option for testing or other such purposes). The config file might contain something like a list of match criteria for the triple, and the commandline args to add. Of course, clang can easily report the arguments that have been added from the config file in any diagnostic dump, to let anyone else reproduce someone’s compilation.

E.g. lib/clang/$VERSION/driver-config.txt might contain something like this:

DefaultTriple: i386-linux-gnu
AddFlags: i386–linux-* -march=i686
AddFlags: –linux- --rtlib=compiler-rt --stdlib=libc++

I’m not a huge fan of this design, but I think Renato has something that on first glance seemed reasonable. See the message to (I think) llvm-dev or cfe-dev that was sent earlier this week and it has a link to Renato’s idea here.

This does add a gotcha for anyone building their own clang from source on an existing system: they will need to copy the config file over or pass its path on the command-line to get the same behavior as the system compiler. That is certainly a downside, but to me it seems a worthwhile idea anyways.

I misread this the first time, but if you’re talking about something like:

clang -spec=/path/to/blah

we’ve talked about that in the past as not being terrible.

Of course, this sort of thing starts to feel like a GCC spec file, and those are totally craaaaaazy! But, since we don’t need to make this be a super-general system for configuring the behavior of EVERYTHING including the way internal bits and pieces work together, I’d hope it’s possible to get away with a much simpler config system like the above, versus the much greater complexity (and power) provided for in a gcc spec file.

As someone who had to hack on spec files a lot I’d prefer not to go the full route for sure. In particular a feature I don’t want to support is command line translation at the moment.

-eric

This seems to be the preferred solution for everybody that has to deal
with these problems, and we have already discussed this extensively.
The problem is implementing it without disrupting everything else.

It's not impossible, but it requires a long streak of time and effort
to do so, and this is not high enough on people's lists, even those
that deal with it daily, to make the cut.

There's another thread in cfe-dev about the "Clang Configuration
Manager" that has some beginnings, and there are people interested in
doing it. I think it's a start.

About the rest, I agree with Eric's points. :slight_smile:

cheers,
-renato

How about, instead, having the clang driver read a config file?

This seems to be the preferred solution for everybody that has to deal
with these problems, and we have already discussed this extensively.
The problem is implementing it without disrupting everything else.

It’s not impossible, but it requires a long streak of time and effort
to do so, and this is not high enough on people’s lists, even those
that deal with it daily, to make the cut.

There’s another thread in cfe-dev about the “Clang Configuration
Manager” that has some beginnings, and there are people interested in
doing it. I think it’s a start.

Thanks for the subject line pointer, this is the thread I meant. I need to review the proposal a bit more, but on the surface it seems not-terrible :slight_smile:

About the rest, I agree with Eric’s points. :slight_smile:

\o/

-eric

So let me make sure I understand everyone against my patch…

#1 You are furiously against supporting clang which uses compiler-rt, libc++ etc by default? Compiler-rt and the ability to build a non-gnu toolchain by DEFAULT is such a worthless idea that it has no value at all, right?

#2 we love flags and can’t get enough of them…

* Renato Golin <renato.golin@linaro.org> [2015-10-14 20:17:23 +0100]:

I’d missed that thread, sorry! It’s great that there’s some agreement, at least! :slight_smile:

I think the key to getting something accomplished here is to keep it simple: clang has command-line options already. If the configuration file can be purely a way to get extra args included by default for certain targets, it can be a really straight-forward change to make.

It seems to me mainly a question of defining exactly what conditions can be predicated on when choosing whether to add some args.

Thank you!

--renato

Precisely. That's why I want it to be an external Python script for
now, so we can grow with it, and maybe throw it away if we were
completely wrong.

We should only change the driver if the Python script ends up being
most excellent.

cheers,
--renato

#1 You are furiously against supporting clang which uses compiler-rt, libc++ etc by default? Compiler-rt and the ability to build a non-gnu toolchain by DEFAULT is such a worthless idea that it has no value at all, right?

Please avoid doing this. You are intentionally misconstruing what I said in a way to be inflammatory.

I also already support a clang that uses compiler-rt and libc++ by default. This is the default case of the darwin targeting side of clang. Doing this in a way such that generic distributors can specify things is harder versus just changing the defaults for a toolchain.

As far as your patches, there’s some support for the top level for doing a similar thing. See the LLVM_DEFAULT_TARGET_TRIPLE bits. That involves resetting what it thinks the current host is to something else that then gets normalized, etc. It’s a bit of work and honestly not the way I’d like to go about it. I wasn’t hugely in favor of the patch in the first place, but it was somewhat useful from a compatibility argument for existing build systems where they’re dependent on things like mips-elf-gcc being the compiler and the way it’s invoked.

For what sounds to be an OS that has these as set defaults I think we should enable this set of defaults in the driver (ala what Jonathan and Vasileios said) and cross compiling to that OS should be as simple as using the appropriate target yes? For native code it would work the same way. I’m curious what issues you’re seeing with this sort of approach? What use cases you’re seeing?

#2 we love flags and can’t get enough of them…

I’m not sure what you mean here.


I don’t know why anyone even talks about QA or runtime options.

Because they’re important? I’m not sure what your objection here is either.

My patch does NOT or should not introduce new stuff. If your bot doesn’t build compiler-rt today… well it won’t be enabled tomorrow… I’m not proposing to remove flags or driver capability… I’m proposing to allow a compile time option to set defaults…

Please stick to discussing this very specific issue. ‎Apologies if my tone is off

We’ve been doing nothing but talking about your issue, the design, and the code the entire time.

-eric

So let me make sure I understand everyone against my patch..

#1 You are furiously against supporting clang which uses
compiler-rt, libc++ etc by default? Compiler-rt and the ability to
build a non-gnu toolchain by DEFAULT is such a worthless idea that it
has no value at all, right?

No. We're not against having compiler-rt as the default. What we are
against is _how_ these particular patches achieve that.

#2 we love flags and can't get enough of them.. ---- I don't know
why anyone even talks about QA or runtime options.

We're talking about it because QA burden is a legitimate concern, and runtime options are one possible solution.

To make that latter point more concrete, I'm suggesting that you rework your patches so that you use triples like: i686-pathscale-linux-gnu, and then you can set the default runtime lib based on the vendor part of that triple, i.e. 'pathscale'.

My patch does NOT or should not introduce new stuff.

By definition, it does.

If your bot doesn't build compiler-rt today.. well it won't be
enabled tomorrow.. I'm not proposing to remove flags or driver
capability..

Ok. That's tangential though, and doesn't address any of the concerns we've raised about this patch set.

I'm proposing to allow a compile time option to set defaults..

Please stick to discussing this very specific issue. ‎Apologies if
my tone is off

It might help to remember that these criticisms of this patch set are not criticisms of you, nor your character, nor your goals. We're discussing technical merits here, and nobody is "out to get" anyone else... so let's keep it friendly! :slight_smile:

Jon

Small clarification - trying to get this upstream is entirely
altruistic. (PathScale as a product or company doesn't care or need
these upstream at all). I'm trying to do this because I see lots of
people having a difficult time testing OpenMP and other things from
time to time. Some of that work really depends on testing latest ToT
and having packages would make that a lot easier. I'd be really happy
if they were based off 100% open source and I'm not maintaining a
patchset on top of it.

Apologies for not replying to the appropriate sub-threads. Outlook's conversation view is glitching badly on this thread and I can't tell where the tips of the conversation are.

James Y Knight:

IMO, the argument that doing this sort of thing is evil and shouldn't ever be supported, and that there cannot be any distributor customization, is a dead-end.
Distributors *need* the ability to customize this stuff in existing triples -- somehow. The same triples REALLY DO have different defaults on different systems
...and even on different variants of the "same" system. We don't want distributors to have to patch the source code to achieve that -- that's worst of all options.

+1. We've got to deal with this somehow and I don't think we can solve this without compromising our ideals to some degree. I want to sacrifice as little of our one-build-to-support-everything ideal as possible while still being able to solve the problems.

James Y Knight:

So to me, the real question is not whether this is needed, but how do we let people do it in a supportable/testable way?

Absolutely. In the original form of the TargetTuple work, I was trying to achieve this by mapping ambiguous/inconsistent triples to genuinely unique names we can use for support/test/debug. Then if for example, Fedora had a problem we could ask for the tuple and use it to debug on Windows/OS X/Linux/etc. The key to this is that only the triple->tuple mapping is affected by configure-time/config-file differences. Everything else is compiled in

I felt the TargetTuple plan was a good compromise between the ideal world we want (and thought we had) and the flawed world clang lives in. However, run time config files solve the problem too and don't seem to have the same opposition as the configure-time mapping.

Eric Christopher:

I hate saying this, but I could use some examples here. This is one of the places that Daniel and I disagreed I think, or at least things like "I have command line options" here makes sense.

There's several in the trouble with triples thread but to re-iterate a couple here:
* mips-linux-gnu is MIPS-II on Debian and MIPS32R2 on Fedora
* i386-linux-gnu is i486 or i586 on different versions of Debian and I now learn that moving to i686 is on the cards.
* mips-mti-linux-gnu has completely different sysroot layouts in the 2015.06-05 gcc toolchain (which we need for libraries) compared to 2015.01-7.

Renato:

> How about, instead, having the clang driver read a config file?

This seems to be the preferred solution for everybody that has to deal
with these problems, and we have already discussed this extensively.
The problem is implementing it without disrupting everything else.

It's not impossible, but it requires a long streak of time and effort
to do so, and this is not high enough on people's lists, even those
that deal with it daily, to make the cut.

Getting a solution to this problem is very high on my list so I don't mind doing the work. My problems in this area are already serious and are going to get significantly worse in the near future. However, I need to deal with the backend infrastructure first since interpreting the triple (and command line arguments) correctly doesn't help me if I can't deliver that information where it's needed.

I haven't read the Clang Configuration Manager thread yet. I'll go and read that.

I haven't read the Clang Configuration Manager thread yet. I'll go and read
that.

I can see the overlap between this and that. If I understand correctly, it's trying to take a JIT's ability to know exactly what the target hardware is and use that to drive normal compilation. For my case, I want the target to be what the distribution wants it to be (which is often a subset of the actual hardware) but both cases require the same ability to modify the target based on a configuration.

It's actually both. And safe by being an external prototype first.

The first stage is to understand the host/target machines and create a
description (CFLAGS). The second stage is to turn that into a text
database (JSON) so that people can add their own by hand *in addition*
to automatically detect it.

So, in your case, if Clang would understand such database, you could
create fedora-mips-linux-gnu and debian-mips-linux-gnu configuration
*names* (not triples) and use that to set all the flags that make
sense for each.

I believe we have to try very hard to keep compatibility with CFLAGS
even after Clang understands the database because of other compilers.
Once we have meaning inside the database that cannot be represented by
flags, people will be locked in, and I want to avoid that at all
costs. Even if the flags are ugly, they have to be standard. The
ugliness will be hidden by the database anyway.

So, a proposed format would be something like:

[
  { "name": "fedora-mips-linux-gnu",
    "cflags": "-march=mips -mcpu=whatever --blah",
    "arch": "arch",
    "cpu": "cpu",
    "endian": "blah"
  },
]

So, we always keep the cflags / cxxflags, even when all the other
parameters made it redundant. The tool also has to be able to produce
a CFLAGS from the other parameters and update the database. This way,
it would be ludicrously simple to write a script for GCC to use this
without asking them to implement anything on their front-end. We can
also easily add more cflags just for gcc, icc, pcc, armcc, etc.

cheers,
--renato