unable to compile llvm with gcc 4.7.4

A suggestion is not a clear answer. How is such a decision taken? Is there a board of
people who vote to make the choice of minimal gcc (and clang) version
effective?
Because, currently, the LTO lib caching code (and certainly more, as Teresa
Johnson pointed out) would have to be patched.

If gcc 4.7 (the last C++-ish compiler bootstrappable in one step from a C
compiler) is phased out, then, to bootstrap llvm from a C compiler/runtime,
gcc 4.7.4 + gcc (version >= 4.8) will have to be set up first.

A suggestion is not a clear answer. How is such a decision taken? Is there a board of
people who vote to make the choice of minimal gcc (and clang) version
effective?

We don't have such a process, unfortunately. :slight_smile:

If gcc 4.7 (the last C++-ish compiler bootstrappable in one step from a C
compiler) is phased out, then, to bootstrap llvm from a C compiler/runtime,
gcc 4.7.4 + gcc (version >= 4.8) will have to be set up first.

I want to understand your constraints, as I think this is a unique
case that I wasn't considering.

I normally worry about what's available on systems, so that users can
"just compile" using the system compiler and libraries. That's why I
only worry about 4.8+, because that's what's available on most old,
stable systems.

But you seem to need a C compiler and bootstrap GCC 4.7. Is this a
full bootstrap? Including glibc and binutils? If so, how do you
control their versions?

More importantly, why can't you just use the GCC that comes with
distros, or why can't you just compile your toolchain once and use
everywhere?

After all, GCC has moved on from plain C because plain C wasn't that
important to most of its developers either.

So, this trend is not LLVM-specific, but it's aggravated in LLVM
because we like more shiny toys than the GCC devs. :slight_smile:

cheers,
--renato

To the best of my understanding - because we want to be able to bootstrap clang with the system compiler that ships with various linux and BSD distributions.
Windows has no equivalent concept.

I mean no offense to linux/BSD developers, but should we have a discussion (in another thread, perhaps) about whether it's reasonable to treat them specially in this regard?

Both macOS and Windows developers need to download compilers separately to be LLVM developers. Why shouldn’t linux/BSD developers?

Given that we ship prebuilt binaries for many distros, it seems like it's easy to get a new enough compiler. This way we won't be faced with the problem of old GCCs holding us back in the future.

Pete

Hi Peter,

We don’t ship proper toolchains for distros, we ship toys that worked on one developer’s machine, and that includes our binaries as well.

Distro validation is a huge deal, especially toolchain, and we’re in no way equipped to even try to do a bad job at that.

Unix is a different world than that of Windows and Mac, and it makes no sense to force the same requirements, either way.

I don’t opine on msvc or xcode topics or design decisions because I don’t know enough to have any reasonable opinion, and I most certainly won’t try to make them more like Unix.

These things are what they are for good reasons and I think we should just leave it at that.

Cheers,
Renato

PS: I don’t mean disrespect either, but we have this discussion every time someone mentions upgrading gcc.

Maybe we should write some documentation to avoid repeating the same arguments. :slight_smile:

UNIX has a long history of shipping a usable compiler. A system without
a compiler was considered crippled. It shouldn't be surprising that at
least one of the two systems comes from a company originally known for
selling BASIC...

Joerg

Hi Peter,

We don’t ship proper toolchains for distros, we ship toys that worked on one developer’s machine, and that includes our binaries as well.

Ok, that's fair. Having never actually downloaded one, I didn't know the state of it.

Out of interest, are newer toolchains available from apt-get and other similar package managers? The only part of supporting linux/BSD in this way I question is that we end up stuck with whatever shipped with the OS, not what may be available via their package managers. If the available packages are also old GCC versions then fair enough, but I just want to understand the ecosystem.

Distro validation is a huge deal, especially toolchain, and we’re in no way equipped to even try to do a bad job at that.

Unix is a different world than that of Windows and Mac, and it makes no sense to force the same requirements, either way.

I guess I already alluded to this, but if a package is available in the same way that a new Xcode/MSVC is available, then I'm not sure why we should treat it differently.

I don’t opine on msvc or xcode topics or design decisions because I don’t know enough to have any reasonable opinion, and I most certainly won’t try to make them more like Unix.

You should. We need plenty of voices to make sure we make the best decisions we can. Incidentally, I've built GCC in a past life on the PlayStation 2 Linux environment. I'm somewhat aware of the pain, although that was quite some time ago.

These things are what they are for good reasons and I think we should just leave it at that.

Cheers,
Renato

PS: I don’t mean disrespect either, but we have this discussion every time someone mentions upgrading gcc.

None taken.

Maybe we should write some documentation to avoid repeating the same arguments. :slight_smile:

SGTM. I'm willing to admit my almost complete ignorance in all things linux/BSD, so having it written down somewhere, to avoid the next person like me asking, would be good.

Cheers,
Pete

Well, many production Unix systems still have full program compilation as the package delivery system in this day and age. Many, including me, consider this a feature. :slight_smile:

Cheers,
Renato

Trying to use more than one GCC version at the same time is a major PITA.
Just imagine the fun when you want to use a system C++ library and you
have two different copies of libstdc++ pulled in. Heck, you can even get
funny issues with libgcc_s, because every other main GCC branch adds
something fancy to it. Often, you end up with two copies of a lot of
major libraries just so that you can get a consistent state. The
situation is a bit better with Clang, as it is easier to decouple libc++
and clang, but e.g. the proposed move to C++14 would still require a
libc++ update.

Joerg

Stable releases don’t upgrade the toolchain because the whole system is guaranteed to be stable under a single combination of the compiler, libraries, tools, headers, secondary libraries, etc.

Validation of a new toolchain means releasing a whole new distro, with old packages, but they’re busy validating new releases, too.

The GCC 5 ABI change is one recent event you could read about to understand what kinds of problems can happen.

Windows just duplicates all libraries and each app is independent. You can use different compilers, but you end up with a big list of dlls all over the place, and compatibility is never guaranteed.

How Steam has hacked Linux support is a good example of Windows style on Linux, and the quality you get is a good reason not to follow that path.

They’re just different models, with different constraints. If we were a distro, trying to change the status quo would be fun. We don’t have that bandwidth, so we just do what everyone else is doing. :slight_smile:

Cheers,
Renato

I went ahead and started up an RFC thread for bumping the min GCC version to 4.8. It seems like moving the minimum Clang version could be proposed separately.

Thanks,
Teresa

Thanks Teresa!

I think the point about having a bot building with the minimum version is more important than bumping the gcc version; that's why I suggested bundling the clang discussion.

Cheers,
Renato

It's my custom distro. My goal is to make it bootstrappable with tinycc (or any
little C compiler alternative) as a reasonable one-man job. With the removal of
gcc 4.7 support now official, I would need a 3-step bootstrap, adding a
modern gcc (which is guaranteed to compile with the ISO C++98-ish gcc 4.7.4, a
feature that clang cannot guarantee anymore).

I'm targeting llvm only for its AMDGPU backend (AMD Southern Islands GPU
architecture). By hacking the build system, I managed to compile the llvm libs
with gcc 4.7.4:
  - I removed libLTO.
  - I removed all tools except llvm-config.
As long as mesa compiles and runs fine, I don't care about the rest of llvm.

Hi Sylvain,

I have to say, after a while thinking about your use case, I cannot
come up with a better solution than a 3-stage build. :frowning:

Maybe you need to step back a bit and ask yourself: what would the
system changes be to adopt GCC 4.8 natively instead of tinycc?

What distributions do is compile the base GCC they'll use first,
making sure all the correct libraries, in all the correct versions, are
bundled in the right places, then use that toolchain for *everything*.

You seem comfortable enough building GCC 4.7, I assume as a side
package, like BSD ports. I'm also assuming you already need GCC (for
packages other than LLVM), so why not make GCC your system compiler?

The dependencies will already be there anyway, and I don't think GCC
4.8's libraries are much bigger than 4.7, so it does seem like an
overall gain.

Of course, it'll mean you'll have to test your packages with GCC 4.8,
but assuming they already use tinycc or GCC 4.7, I hope you'll have
very few additional problems.

Would any of that help?

cheers,
--renato

Hi,

The problem is modern C++. I can have a reasonable system bootstrapped
with tinycc (or an alternative C compiler), but only in the gcc world, since
a modern C++ compiler is only bootstrappable from nearly any C compiler
there. clang and llvm are unable to do it. That's why I would need to
get two gccs: "any C compiler" -> gcc 4.7.4 -> gcc recent_version ->
llvm.

The problem is modern c++.

I rather consider it a solution to the weirdness of C++98, but I see
your point. :slight_smile:

I can have a reasonable system bootstrapped
with tinycc (or an alternative C compiler), but only in the gcc world, since
a modern C++ compiler is only bootstrappable from nearly any C compiler
there. clang and llvm are unable to do it. That's why I would need to
get two gccs: "any C compiler" -> gcc 4.7.4 -> gcc recent_version ->
llvm.

I got it, and I understand your constraints. We have similar issues
when building GCC from scratch.

But for your distro, you only do that once per release, right?

Once you have your system compiler, then shipping it as a binary
package (or as part of the base installation) should make it trivial
for all users.

Even if your system's package manager is port-based (like
Gentoo/BSD/AUR), where you need to build packages from scratch, you'd
still need the system compiler to be already installed, right?

And I'm assuming that, once you picked a system compiler, you stay
with it for at least a few months, which makes the 4-stage
recompilation a pain, but not a critical issue.

Or am I assuming too much?

cheers,
--renato

I think this 3-stage bootstrap is just a fact of modern life, and of the progress of languages. In fact I think you're getting away lightly, and I'm amazed you could use only a 2-stage bootstrap from a very simple C compiler until now!!

The good news: it should be a very very long time before you need a 4 stage bootstrap :slight_smile:

Just for the interest of discussion, I find it completely weird and interesting that GCC needs to build itself 3 times to fully bootstrap. Has there been any interest in looking at a single compile build? I don’t exactly know the limitations, but my naive thinking is that C++14 compiler source parsed by C++14 capable compiler and codegen’d to C99 (or older) source should make it compilable by older compilers. Is this just a delusion or an actually useful idea?

Regards,

Kevin

Far from being an expert, my understanding is that this is largely due
to the libraries and tools.

GCC has a reduced sub-set of the compiler that works with many old
compilers, and they build that one first, then use that one to build
the required libraries, tools, and the complete compiler, then use the
complete compiler to bootstrap. You can also have cross-bootstrap, or
Canadian cross, which increase the complexity of the builds by a
reasonable margin.

LLVM doesn't do that because we rely on the system's libraries, which
honestly is a bad habit. This bad habit made the edges between RT,
libc++ and LLVM a bit rough, especially when cross compiling and
re-using those tools to bootstrap. It also makes it very hard to have
stable tests, especially in between "package upgrades".

It should be possible to bootstrap Clang in only two stages, but that
requires a lot of CMake magic if we want to get *all* components
built, including RT, libc++ and lld.

However, none of that mentions the C library, which is a whole new
problem if Clang can't compile it. I believe we still can't compile
the GNU C library, but we can compile Musl (at least for some
targets), so we could include Musl on such a two-stage magical
bootstrap...

But that's a lot of work... :slight_smile:

cheers,
--renato

> Just for the interest of discussion, I find it completely weird and
> interesting that GCC needs to build itself 3 times to fully bootstrap.
> Has there been any interest in looking at a single compile build?

(I know Renato did not write this, but I'll just answer it here.)
This is deliberate, and necessary if you want to be sure.
You can do a single compile build by just "not bootstrapping".

The reason bootstrapping is 3 stages is to find optimizer bugs.

stage 1: compile the new compiler
stage 2: compile self with the new compiler // i.e. detect any obvious bugs in the new compiler, like ICEs, etc.
stage 3: compile self with the stage 2 compiler // i.e. detect miscompiles caused by the new compiler being broken

Stage 2 and 3 should be identical; if they aren't, you have, at a minimum,
non-determinism, and more likely a codegen bug somewhere.

Otherwise, stage 2 could be very broken and you may not notice, because the
compiler has relatively few compile + execute tests (since they are very
hard to write). Instead, they rely on the one large execution test they
know they can use: the compiler itself.

Note that 3 stage bootstraps are a technique that predates gcc :slight_smile:

> I don't exactly know the limitations, but my naive thinking is that C++14
> compiler source parsed by C++14 capable compiler and codegen'd to C99 (or
> older) source should make it compilable by older compilers. Is this just a
> delusion or an actually useful idea?

> Far from being an expert, my understanding is that this is largely due
> to the libraries and tools.
>
> GCC has a reduced sub-set of the compiler that works with many old
> compilers, and they build that one first, then use that one to build
> the required libraries, tools, and the complete compiler, then use the
> complete compiler to bootstrap.

This is not correct :slight_smile:

The first stage of gcc is the entire compiler, not a subset or a different
compiler.

> You can also have cross-bootstrap, or Canadian cross, which increase the
> complexity of the builds by a reasonable margin.

"any C compiler" must have been bootstrapped some way as well. The
only compiler that doesn't need bootstrapping is one written in
machine code. That is, one compiles a statically-linked version of gcc
and copies it over to the new machine, assuming its ISA is
backwards-compatible.

If a modern compiler is required on the target machine, why not
cross-compile it? There is (or should be) no reason why another
platform cannot generate the same binary result.

Michael