LLD to be the default linker in Clang

Folks,

I'm creating a bootstrap buildbot on AArch64 with LLD and I just
realised the "accepted" way to make clang call lld is to "symlink lld
-> ld". I understand that's how every Linux system "chooses" the
linker, but that makes deployment and validation quite cumbersome on
GNU systems.

I'd like to suggest a change in behaviour:

// Some flag like --linker=<full path / bin on path>
if (LinkerFlag) {
  linker = Flag (linker);

// triple != host
} else if (CROSS_COMPILE) {
  if (LLDSupports(triple))
    linker = Find (LLD);
  if (!linker)
    linker = Find (triple-ld);
  if (!linker)
    ERROR; // *NOT* the system linker!

// triple = host
} else {
  linker = Find (LLD);
  if (!linker)
    linker = Find (SYSLD); // OS-specific
  if (!linker)
    ERROR; // We tried!
}
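
To make the search order above concrete, here is a minimal Python sketch of the same decision tree. It is only an illustration of the proposal, not how the Clang driver is implemented: `choose_linker`, `LLD_TARGETS` and the `find` hook are hypothetical names, and the set of targets LLD is assumed to support is made up for the example.

```python
import shutil
from typing import Callable, Optional

# Hypothetical: architectures on which LLD is assumed to be usable.
LLD_TARGETS = {"x86_64", "aarch64"}


def choose_linker(host: str, target: str, flag: Optional[str] = None,
                  find: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Pick a linker following the proposed order; raise if none is found."""
    if flag:
        # An explicit override always wins (name on PATH or full path).
        return find(flag) or flag

    if target != host:
        # Cross: LLD if it supports the triple, then <triple>-ld.
        # The system linker is deliberately never tried here.
        arch = target.split("-")[0]
        linker = find("ld.lld") if arch in LLD_TARGETS else None
        linker = linker or find(f"{target}-ld")
    else:
        # Native: LLD first, then whatever the system calls "ld".
        linker = find("ld.lld") or find("ld")

    if not linker:
        raise FileNotFoundError(f"no usable linker found for {target}")
    return linker
```

Passing a fake `find` (for example a dictionary's `.get`) makes it easy to check each branch without touching the real `$PATH`.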

  Rationale

My reason is that, if you just build Clang, with or without LLD,
everything works out of the box as you expect: Former uses LLD, latter
uses the system's linker. If LLD is able to cross-compile to the
target triple, and it's available, try that. Cross compilers should
never use the system linker, as that's just silly.

However, if you didn't build Clang or LLD and still want to force the
linker (cross when clang gets it wrong, lld installed somewhere else,
some non-sysroot alternative, ld when you have built lld), you'll need
a flag. It doesn't really matter if GCC will ever use LLD, but it
would be good to have Clang be able to specify which linker to use.

We already have library flags, and we don't need an assembler flag, so
the linker seems like the last option missing.

  Use Case

For example, it's perfectly reasonable to have GCC and Clang on the
same system and to have LD and LLD installed / accessible. It's also
perfectly reasonable to have GCC using LD and Clang using LLD on the
same system. Today, that's not possible without changing the path for
Clang and not GCC (cumbersome, to say the least).

The environment above is *exactly* that of any buildbot trying to
bootstrap Clang+LLD using GCC+LD. I want to have at least one for
AArch64 and one for ARM, but it would be good to have the same thing
for x86_64 too, at the very least.

I don't know much about FreeBSD, but they're moving to LLD as the
official linker on multiple platforms and they still have GCC/LD in
ports. There will probably be corner cases...

  Conclusion

I think LLD is mature enough to be preferred over LD on the platforms
it supports, if available.

Since it's not available by default in most of them, its existence
means intention.

Once it becomes available, having it means you should really use it.

Looks like a no-brainer to me. Am I missing something?

cheers,
--renato

> I'm creating a bootstrap buildbot on AArch64 with LLD and I just
> realised the "accepted" way to make clang call lld is to "symlink lld
> -> ld". I understand that's how every Linux system "chooses" the
> linker, but that makes deployment and validation quite cumbersome on
> GNU systems.

There's also -fuse-ld=

That's how I usually do it.

Right, that gets rid of the override flag. Thanks! :slight_smile:

But the arguments about the default and the cross-compilation error still stand.

cheers,
--renato

It should be possible to search the path for ld.lld first, then ld,
but to me it seems like it will just be more confusing. Also AFAIK
the -fuse-ld flag has no way to specify plain ld, although we could
allow -fuse-ld= to specify that.

Clang's current approach (ld from $PATH by default, or ld.arg from
-fuse-ld=arg) suits us in FreeBSD. Today we have:
    /usr/bin/ld.bfd (GNU ld 2.17.50)
    /usr/bin/ld (ld.bfd symlink)

We'll soon add
    /usr/bin/ld.lld
and in fact we'll want the default to remain /usr/bin/ld until we're
confident lld is a suitable replacement for all cases. At that point
we'll change the ld symlink to ld.lld.

Installing the binutils port adds:

/usr/local/bin/ld.bfd
/usr/local/bin/ld.gold
/usr/local/bin/ld (/usr/local/bin/ld.bfd hardlink)

The only downside of -fuse-ld= that I'm aware of affects GCC only, not
Clang: GCC accepts only -fuse-ld=bfd and -fuse-ld=gold, and Davide's
patch to add -fuse-ld=lld was met with hostility. But that shouldn't
have any effect on Clang.
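
As an illustration of the lookup rule described at the top of this message (plain `ld` by default, `ld.arg` for `-fuse-ld=arg`), here is a small Python sketch. The helper names `linker_name` and `resolve` are hypothetical, and the sketch only models the name mapping as stated here, not any additional logic the driver may have.

```python
import shutil
from typing import Optional


def linker_name(fuse_ld: Optional[str] = None) -> str:
    """Map a -fuse-ld= value to the program name searched on $PATH."""
    return f"ld.{fuse_ld}" if fuse_ld else "ld"


def resolve(fuse_ld: Optional[str] = None,
            path: Optional[str] = None) -> Optional[str]:
    """Return the full path of the selected linker, or None if absent."""
    return shutil.which(linker_name(fuse_ld), path=path)
```

With the FreeBSD layout listed above, `resolve("bfd", path="/usr/bin")` would look for `/usr/bin/ld.bfd`, and `resolve(path="/usr/bin")` for `/usr/bin/ld`.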

> It should be possible to search the path for ld.lld first, then ld,
> but to me it seems like it will just be more confusing.

Hum, for me it would be less confusing. :slight_smile:

GCC uses bfd by default, LLVM uses LLD. If you want to change, use -fuse-ld.

What would be confusing in this scenario?

> Clang's current approach (ld from $PATH by default, or ld.arg from
> -fuse-ld=arg) suits us in FreeBSD. Today we have:
>     /usr/bin/ld.bfd (GNU ld 2.17.50)
>     /usr/bin/ld (ld.bfd symlink)
>
> We'll soon add
>     /usr/bin/ld.lld

This seems like a simple enough approach. I guess my problem is more
about building toolchains than shipping them.

> The only downside of -fuse-ld= that I'm aware of affects GCC only, not
> Clang: GCC accepts only -fuse-ld=bfd and -fuse-ld=gold, and Davide's
> patch to add -fuse-ld=lld was met with hostility. But that shouldn't
> have any effect on Clang.

Just had a look at the thread. Some silly comments were expected, but
I think Davide has withdrawn his patch too early.

AFAIK, there's absolutely nothing in the GCC license, motto and
copyright statements that forbids the usage of a non-GNU linker, and
the argument "lld has bugs" is pointless.

I would try again.

cheers,
--renato

>> There's also -fuse-ld=
>>
>> That's how I usually do it.
>
> Right, that gets rid of the override flag. Thanks! :slight_smile:
>
> But the arguments about the default and the cross-compilation error
> still stand.

> It should be possible to search the path for ld.lld first, then ld,
> but to me it seems like it will just be more confusing. Also AFAIK
> the -fuse-ld flag has no way to specify plain ld, although we could
> allow -fuse-ld= to specify that.

IIUC, you can pass a full path to -fuse-ld, so -fuse-ld=`which ld` should
specify the default ld.

> Clang's current approach (ld from $PATH by default, or ld.arg from

>> It should be possible to search the path for ld.lld first, then ld,
>> but to me it seems like it will just be more confusing.
>
> Hum, for me it would be less confusing. :slight_smile:
>
> GCC uses bfd by default, LLVM uses LLD. If you want to change, use
> -fuse-ld.
>
> What would be confusing in this scenario?

In practice, I think most people will read your proposal as a proposal
to dogfood LLD more widely in LLVM by making it the default linker.
I've been using LLD as my default for a long time (probably a year) and
have seen no problems, so it should be doable. Are people okay with
that? Probably; as long as it works, no one would really care, maybe?

Dog-feeding and keeping linkers honest. Yes.

Just like Clang made C++ compilers honest and IAS made assemblers
honest. It's about time we do that to linkers.

With the integrated assembler, as soon as it was good enough on one
arch, we turned it on by default, asked people to report bugs, and
told them to use "-no-integrated-as" in their flags in the meantime.

As we migrated the ARM back-end to use IAS, we decided to make it the
default even though a few programs didn't compile entirely (mostly due
to GNU extensions). We still have some programs that don't compile
with IAS, and we're working on the bugs that we see in bugzilla, but
they're by far the exception. Most of the remaining cases were fixed
in the user's code.

The AArch64 port is finishing its stage 1, aka. bootstrap. The
buildbot I'm setting up is to prove it works as well as make sure we
don't regress. From there, test-suite, then chromium and firefox are
the next targets.

The ARM port is work in progress, but will follow a similar path, and
once it's working well, we'll want to make it the default in Clang.

So, we need to make sure the "does it work" flag is correct. There are
two modes, native and cross, and they can have different values at
different times. We'll have to validate them separately.

To begin with, I would make LLD the default on native x86_64 only,
because that's what gets tested the most nowadays. As soon as AArch64
bootstraps and passes the test-suite, we can do the same for native,
but not cross. Same for ARM.

The good news is that when you "apt-get install clang", you don't get
LLD. So, unless people are installing LLD in their systems explicitly,
or are building LLD from sources with Clang, they will *not* use LLD.

The only confusion, maybe, is in FreeBSD if they have all linkers
installed but still don't want to use LLD as the default for Clang. It
should be trivial to change that with a one-line patch, in the Clang
driver, though. At least until they have validated enough.

cheers,
--renato

I did not realize LLD was already far enough along to use. I have a related question: What about using LLD via library API?

I would love to link against LLD and call API functions instead of trying to find the system linker and spawning a child process and having different code for each system linker. If I could use LLD as a library that would be one less moving part in my compiler, one less potential thing that could go wrong for my users.

GCC doesn’t use BFD by default, it uses /usr/bin/ld (or ${PREFIX}/bin/ld). If that is a symlink to ld.bfd, ld.gold, Apple ld64, or ld.lld, it will use whichever. I think it would be very confusing for clang to use a linker that is not whatever the host system has decided the default linker should be, unless there is some compelling reason to pick a different one (e.g. you’re doing LTO with a mechanism that absolutely requires lld and not ld64/gold).

David

> GCC doesn’t use BFD by default, it uses /usr/bin/ld (or ${PREFIX}/bin/ld). If that is a symlink to ld.bfd, ld.gold, Apple ld64, or ld.lld, it will use whichever.

Hum, that's actually a good point.

> I think it would be very confusing for clang to use a linker that is not whatever the host system has decided the default linker should be, unless there is some compelling reason to pick a different one (e.g. you’re doing LTO with a mechanism that absolutely requires lld and not ld64/gold).

If you build GCC, and binutils, "ld" is on the path, and it all
bootstraps without glitches. If we build Clang and LLD, "lld" is on
the path, but Clang still picks the system's "ld", which is weird.

So, let's say we don't make it the default (because of your argument
above), what would be the steps to make bootstrap transparent?

Today, we can:
1. Use -DLLVM_ENABLE_LLD=True, but that only works for tests and
bootstrap (right?)
2. Use "--cflags -fuse-ld=lld" for LNT, so we can run the test suite
(I'm assuming this works, too)
3. Add an "ld" symlink into $BUILD/bin, together with the already
existing "ld.lld" and "lld-link" and put $BUILD/bin before /usr/bin on
the $PATH

They all seem kosher enough, so I'd be ok with any of them, but
they're all *additional* steps, not the default.
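
Option (3) could be scripted roughly as follows. This is a sketch under assumptions: `expose_lld_as_ld` is a hypothetical helper, `build_bin` stands for `$BUILD/bin`, and `ld.lld` is assumed to be already present there.

```python
import os
from pathlib import Path


def expose_lld_as_ld(build_bin: Path, env: dict) -> dict:
    """Add an ld -> ld.lld symlink in build_bin and put build_bin first on PATH.

    Returns a copy of env; the caller's environment is left untouched.
    """
    link = build_bin / "ld"
    if not link.is_symlink() and not link.exists():
        # Relative target, so the link survives moving the build tree.
        link.symlink_to("ld.lld")

    new_env = dict(env)
    parts = [str(build_bin), env.get("PATH", "")]
    new_env["PATH"] = os.pathsep.join(p for p in parts if p)
    return new_env
```

The returned dictionary can be handed to whatever launches the test-suite (for example as the `env=` argument of `subprocess.run`), so only that process tree sees the overridden `ld`.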

So maybe, a better approach would be to make LLD the default in our
own CMake scripts, via (1) above, not in the Clang driver, IFF the LLD
project is built-in.

Setting (3) above would make the test-suite transparent, too. Again,
IFF LLD is built-in.

Makes sense?

--renato

>> It should be possible to search the path for ld.lld first, then ld,
>> but to me it seems like it will just be more confusing.
>
> Hum, for me it would be less confusing. :slight_smile:
>
> GCC uses bfd by default, LLVM uses LLD. If you want to change, use
> -fuse-ld.

I don't think that's a fair comparison. GCC uses the system linker by
default (whatever 'ld' happens to name). If ld is a symlink to lld, then
GCC uses lld by default.

That's actually part of a much larger pattern: Clang is currently set up to
act as a drop-in replacement for the system compiler. That means we use
libstdc++ (not libc++) by default on targets where it's the default C++
standard library, we use libgcc_s (not compiler-rt) by default on targets
where it's the default compiler runtime, and so on. That strategy has
worked well in getting us to where we are today.

But our needs today aren't the same as they were a few years ago; we don't
need to prove ourselves as much as we used to, and while we should keep
supporting the target defaults for the above components, perhaps it's time
that we start to prefer using LLVM components where available. At the very
least, I don't see a good reason why we would ever want to use libgcc_s in
a situation where compiler-rt (and libunwind) are available -- libgcc_s
does not contain some functions that LLVM implicitly adds calls to.

Perhaps we should have a flag to specify whether we prefer the canonical
tools and components for the target, or whether we prefer LLVM's versions
when available (falling back to the target components if not)?
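
One way to picture such a preference flag is the Python sketch below. Everything in it is hypothetical: the `COMPONENTS` table, the `pick` helper and the `prefer_llvm` switch are illustration only, and the "canonical" column merely reflects the Linux-style defaults mentioned above.

```python
from typing import Callable

# Hypothetical table: component -> (LLVM choice, canonical target choice).
COMPONENTS = {
    "linker": ("ld.lld", "ld"),
    "cxx_stdlib": ("libc++", "libstdc++"),
    "rtlib": ("compiler-rt", "libgcc_s"),
}


def pick(component: str, prefer_llvm: bool,
         available: Callable[[str], bool]) -> str:
    """Choose an implementation for one component.

    With prefer_llvm, try LLVM's version first and fall back to the
    target's canonical one; otherwise only the canonical one is tried.
    """
    llvm, canonical = COMPONENTS[component]
    order = (llvm, canonical) if prefer_llvm else (canonical,)
    for candidate in order:
        if available(candidate):
            return candidate
    raise LookupError(f"no {component} available (tried {', '.join(order)})")
```

The point of the single switch is that the fallback is decided per component, so a system with LLD but without libc++ would still end up with a consistent, working combination.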

> What would be confusing in this scenario?

> That's actually part of a much larger pattern: Clang is currently set up to
> act as a drop-in replacement for the system compiler. That means we use
> libstdc++ (not libc++) by default on targets where it's the default C++
> standard library, we use libgcc_s (not compiler-rt) by default on targets
> where it's the default compiler runtime, and so on. That strategy has worked
> well in getting us to where we are today.

Agreed! And I still need this very much to work.

> But our needs today aren't the same as they were a few years ago; we don't
> need to prove ourselves as much as we used to, and while we should keep
> supporting the target defaults for the above components, perhaps it's time
> that we start to prefer using LLVM components where available.

My feelings exactly.

> At the very least, I don't see a good reason why we would ever want to use
> libgcc_s in a situation where compiler-rt (and libunwind) are available --
> libgcc_s does not contain some functions that LLVM implicitly adds calls to.

That's another dead-lock we need to solve, yes. :frowning:

> Perhaps we should have a flag to specify whether we prefer the canonical
> tools and components for the target, or whether we prefer LLVM's versions
> when available (falling back to the target components if not)?

Hum, I hadn't thought of this...

Do you mean a flag that would try to use as much of LLVM as
possible, and fall back to the system's tools when unavailable?

I like the idea very much, but I wonder about the kind of spurious bugs
we'll see if one tool suddenly doesn't get built, or changes location,
or $PATH changes.

cheers,
--renato

>> That's actually part of a much larger pattern: Clang is currently set up to
>> act as a drop-in replacement for the system compiler. That means we use
>> libstdc++ (not libc++) by default on targets where it's the default C++
>> standard library, we use libgcc_s (not compiler-rt) by default on targets
>> where it's the default compiler runtime, and so on. That strategy has worked
>> well in getting us to where we are today.
>
> Agreed! And I still need this very much to work.
>
>> But our needs today aren't the same as they were a few years ago; we don't
>> need to prove ourselves as much as we used to, and while we should keep
>> supporting the target defaults for the above components, perhaps it's time
>> that we start to prefer using LLVM components where available.
>
> My feelings exactly.
>
>> At the very least, I don't see a good reason why we would ever want to use
>> libgcc_s in a situation where compiler-rt (and libunwind) are available --
>> libgcc_s does not contain some functions that LLVM implicitly adds calls to.
>
> That's another dead-lock we need to solve, yes. :frowning:
>
>> Perhaps we should have a flag to specify whether we prefer the canonical
>> tools and components for the target, or whether we prefer LLVM's versions
>> when available (falling back to the target components if not)?
>
> Hum, I hadn't thought of this...
>
> Do you mean a flag that would try to use as much of LLVM as
> possible, and fall back to the system's tools when unavailable?

Either that or a flag that specifies to prefer the system's tools, with the
default behavior being to use LLVM's tools when possible.

> I like the idea very much, but I wonder about the kind of spurious bugs
> we'll see if one tool suddenly doesn't get built, or changes location,
> or $PATH changes.

That problem already exists; we search various paths looking for tools,
libstdc++, and so on. It doesn't seem to be a particularly big deal in
practice, and it's easy to ask Clang to tell you what it actually selected.
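
For example, the driver's GCC-compatible `-print-prog-name=` option reports the path it resolves for a given program name, and `-###` prints the full commands it would run. A small Python wrapper around the former might look like this; `driver_program_path` is a hypothetical helper name, and the sketch assumes a `clang` binary may or may not be on `$PATH`.

```python
import shutil
import subprocess
from typing import Optional


def driver_program_path(name: str = "ld", clang: str = "clang") -> Optional[str]:
    """Ask the driver which program it resolves for `name`.

    Returns None when the driver itself cannot be found.
    """
    exe = shutil.which(clang)
    if exe is None:
        return None
    result = subprocess.run([exe, f"-print-prog-name={name}"],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip() or None
```

Running this before and after a `$PATH` change or a package installation is a cheap way to notice that the selected linker has moved.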

> I think LLD is mature enough to be preferred over LD on the platforms
> it supports, if available.

Has anyone done a Debian or Gentoo stress test? If that hasn't been done, I
expect there to be a long tail of bugs that it would be good to squash a
significant part of before risking exposing our users to them. Also, what
is the current status of FreeBSD Poudriere? (Ed, Davide?)

The magic of open source package systems means we can do a large amount of
the bug finding where we might otherwise rely on users; we should take
maximum advantage of that since it is not only easier than getting user bug
reports, but also helps avoid giving users a bad experience.

> Since it's not available by default in most of them, its existence
> means intention.

But what is their intention? To suddenly use LLD for all their builds that
use Clang? Maybe they have a particular project that happens to have an
annoyingly long final link, and they are only interested in using LLD for
just that project.

This same data point ("existence means intention") could be used to justify
that the user is already intending to use LLD, so they will just manually
add `-fuse-ld=lld` to their link command lines. The suggested change, when
viewed through the assumption that LLD's "existence means intention", is
really just meant to save the user from adding `-fuse-ld=lld` to their
LDFLAGS; and it does so in an arguably somewhat uncontrolled fashion (e.g.
installing an LLD package can cause Clang to change linker in the middle of
an already-running build; or in an already-generated Ninja build dir for a
CMake project). In the same vein as Richard's comment, we don't do that for
e.g. libc++.

Realistically, I think that the best course of action for getting more
people using LLD/ELF right now is:
1. make getting LLD as easy as possible (e.g. make sure it is available in
the main Linux package repos)
2. write a documentation page "how to try out LLD/ELF" that centralizes
various pieces of information, such as how to get LLD/ELF, the use of
-fuse-ld=lld, how to report bugs (e.g. explaining the use of --reproduce),
etc.
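
As a taste of what such a "how to try out LLD/ELF" page could show, here is a Python sketch that adds the opt-in flag to an existing LDFLAGS string. `lld_ldflags` is a hypothetical helper; passing `--reproduce` through the compiler driver with `-Wl,` and the `=<file>` spelling are assumptions about the LLD version in use and should be checked against its documentation.

```python
import shlex
from typing import Optional


def lld_ldflags(existing: str = "", reproduce: Optional[str] = None) -> str:
    """Return LDFLAGS that select LLD, optionally asking it for a reproducer."""
    flags = shlex.split(existing)
    if "-fuse-ld=lld" not in flags:
        flags.append("-fuse-ld=lld")
    if reproduce:
        # Assumed spelling; forwarded to the linker by the driver.
        flags.append(f"-Wl,--reproduce={reproduce}")
    return shlex.join(flags)
```

Because the flag lives in LDFLAGS, the user opts in per project or per build, which avoids the linker changing underneath an already-configured build.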

> Once it becomes available, having it means you should really use it.
>
> Looks like a no-brainer to me. Am I missing something?

We currently have both the FreeBSD and the PlayStation effort underway for
making LLD/ELF a default production system linker. IMO, we should wait for
at least one of those efforts to stabilize before contemplating "flipping
the switch" globally in the way you described. Also, ideally we'll be happy
with the results of some sort of Linux-specific testing such as building
Debian or Gentoo.

-- Sean Silva

Hello Renato,

Thanks very much for raising the topic. I've not got much to add to
what has already been said.

If I understand correctly there are two use cases that we would want
to consider separately:
- Using lld by default when clang is used on a platform such as linux
if it is installed.
- Using lld by default in build-bots and the llvm test-suite when it
is installed.

For the former, personally I think that clang should follow the
conventions of the platform wherever possible, i.e. on Linux ld is a
symlink to ld.XXX which may be ld.lld. I agree with Sean that some
documentation on how to set up and use lld would be very helpful and
would be a relatively cheap first step. Some status information about
how complete each target is would also be useful.

For the latter, I think it would at first be good to have a simple
option, or document options that would use lld. When all targets that
have an lld port can reliably run in the build and test environment we
can consider making it the default if it has been intentionally
installed. I don't have a strong opinion on how to set this up.

Peter

Hi Sean,

First of all, let me be clear: I'm not proposing we "flip the switch
globally" at all.

I'm well aware of the efforts in using LLD as the default linker, and
in no way am I suggesting we "jump the gun" and derail their efforts.

I was just adding another dimension, orthogonal to their efforts, in
the hope that more people would see the value of using LLD.

Richard's flag to switch behaviour between LLVM and System is a much
better proposal than mine, and achieves what I wanted to do in a much
nicer way.

The compiler-rt + libgcc_s vs. libunwind is a real problem that has no
nice solution today.

The bugs that we find in old linkers and can't fix because the
platform won't update or GNU won't re-release are a real problem.

Each of those problems has a solution using compiler flags (the RT one
is not trivial), but all of them together do make Clang hard to use if
you want new functionality.

New users building Clang/LLVM will *have* to rely on their system
libraries and tools, unless they use a myriad of magic flags, even
though they're building a linker, all necessary libraries, a debugger,
etc.

Making it the default is too big a hammer, but having one simple flag
(ex. --llvm-env or --llvm-tools + --llvm-libs) would make *all* those
problems go away.

On systems that really need to be self-consistent (I'm assuming
Windows), then this could be the default, and users would need to use
flags like --system-env, etc. instead.

cheers,
--renato

> For the former, personally I think that clang should follow the
> conventions of the platform wherever possible, i.e. on Linux ld is a
> symlink to ld.XXX which may be ld.lld. I agree with Sean that some
> documentation on how to set up and use lld would be very helpful and
> would be a relatively cheap first step. Some status information about
> how complete each target is would also be useful.

Following the conventions is the best way forward, but not all
platforms use the ld link trick.

However, what harm would there be if there was a flag to override the
convention?

> For the latter, I think it would at first be good to have a simple
> option, or document options that would use lld. When all targets that
> have an lld port can reliably run in the build and test environment we
> can consider making it the default if it has been intentionally
> installed. I don't have a strong opinion on how to set this up.

The "simple flag" would be "simpler" if we had a GNU-style
--with-linker=lld when building LLVM.

We don't, so we have to propagate the -fuse-ld=lld flag on every
invocation from then on (i.e. test-suite, benchmarks, etc).

cheers,
--renato

And there is ⚙ D25263 [Driver] Allow setting the default linker during build which aims to add CLANG_DEFAULT_LINKER just as there are CLANG_DEFAULT_CXX_STDLIB and CLANG_DEFAULT_RTLIB to override the platform defaults :wink:
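
A build script could then pass these defaults at configure time. The Python sketch below only assembles the `-D` arguments named above; `clang_default_args` is a hypothetical helper, and the accepted values for each option are whatever the Clang build system defines, not something this sketch validates.

```python
from typing import List, Optional


def clang_default_args(linker: Optional[str] = None,
                       cxx_stdlib: Optional[str] = None,
                       rtlib: Optional[str] = None) -> List[str]:
    """Build the CMake -D arguments that override Clang's platform defaults."""
    options = {
        "CLANG_DEFAULT_LINKER": linker,
        "CLANG_DEFAULT_CXX_STDLIB": cxx_stdlib,
        "CLANG_DEFAULT_RTLIB": rtlib,
    }
    # Unset options are omitted so the platform default stays in effect.
    return [f"-D{key}={value}" for key, value in options.items() if value]
```

For instance, `clang_default_args(linker="lld")` yields a single argument that can be appended to the usual `cmake` invocation for the LLVM tree.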

Ditto.

Though you probably need different code in order to find the system libraries.

Cheers, Jakob.