Linker selection

Hello the list,

At some point in the near future, FreeBSD is going to start transitioning to a new linker (mclinker, which works now, possibly then migrating to lld once it becomes more complete). People also often install gold or a newer bfd ld from ports. This means that we will have two, possibly three, linkers in the base system and another two from ports, and yet clang currently hard-codes the fact that the linker is the one called ld that is the first to be found in the path. I suspect that, initially, mclinker will be adequate for the base system but possibly not for some ports, so we will need to be able to easily select an alternative linker.

I'd like to add some logic for selecting the linker at run time. Before I start, I was wondering if anyone had any suggestions as to the correct UI for this. As far as I can see, there are two possible choices:

1) We have a flag that takes a full path to the linker to use. This is somewhat problematic, because if two linkers require different arguments (not much of a problem for us, but imagine on Windows being able to switch between the MS and Binutils linkers) then you need to know what the linker is, as well as its name.

2) We have a flag like the libc++ / libstdc++ selection flag, which takes an argument like bfd, gold, mc, lld and some logic in the driver to work out what the paths and names of each of these should be.

Is there a more sensible third option?

David

I think the sensible option is for all the linkers on a particular platform
to have a compatible command-line interface, much like Clang has a
compatible interface to GCC. This compatibility doesn't have to be perfect
(clearly, as Clang's is far from it), but should easily be enough for the
small variety of flags Clang needs to pass.

My understanding has been that 'lld' has this as an explicit goal, and
certainly bfd and gold both act as a normal 'ld' linker on Linux. I think
that if MCLinker wants to be used on a platform where 'ld's flag syntax is
the de facto standard, it should be compatible. I think that adding yet
another dialect of linker option syntax for such a narrow use case (as you
admit it is likely to be only a stop-gap) is a lot of complexity for
very little gain.

I would much rather see the effort go into making 'lld' (or 'MCLinker' for
that matter...) be a viable, compatible linker on the particular platform
of interest.

I'm already helping the MCLinker folks to make MCLinker an alternative linker for FreeBSD, including implementing the necessary GNU ld-compatible flags. They are 90% done WRT userland, and linking the kernel is next.

What we really need is a more flexible way of specifying which linker the compiler should use internally. Replacing /usr/bin/ld with a symlink is not flexible.

Thanks,
Erik

In particular, we anticipate a fairly long tail of third-party ports that require GNU ld, just as we have a tail that requires libstdc++ and won't work with libc++. The base system will be using libc++ and hopefully will be using mclinker, and so will most ports, but we still don't even have 100% of ports working with clang yet...

As Chandler says, this is a short-term requirement, but short-term in this context means (optimistically) 'the next 3-5 years'.

David

Note that my point was only that we shouldn't need a custom set of command
line flags for mclinker, and we shouldn't need to switch command line flags
when switching linkers while staying on the same target platform. Thus, we
might consider simple solutions that only address the problem of switching
the link binary used rather than a more complex solution which passes flags
in a new dialect.

Also, my "short-term requirement" referred to using mclinker as opposed to
lld -- I completely agree that the bfd linker will likely be kicking around
and in use for a long time in a few oddball scenarios.

Each linker could have a compatibility layer which translates the command line flags. This compatibility layer could just be a shell script (or similar) supplied by the linker, Clang or third party.
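
A compatibility layer of this kind could be quite small. The sketch below is written in Python rather than shell purely for illustration. The target linker path (/usr/local/bin/otherld) and its flag spellings (--output-file=, --search-dir=) are hypothetical and do not describe any real linker; only the GNU ld-style inputs (-o FILE, -LDIR) are real.

```python
# Hypothetical target linker: the path and its flag spellings are invented
# for illustration and do not correspond to any real tool.
TARGET_LINKER = "/usr/local/bin/otherld"


def translate(args):
    """Translate a GNU ld-style argument list into the hypothetical dialect."""
    out = []
    it = iter(args)
    for arg in it:
        if arg == "-o":
            # GNU ld "-o FILE" becomes the hypothetical "--output-file=FILE".
            out.append("--output-file=" + next(it))
        elif arg.startswith("-L") and len(arg) > 2:
            # GNU ld "-LDIR" becomes the hypothetical "--search-dir=DIR".
            out.append("--search-dir=" + arg[2:])
        else:
            # Object files and any flag without a translation rule pass through.
            out.append(arg)
    return out


def build_command(argv):
    """Return the full command line the wrapper would execute."""
    return [TARGET_LINKER] + translate(argv)
```

A real wrapper would finish by executing the list returned by build_command, so the compiler driver only ever sees the GNU ld-style interface.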

I think the linker should supply it.

I don't think Clang should have to cope with compatibility for every linker
-- coping with compatibility for every platform is hard enough.

I agree - GNU ld flags are the de facto standard and other linkers need to be compliant.

I believe we're only talking about being able to tell the compiler to use a specific binary as the linker.

Erik

> I don't think Clang should have to cope with compatibility for every linker -- coping with compatibility for every platform is hard enough.
>
> I agree - GNU ld flags are the de facto standard and other linkers need to be compliant.

I also agree with this. GNU ld flags are the de facto standard. Not only
MCLinker but also Google's gold follows GNU ld.
There is a set of GNU ld flags that a linker must support in order to emit
a correct binary. MCLinker has spent a lot of effort figuring out this
set and supports all of them.
I also think a linker can have some latitude with the rest of the
flags, and ignore the ones it does not support.

So far, MCLinker returns an error if it gets an unsupported flag.
But I think that in future releases MCLinker will not only ignore
unsupported flags, but also add some new flags for optimizations.

David Chisnall <David.Chisnall@cl.cam.ac.uk> writes:

> I'd like to add some logic for selecting the linker at run time.
> Before I start, I was wondering if anyone had any suggestions as to
> the correct UI for this. As far as I can see, there are two possible
> choices:
>
> 1) We have a flag that takes a full path to the linker to use. This
> is somewhat problematic, because if two linkers require different
> arguments (not much of a problem for us, but imagine on Windows being
> able to switch between the MS and Binutils linkers) then you need to
> know what the linker is, as well as its name.
>
> 2) We have a flag like the libc++ / libstdc++ selection flag, which
> takes an argument like bfd, gold, mc, lld and some logic in the driver
> to work out what the paths and names of each of these should be.
>
> Is there a more sensible third option?

The impression I get from the replies thus far is that (1) should be
sufficient given the assumption that different linkers have mostly
compatible command line interfaces. I think there's some merit to the
second approach even with this assumption though.

In the following the flag to specify the linker is --link-with, for
argument's sake. I'm not particularly attached to that name, but the
name isn't really the point here.

1. If you say --link-with=/full/path/to/ld, we use that binary.
2. If you say --link-with=foo, we first look for the name foo in the way
   we normally look for ld. Notably, this would mean that if you're
   using a sysroot, we look for $sysroot/usr/bin/foo and friends.
3. If (2) doesn't find anything, we then look for ld.foo, behaving as in
   (2). This handles a fairly common idiom for having multiple
   alternative ld implementations installed.

We then call whatever linker we find with the flags we normally use.
As long as they tend to be compatible, this is fine. What do you guys
think?
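
The three lookup rules above can be sketched roughly as follows. This is only an illustration, not Clang driver code: the function name find_linker and the search_dirs parameter are made up here, with search_dirs standing in for wherever the driver already looks for ld.

```python
import os


def find_linker(link_with, search_dirs, sysroot=""):
    """Resolve a --link-with value to a linker path, or None if nothing matches.

    search_dirs is a stand-in for the driver's existing ld search path
    (an assumption of this sketch, e.g. ["/usr/bin"]).
    """
    # Rule 1: a full path is used as given.
    if os.path.isabs(link_with):
        return link_with

    # Rule 2: try the bare name in every search directory.
    # Rule 3: only if that finds nothing, try ld.<name> the same way.
    for name in (link_with, "ld." + link_with):
        for directory in search_dirs:
            candidate = os.path.join(sysroot + directory, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate

    # Nothing matched; the caller decides how to report this.
    return None
```

With a sysroot of /sr and search_dirs of ["/usr/bin"], a request for gold would try /sr/usr/bin/gold first and then /sr/usr/bin/ld.gold. Whatever is found is then invoked with the usual flags, with no special-casing per linker.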

David Chisnall <David.Chisnall@cl.cam.ac.uk> writes:
> I'd like to add some logic for selecting the linker at run time.
> Before I start, I was wondering if anyone had any suggestions as to
> the correct UI for this. As far as I can see, there are two possible
> choices:
>
> 1) We have a flag that takes a full path to the linker to use. This
> is somewhat problematic, because if two linkers require different
> arguments (not much of a problem for us, but imagine on Windows being
> able to switch between the MS and Binutils linkers) then you need to
> know what the linker is, as well as its name.
>
> 2) We have a flag like the libc++ / libstdc++ selection flag, which
> takes an argument like bfd, gold, mc, lld and some logic in the driver
> to work out what the paths and names of each of these should be.
>
> Is there a more sensible third option?

> The impression I get from the replies thus far is that (1) should be
> sufficient given the assumption that different linkers have mostly
> compatible command line interfaces. I think there's some merit to the
> second approach even with this assumption though.
>
> In the following the flag to specify the linker is --link-with, for
> argument's sake. I'm not particularly attached to that name, but the
> name isn't really the point here.
>
> 1. If you say --link-with=/full/path/to/ld, we use that binary.

IMO, this should be sysroot-relative, not root relative. But I'm interested
if others disagree, and why. I can imagine problems with it, but not how
realistic such problems would be.

> 2. If you say --link-with=foo, we first look for the name foo in the way
>    we normally look for ld. Notably, this would mean that if you're
>    using a sysroot, we look for $sysroot/usr/bin/foo and friends.
> 3. If (2) doesn't find anything, we then look for ld.foo, behaving as in
>    (2). This handles a fairly common idiom for having multiple
>    alternative ld implementations installed.

This seems excellent.

> We then call whatever linker we find with the flags we normally use.

I don't think we should go this far.

If the user gave us a '--link-with' flag, I think we should honor it or
produce an error message explaining that we were unable to honor it.

The final question in my mind is whether "--link-with" is the right name
for this flag. I would be interested if anyone has particular shades of
paint they prefer for this bikeshed. I would also be interested if someone
familiar with the GCC community could see if they want to support this
logic as well, and if so coordinate the flag name and semantics with them
so we end up with a compatible option in both GCC and Clang.

-Chandler
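
Two of the points above, treating a full path as sysroot-relative and reporting an error instead of silently falling back, might look like this in a sketch. The names resolve_full_path and LinkerNotFound are hypothetical and exist only for this example.

```python
import os


class LinkerNotFound(Exception):
    """Raised when the requested linker cannot be honored (hypothetical name)."""


def resolve_full_path(path, sysroot=""):
    """Interpret an absolute --link-with path relative to the sysroot.

    Raises LinkerNotFound rather than falling back to some other linker.
    """
    # Prefix the sysroot, if any, so "/usr/bin/ld" means "$sysroot/usr/bin/ld".
    candidate = sysroot.rstrip("/") + path if sysroot else path
    if not (os.path.isfile(candidate) and os.access(candidate, os.X_OK)):
        raise LinkerNotFound(
            "--link-with=%s: no executable linker at %s" % (path, candidate)
        )
    return candidate
```

Whether an escape hatch for a root-relative path is also needed is a separate question that this sketch does not try to answer.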

What about "--linker"? It seems fairly obvious to me, if it's not already taken. Or could that be confused with passing flags to the linker?

> 1. If you say --link-with=/full/path/to/ld, we use that binary.
>
> IMO, this should be sysroot-relative, not root relative. But I’m interested if others disagree, and why. I can imagine problems with it, but not how realistic such problems would be.

I do have some concerns with this, though I think this is the right default. I have a sysroot that is not fully set up. There is an ld, but it doesn’t run. There is probably an easy fix, but since the system ld works just fine (at least we have yet to have an unexplained bug) and is a newer version anyway, we just use that. Basically, our target is x86, so there isn’t any motivation to do more with the sysroot than put in the libraries that are different on the target.

Again, I think sysroot-relative is the right default. Just give me

--the-guys-who-set-up-sysroot-are-idiots-link-with=

as an alternative. Or some variation, I’m color blind so it doesn’t matter to me what shade you paint it.

GCC uses a -fuse-ld option, although the patch is still under review.

http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02389.html

GCC invokes `ld` if no linker is specified, and uses `ld.xxx` (`ld.bfd`
or `ld.gold`) if -fuse-ld is given.
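
As a rough illustration of the naming rule just described (not taken from the GCC patch itself), the mapping from an optional -fuse-ld value to a binary name is simply the following; the function name linker_name is invented for this sketch.

```python
def linker_name(fuse_ld=None):
    """Return the linker binary name for an optional -fuse-ld value."""
    if fuse_ld is None:
        # No -fuse-ld given: plain `ld`.
        return "ld"
    # -fuse-ld=xxx selects `ld.xxx`, e.g. `ld.bfd` or `ld.gold`.
    return "ld." + fuse_ld
```

Note that this only picks a name; where that name is then searched for is a separate matter.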

It doesn't make sense for a cross-compiling environment.

Joerg

3) We use -B to select the directory from which ld should be picked.

Joerg

Both are likely to be installed in /usr/bin, so this doesn't help.

David

Chandler Carruth <chandlerc@google.com> writes:

> > We then call whatever linker we find with the flags we normally use.
>
> I don't think we should go this far.
>
> If the user gave us a '--link-with' flag, I think we should honor it or produce
> an error message explaining that we were unable to honor it.

Sorry, my wording wasn't clear here. I agree with you. I was just trying
to reiterate that, once we find a linker using the rules above, we
shouldn't try to differentiate between linker types or special case how
different linkers are called.