[RFC] Recognize -fuse-ld={bfd, gold, lld} but don't prepend "ld." or "ld64." for other values

In https://reviews.llvm.org/D80225 I intended to update the semantics of -fuse-ld=
Some context (from my archaeology) about the option

* 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
* rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
* D78290 changed the prefix to ld64. on Darwin.
* GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
   behavior change here will not be incompatible with GCC).

Our existing behavior is cumbersome:

* Inconsistency between an absolute
   path (which does not get a ld. prefix) and a relative path (which gets
   a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
   access x86_64-unknown-linux-gnu-ld.dir/foo (if
   LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
* lld's new Mach-O port needs to pick an executable name. It would be
   nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by `ld64.`

I suggest we just hard code the currently used values which intend to get
a prefix: bfd, gold, lld. For all other values, don't add a prefix.
(PS4 seems to use -fuse-ld=ps4 but it overrides ToolChain::GetLinkerPath, thus unaffected)

Thoughts?

In https://reviews.llvm.org/D80225 I intended to update the semantics of -fuse-ld=
Some context (from my archaeology) about the option

  • https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55470 added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
  • rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
  • D78290 changed the prefix to ld64. on Darwin.
  • GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
    behavior change here will not be incompatible with GCC).

Our existing behavior is cumbersome:

  • Inconsistency between an absolute
    path (which does not get a ld. prefix) and a relative path (which gets
    a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
    access x86_64-unknown-linux-gnu-ld.dir/foo (if
    LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
  • lld’s new Mach-O port needs to pick an executable name. It would be
    nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by ld64.

To clarify (it seems like the link between “Mach-O lld needs to pick an executable name” and “it would be nice if -fuse-ld accepts a program name” isn’t quite spelled out): By making this change to -fuse-ld it would enable the Mach-O version of lld to be called “lld” rather than “ld64.lld”? Is that the goal?

Is Mach-O lld likely to need different command line compatibility than ELF or COFF lld? It seems like having a distinct name to enable detection of what command line support/object file format assumptions/other behavior changes compared to ELF/COFF may be useful, and “ld64.lld” is probably as good a name as any, if that functionality is desirable?

In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of
-fuse-ld=
Some context (from my archaeology) about the option

* 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for
-fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
* rL194328 ported the feature and made -fuse-ld= available for other
values (with the ld. prefix)
* D78290 changed the prefix to ld64. on Darwin.
* GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is
allowed (so our
   behavior change here will not be incompatible with GCC).

Our existing behavior is cumbersome:

* Inconsistency between an absolute
   path (which does not get a ld. prefix) and a relative path (which gets
   a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
   access x86_64-unknown-linux-gnu-ld.dir/foo (if
   LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
* lld's new Mach-O port needs to pick an executable name. It would be
   nice if -fuse-ld= simply accepts a program name, not necessarily
prefixed by `ld64.`

To clarify (it seems like the link between "Mach-O lld needs to pick an
executable name" and "it would be nice if -fuse-ld accepts a program name"
isn't quite spelled out): By making this change to -fuse-ld it would enable
the Mach-O version of lld to be called "lld" rather than "ld64.lld"? Is
that the goal?

The Mach-O port can be named whatever developers like, no need to
coupled with a `ld64.` prefix.

I do think of whether -fuse-ld=lld means different things would be a
nice property: ld.lld (ELF), wasm-ld (wasm), lld-link (COFF) in
different contexts. I realized it is not. The linker options on
different binary formats are very different. Targeting different binary
formats requires one to customize the linker options anyway, a common
-fuse-ld=lld does not save much.

(In 2010, it might be cleaner if binutils did not install ld.bfd -> ld
and ld.gold -> gold ...... They could juse use -fuse-ld=ld to mean "ld"
and -fuse-ld=gold to mean "gold")

I think the behavior change you're proposing makes sense, but I also want to point out that our current plan for the new Mach-O port is to name it ld64.lld as well, to replace the old one and prevent any confusion. We're certainly open to a different name though if people can think of something nicer.

As for why we haven't made the ld64.lld switch already, we were waiting for the new Mach-O port to be a bit more feature complete before doing that, since I believe there are some people using the old port still. We were targeting self-hosting as the milestone after which we'd switch over.

    >
    >> In https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D80225&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=o3kDXzdBUE3ljQXKeTWOMw&m=i8VMYJJ0XWBjnRh78QYx9fwr6fSlLBEMkPjsGRIDbDU&s=CtEvErJQu5F--emEtzW9JeavMEr3T_-5RkEZ9H3mcmQ&e= I intended to update the semantics of
    >> -fuse-ld=
    >> Some context (from my archaeology) about the option
    >>
    >> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for
    >> -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
    >> * rL194328 ported the feature and made -fuse-ld= available for other
    >> values (with the ld. prefix)
    >> * D78290 changed the prefix to ld64. on Darwin.
    >> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is
    >> allowed (so our
    >> behavior change here will not be incompatible with GCC).
    >>
    >> Our existing behavior is cumbersome:
    >>
    >> * Inconsistency between an absolute
    >> path (which does not get a ld. prefix) and a relative path (which gets
    >> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
    >> access x86_64-unknown-linux-gnu-ld.dir/foo (if
    >> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
    >> * lld's new Mach-O port needs to pick an executable name. It would be
    >> nice if -fuse-ld= simply accepts a program name, not necessarily
    >> prefixed by `ld64.`
    >>
    >
    >To clarify (it seems like the link between "Mach-O lld needs to pick an
    >executable name" and "it would be nice if -fuse-ld accepts a program name"
    >isn't quite spelled out): By making this change to -fuse-ld it would enable
    >the Mach-O version of lld to be called "lld" rather than "ld64.lld"? Is
    >that the goal?

    The Mach-O port can be named whatever developers like, no need to
    coupled with a `ld64.` prefix.

    I do think of whether -fuse-ld=lld means different things would be a
    nice property: ld.lld (ELF), wasm-ld (wasm), lld-link (COFF) in
    different contexts. I realized it is not. The linker options on
    different binary formats are very different. Targeting different binary
    formats requires one to customize the linker options anyway, a common
    -fuse-ld=lld does not save much.

    (In 2010, it might be cleaner if binutils did not install ld.bfd -> ld
    and ld.gold -> gold ...... They could juse use -fuse-ld=ld to mean "ld"
    and -fuse-ld=gold to mean "gold")

Adding the folks from the code review

>
>> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of
>> -fuse-ld=
>> Some context (from my archaeology) about the option
>>
>> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for
>> -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>> * rL194328 ported the feature and made -fuse-ld= available for other
>> values (with the ld. prefix)
>> * D78290 changed the prefix to ld64. on Darwin.
>> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is
>> allowed (so our
>> behavior change here will not be incompatible with GCC).
>>
>> Our existing behavior is cumbersome:
>>
>> * Inconsistency between an absolute
>> path (which does not get a ld. prefix) and a relative path (which gets
>> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
>> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>> * lld's new Mach-O port needs to pick an executable name. It would be
>> nice if -fuse-ld= simply accepts a program name, not necessarily
>> prefixed by `ld64.`
>>
>
>To clarify (it seems like the link between "Mach-O lld needs to pick an
>executable name" and "it would be nice if -fuse-ld accepts a program name"
>isn't quite spelled out): By making this change to -fuse-ld it would enable
>the Mach-O version of lld to be called "lld" rather than "ld64.lld"? Is
>that the goal?

The Mach-O port can be named whatever developers like, no need to
coupled with a `ld64.` prefix.

Given the needs of a multiplatform/busyboxed linker like lld, it seems
likely that any future multiplatform linker would probably use this
same sort of prefix going forward.

I do think of whether -fuse-ld=lld means different things would be a
nice property: ld.lld (ELF), wasm-ld (wasm), lld-link (COFF) in
different contexts. I realized it is not. The linker options on
different binary formats are very different. Targeting different binary
formats requires one to customize the linker options anyway, a common
-fuse-ld=lld does not save much.

Yep, doesn't seem likely - is the intent to use "-fuse-ld" on non-ELF
platforms and make it easier/possible to write "-fuse-ld=lld-link"
instead of having to provide the full path? (or does that already
work/does using a non-ELF platform already disable the "ld."
prefixing?)

(In 2010, it might be cleaner if binutils did not install ld.bfd -> ld
and ld.gold -> gold ...... They could juse use -fuse-ld=ld to mean "ld"
and -fuse-ld=gold to mean "gold")

Sure enough - though dealing with where we are today is trickier :confused:

Breaking existing users that depend on this generalized prefixing
seems significant - Sony's the best known/most likely to hit this that
I know of. Hopefully Paul Robinson can speak to that.

If GCC is willing to make the same change, I don't feel too strongly
about this - beyond the existing user breakage (which isn't an
insignificant concern - breaking flag compatibility is something best
avoided) the only future problem I could see (& given we're only
currently living with 3 linkers, this isn't a huge deal) is that if
another linker comes along there will be a subtle difference between
"-fuse-ld=lld" and "-fuse-ld=newlink", at least because a user might
be confused about exactly which binary is being used, in the former
case it's "ld.lld" and in the latter it's "newlink". On top of that if
that if "newlink" busyboxes, and perhaps the original name "newlink"
either defaults to one platform (say MachO mode, like ld64.lld), but
if you want to link for MachO, and busyboxes "ld.newlink" to provide
the ld-compatible version - now you have to write
"-fuse-ld=ld.newlink", but you don't have to do that with gold/etc...

Eh, such is life, perhaps.

If the desire is just to support relative paths, that could be
implemented by trying the prefix version - and if that fails, then
trying it as a relative path? Though I can see value in sort of
deprecating/baking in the prefix support only for the existing users
and forcing all new usage to use the actual executable name.

- Dave

Adding the folks from the code review

>
>> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of
>> -fuse-ld=
>> Some context (from my archaeology) about the option
>>
>> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for
>> -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>> * rL194328 ported the feature and made -fuse-ld= available for other
>> values (with the ld. prefix)
>> * D78290 changed the prefix to ld64. on Darwin.
>> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is
>> allowed (so our
>> behavior change here will not be incompatible with GCC).
>>
>> Our existing behavior is cumbersome:
>>
>> * Inconsistency between an absolute
>> path (which does not get a ld. prefix) and a relative path (which gets
>> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
>> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>> * lld's new Mach-O port needs to pick an executable name. It would be
>> nice if -fuse-ld= simply accepts a program name, not necessarily
>> prefixed by `ld64.`
>>
>
>To clarify (it seems like the link between "Mach-O lld needs to pick an
>executable name" and "it would be nice if -fuse-ld accepts a program name"
>isn't quite spelled out): By making this change to -fuse-ld it would enable
>the Mach-O version of lld to be called "lld" rather than "ld64.lld"? Is
>that the goal?

The Mach-O port can be named whatever developers like, no need to
coupled with a `ld64.` prefix.

Given the needs of a multiplatform/busyboxed linker like lld, it seems
likely that any future multiplatform linker would probably use this
same sort of prefix going forward.

I do think of whether -fuse-ld=lld means different things would be a
nice property: ld.lld (ELF), wasm-ld (wasm), lld-link (COFF) in
different contexts. I realized it is not. The linker options on
different binary formats are very different. Targeting different binary
formats requires one to customize the linker options anyway, a common
-fuse-ld=lld does not save much.

Yep, doesn't seem likely - is the intent to use "-fuse-ld" on non-ELF
platforms and make it easier/possible to write "-fuse-ld=lld-link"
instead of having to provide the full path? (or does that already
work/does using a non-ELF platform already disable the "ld."
prefixing?)

(In 2010, it might be cleaner if binutils did not install ld.bfd -> ld
and ld.gold -> gold ...... They could juse use -fuse-ld=ld to mean "ld"
and -fuse-ld=gold to mean "gold")

Sure enough - though dealing with where we are today is trickier :confused:

Breaking existing users that depend on this generalized prefixing
seems significant - Sony's the best known/most likely to hit this that
I know of. Hopefully Paul Robinson can speak to that.

clang::driver::tools::PS4cpu has its own logic

It is not affected, but hope Paul can confirm.

If GCC is willing to make the same change, I don't feel too strongly
about this - beyond the existing user breakage (which isn't an
insignificant concern - breaking flag compatibility is something best
avoided) the only future problem I could see (& given we're only
currently living with 3 linkers, this isn't a huge deal) is that if
another linker comes along there will be a subtle difference between
"-fuse-ld=lld" and "-fuse-ld=newlink", at least because a user might
be confused about exactly which binary is being used, in the former
case it's "ld.lld" and in the latter it's "newlink". On top of that if
that if "newlink" busyboxes, and perhaps the original name "newlink"
either defaults to one platform (say MachO mode, like ld64.lld), but
if you want to link for MachO, and busyboxes "ld.newlink" to provide
the ld-compatible version - now you have to write
"-fuse-ld=ld.newlink", but you don't have to do that with gold/etc...

GCC does not support -fuse-ld=relative/path or -fuse-ld=/abs/path or
-fuse-ld=word_not_bfd_gold_lld

I do have a patch to make it have a D80225 compatible behavior
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546277.html

Eh, such is life, perhaps.

If the desire is just to support relative paths, that could be
implemented by trying the prefix version - and if that fails, then
trying it as a relative path? Though I can see value in sort of
deprecating/baking in the prefix support only for the existing users
and forcing all new usage to use the actual executable name.

- Dave

A relative path is supported. The desire is to eliminate differences
between "filename" and "relative/filename". The platform specific prefix
rule (well, sometimes suffix, e.g. lld-link) is also hard to remember.

"trying it as a relative path" can gratuitously add more `access`
syscalls. I initially tried this idea for my GCC patch. Martin Liška pointed out
the problem.

(Note, we still have such a problem
access("/tmp/Debug/bin/x86_64-unknown-linux-gnu-ld.dir/foo", R_OK|X_OK) = -1 ENOENT (No such file or directory)
target triple prefixes may not make lots of sense for clang tools
)

>Breaking existing users that depend on this generalized prefixing
>seems significant - Sony's the best known/most likely to hit this that
>I know of. Hopefully Paul Robinson can speak to that.

clang::driver::tools::PS4cpu has its own logic
https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/PS4CPU.cpp#L341
It is not affected, but hope Paul can confirm.

Sony uses its own naming convention, we're not affected by this discussion.
--paulr

>Adding the folks from the code review
>
>>
>> >
>> >> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of
>> >> -fuse-ld=
>> >> Some context (from my archaeology) about the option
>> >>
>> >> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for
>> >> -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>> >> * rL194328 ported the feature and made -fuse-ld= available for other
>> >> values (with the ld. prefix)
>> >> * D78290 changed the prefix to ld64. on Darwin.
>> >> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is
>> >> allowed (so our
>> >> behavior change here will not be incompatible with GCC).
>> >>
>> >> Our existing behavior is cumbersome:
>> >>
>> >> * Inconsistency between an absolute
>> >> path (which does not get a ld. prefix) and a relative path (which gets
>> >> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
>> >> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>> >> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>> >> * lld's new Mach-O port needs to pick an executable name. It would be
>> >> nice if -fuse-ld= simply accepts a program name, not necessarily
>> >> prefixed by `ld64.`
>> >>
>> >
>> >To clarify (it seems like the link between "Mach-O lld needs to pick an
>> >executable name" and "it would be nice if -fuse-ld accepts a program name"
>> >isn't quite spelled out): By making this change to -fuse-ld it would enable
>> >the Mach-O version of lld to be called "lld" rather than "ld64.lld"? Is
>> >that the goal?
>>
>> The Mach-O port can be named whatever developers like, no need to
>> coupled with a `ld64.` prefix.
>
>Given the needs of a multiplatform/busyboxed linker like lld, it seems
>likely that any future multiplatform linker would probably use this
>same sort of prefix going forward.
>> I do think of whether -fuse-ld=lld means different things would be a
>> nice property: ld.lld (ELF), wasm-ld (wasm), lld-link (COFF) in
>> different contexts. I realized it is not. The linker options on
>> different binary formats are very different. Targeting different binary
>> formats requires one to customize the linker options anyway, a common
>> -fuse-ld=lld does not save much.
>
>Yep, doesn't seem likely - is the intent to use "-fuse-ld" on non-ELF
>platforms and make it easier/possible to write "-fuse-ld=lld-link"
>instead of having to provide the full path? (or does that already
>work/does using a non-ELF platform already disable the "ld."
>prefixing?)
>
>> (In 2010, it might be cleaner if binutils did not install ld.bfd -> ld
>> and ld.gold -> gold ...... They could juse use -fuse-ld=ld to mean "ld"
>> and -fuse-ld=gold to mean "gold")
>
>Sure enough - though dealing with where we are today is trickier :confused:
>
>Breaking existing users that depend on this generalized prefixing
>seems significant - Sony's the best known/most likely to hit this that
>I know of. Hopefully Paul Robinson can speak to that.

clang::driver::tools::PS4cpu has its own logic
https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/PS4CPU.cpp#L341
It is not affected, but hope Paul can confirm.

>If GCC is willing to make the same change, I don't feel too strongly
>about this - beyond the existing user breakage (which isn't an
>insignificant concern - breaking flag compatibility is something best
>avoided) the only future problem I could see (& given we're only
>currently living with 3 linkers, this isn't a huge deal) is that if
>another linker comes along there will be a subtle difference between
>"-fuse-ld=lld" and "-fuse-ld=newlink", at least because a user might
>be confused about exactly which binary is being used, in the former
>case it's "ld.lld" and in the latter it's "newlink". On top of that if
>that if "newlink" busyboxes, and perhaps the original name "newlink"
>either defaults to one platform (say MachO mode, like ld64.lld), but
>if you want to link for MachO, and busyboxes "ld.newlink" to provide
>the ld-compatible version - now you have to write
>"-fuse-ld=ld.newlink", but you don't have to do that with gold/etc...

GCC does not support -fuse-ld=relative/path or -fuse-ld=/abs/path or
-fuse-ld=word_not_bfd_gold_lld

I do have a patch to make it have a D80225 compatible behavior
[PATCH v3] Add -fuse-ld= to specify an arbitrary executable as the linker

>Eh, such is life, perhaps.
>
>If the desire is just to support relative paths, that could be
>implemented by trying the prefix version - and if that fails, then
>trying it as a relative path? Though I can see value in sort of
>deprecating/baking in the prefix support only for the existing users
>and forcing all new usage to use the actual executable name.
>
>- Dave

A relative path is supported.

Oh, I thought when you said "For a relative path, -fuse-ld=dir/foo
currently tries to access x86_64-unknown-linux-gnu-ld.dir/foo (if
LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu)."

You meant that was the binary it would run and fail - but you don't
mean that, you mean that clang checks if that binary exists, then if
it doesn't, it falls back to using "dir/foo" without a prefix? So it
does what the user expected?

The desire is to eliminate differences
between "filename" and "relative/filename". The platform specific prefix
rule (well, sometimes suffix, e.g. lld-link) is also hard to remember.

Given you don't /have/ to remember it, that doesn't seem like a
significant cost to users?

"trying it as a relative path" can gratuitously add more `access`
syscalls. I initially tried this idea for my GCC patch. Martin Liška pointed out
the problem.

(Note, we still have such a problem
access("/tmp/Debug/bin/x86_64-unknown-linux-gnu-ld.dir/foo", R_OK|X_OK) = -1 ENOENT (No such file or directory)
target triple prefixes may not make lots of sense for clang tools
)

I don't think the extra 'stat' call is particularly
significant/important to avoid - is there reason to believe it's
especially costly or problematic compared to the totality of
compiling/linking a program?

So I'm now left wondering what problem is being fixed here - it sounds
like the current LLVM implementation provides the prefix support where
it works, and otherwise provides all the other support. The change
you're proposing is to remove prefix support from all but the 3 common
unix linkers - to remove one extra stat call when linking? Does that
sum up the change?

>Adding the folks from the code review
>
>>
>> >
>> >> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of
>> >> -fuse-ld=
>> >> Some context (from my archaeology) about the option
>> >>
>> >> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for
>> >> -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>> >> * rL194328 ported the feature and made -fuse-ld= available for other
>> >> values (with the ld. prefix)
>> >> * D78290 changed the prefix to ld64. on Darwin.
>> >> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is
>> >> allowed (so our
>> >> behavior change here will not be incompatible with GCC).
>> >>
>> >> Our existing behavior is cumbersome:
>> >>
>> >> * Inconsistency between an absolute
>> >> path (which does not get a ld. prefix) and a relative path (which gets
>> >> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
>> >> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>> >> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>> >> * lld's new Mach-O port needs to pick an executable name. It would be
>> >> nice if -fuse-ld= simply accepts a program name, not necessarily
>> >> prefixed by `ld64.`
>> >>
>> >
>> >To clarify (it seems like the link between "Mach-O lld needs to pick an
>> >executable name" and "it would be nice if -fuse-ld accepts a program name"
>> >isn't quite spelled out): By making this change to -fuse-ld it would enable
>> >the Mach-O version of lld to be called "lld" rather than "ld64.lld"? Is
>> >that the goal?
>>
>> The Mach-O port can be named whatever developers like, no need to
>> coupled with a `ld64.` prefix.
>
>Given the needs of a multiplatform/busyboxed linker like lld, it seems
>likely that any future multiplatform linker would probably use this
>same sort of prefix going forward.
>> I do think of whether -fuse-ld=lld means different things would be a
>> nice property: ld.lld (ELF), wasm-ld (wasm), lld-link (COFF) in
>> different contexts. I realized it is not. The linker options on
>> different binary formats are very different. Targeting different binary
>> formats requires one to customize the linker options anyway, a common
>> -fuse-ld=lld does not save much.
>
>Yep, doesn't seem likely - is the intent to use "-fuse-ld" on non-ELF
>platforms and make it easier/possible to write "-fuse-ld=lld-link"
>instead of having to provide the full path? (or does that already
>work/does using a non-ELF platform already disable the "ld."
>prefixing?)
>
>> (In 2010, it might be cleaner if binutils did not install ld.bfd -> ld
>> and ld.gold -> gold ...... They could juse use -fuse-ld=ld to mean "ld"
>> and -fuse-ld=gold to mean "gold")
>
>Sure enough - though dealing with where we are today is trickier :confused:
>
>Breaking existing users that depend on this generalized prefixing
>seems significant - Sony's the best known/most likely to hit this that
>I know of. Hopefully Paul Robinson can speak to that.

clang::driver::tools::PS4cpu has its own logic
https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/PS4CPU.cpp#L341
It is not affected, but hope Paul can confirm.

>If GCC is willing to make the same change, I don't feel too strongly
>about this - beyond the existing user breakage (which isn't an
>insignificant concern - breaking flag compatibility is something best
>avoided) the only future problem I could see (& given we're only
>currently living with 3 linkers, this isn't a huge deal) is that if
>another linker comes along there will be a subtle difference between
>"-fuse-ld=lld" and "-fuse-ld=newlink", at least because a user might
>be confused about exactly which binary is being used, in the former
>case it's "ld.lld" and in the latter it's "newlink". On top of that if
>that if "newlink" busyboxes, and perhaps the original name "newlink"
>either defaults to one platform (say MachO mode, like ld64.lld), but
>if you want to link for MachO, and busyboxes "ld.newlink" to provide
>the ld-compatible version - now you have to write
>"-fuse-ld=ld.newlink", but you don't have to do that with gold/etc...

GCC does not support -fuse-ld=relative/path or -fuse-ld=/abs/path or
-fuse-ld=word_not_bfd_gold_lld

I do have a patch to make it have a D80225 compatible behavior
[PATCH v3] Add -fuse-ld= to specify an arbitrary executable as the linker

>Eh, such is life, perhaps.
>
>If the desire is just to support relative paths, that could be
>implemented by trying the prefix version - and if that fails, then
>trying it as a relative path? Though I can see value in sort of
>deprecating/baking in the prefix support only for the existing users
>and forcing all new usage to use the actual executable name.
>
>- Dave

A relative path is supported.

Oh, I thought when you said "For a relative path, -fuse-ld=dir/foo
currently tries to access x86_64-unknown-linux-gnu-ld.dir/foo (if
LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu)."

You meant that was the binary it would run and fail - but you don't
mean that, you mean that clang checks if that binary exists, then if
it doesn't, it falls back to using "dir/foo" without a prefix? So it
does what the user expected?

clang checks if x86_64-unknown-linux-gnu-ld.${fuse_ld} exists. With
this patch, clang still checks x86_64-unknown-linux-gnu-${fuse_ld} (we
just removed an arbitrary prefix). We shall think whether this behavior
makes sense and remove it if it doesn't. This can be done as a
follow-up.

The desire is to eliminate differences
between "filename" and "relative/filename". The platform specific prefix
rule (well, sometimes suffix, e.g. lld-link) is also hard to remember.

Given you don't /have/ to remember it, that doesn't seem like a
significant cost to users?

"trying it as a relative path" can gratuitously add more `access`
syscalls. I initially tried this idea for my GCC patch. Martin Liška pointed out
the problem.

(Note, we still have such a problem
access("/tmp/Debug/bin/x86_64-unknown-linux-gnu-ld.dir/foo", R_OK|X_OK) = -1 ENOENT (No such file or directory)
target triple prefixes may not make lots of sense for clang tools
)

I don't think the extra 'stat' call is particularly
significant/important to avoid - is there reason to believe it's
especially costly or problematic compared to the totality of
compiling/linking a program?

So I'm now left wondering what problem is being fixed here - it sounds
like the current LLVM implementation provides the prefix support where
it works, and otherwise provides all the other support. The change
you're proposing is to remove prefix support from all but the 3 common
unix linkers - to remove one extra stat call when linking? Does that
sum up the change?

Advantages:

* To drop an arbitrary prefix/suffix on some platforms.
   For lld in particular, we can pick a better name if people don't like
   the "ld64." prefix
* To make the rule more consistent. -fuse-ld={bfd,gold,lld} are
   hard-coded (unfortunate historical behavior), otherwise consistent.

It concerns me that we currently claim to accept arbitrary file paths for -fuse-ld=, yet, we sometimes hardcode special behavior for certain argument values. (Special behavior beyond simply changing which executable name to search for, that is). If we’re going to have such conditionals, GCC’s behavior of allowing only certain enumerated values would seem to me to be better.

However, currently, all of the cases in llvm appear to be either related to Windows toolchains or PS4 toolchain. E.g. here https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890 and https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424 (just search for OPT_fuse_ld_EQ to find others.)

And the PS4 toolchain driver code already prohibits any -fuse-ld= other than the exact strings “ps4” or “gold”, so there’s no weird behavior changes, only a clear (yet perhaps unexpected) error message. So, maybe we should do the same for windows? But it’s rather unfortunate for clang to inconsistently sometimes allow paths as the argument to -fuse-ld= and sometimes not…

In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of -fuse-ld=
Some context (from my archaeology) about the option

* 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
* rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
* D78290 changed the prefix to ld64. on Darwin.
* GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
   behavior change here will not be incompatible with GCC).

Our existing behavior is cumbersome:

* Inconsistency between an absolute
   path (which does not get a ld. prefix) and a relative path (which gets
   a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
   access x86_64-unknown-linux-gnu-ld.dir/foo (if
   LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
* lld's new Mach-O port needs to pick an executable name. It would be
   nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by `ld64.`

I suggest we just hard code the currently used values which intend to get
a prefix: bfd, gold, lld. For all other values, don't add a prefix.
(PS4 seems to use -fuse-ld=ps4 but it overrides ToolChain::GetLinkerPath, thus unaffected)

Thoughts?

It concerns me that we currently claim to accept arbitrary file paths for -fuse-ld=, yet, we sometimes hardcode special behavior for certain argument values. (Special behavior beyond simply changing which executable name to search for, that is). If we're going to have such conditionals, GCC's behavior of allowing only certain enumerated values would seem to me to be better.

However, currently, all of the cases in llvm appear to be either related to Windows toolchains or PS4 toolchain. E.g. here https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890 and https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424 (just search for OPT_fuse_ld_EQ to find others.)

Sounds like fixing those is independent of the change you're
proposing, though? I'm not sure I see it as related to this proposal.

And the PS4 toolchain driver code already prohibits any -fuse-ld= other than the exact strings "ps4" or "gold", so there's no weird behavior changes, only a clear (yet perhaps unexpected) error message. So, maybe we should do the same for windows? But it's rather unfortunate for clang to inconsistently sometimes allow paths as the argument to -fuse-ld= and sometimes not...

Not sure I follow these pieces. Re: PS4, it's good to know it's not an
issue/wouldn't be affected by this change. But that doesn't mean other
users out there aren't benefiting from/relying on the ld. prefix -
it's functionality LLVM has shipped, users can use, etc, and we would
be changing that out from under them which I think is always something
that should be considered carefully as to what we gain by changing
user-facing behavior.

Windows: Not sure what part of this discussion would motivate changing
windows to disallow the arbitrary linker path compared to allowing it
on Linux.

- Dave

In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of -fuse-ld=
Some context (from my archaeology) about the option

* 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
* rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
* D78290 changed the prefix to ld64. on Darwin.
* GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
   behavior change here will not be incompatible with GCC).

Our existing behavior is cumbersome:

* Inconsistency between an absolute
   path (which does not get a ld. prefix) and a relative path (which gets
   a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
   access x86_64-unknown-linux-gnu-ld.dir/foo (if
   LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
* lld's new Mach-O port needs to pick an executable name. It would be
   nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by `ld64.`

I suggest we just hard code the currently used values which intend to get
a prefix: bfd, gold, lld. For all other values, don't add a prefix.
(PS4 seems to use -fuse-ld=ps4 but it overrides ToolChain::GetLinkerPath, thus unaffected)

Thoughts?

It concerns me that we currently claim to accept arbitrary file paths for -fuse-ld=, yet, we sometimes hardcode special behavior for certain argument values. (Special behavior beyond simply changing which executable name to search for, that is). If we're going to have such conditionals, GCC's behavior of allowing only certain enumerated values would seem to me to be better.

However, currently, all of the cases in llvm appear to be either related to Windows toolchains or PS4 toolchain. E.g. here https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890 and https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424 (just search for OPT_fuse_ld_EQ to find others.)

Sounds like fixing those is independent of the change you're
proposing, though? I'm not sure I see it as related to this proposal.

It is independent.

And the PS4 toolchain driver code already prohibits any -fuse-ld= other than the exact strings "ps4" or "gold", so there's no weird behavior changes, only a clear (yet perhaps unexpected) error message. So, maybe we should do the same for windows? But it's rather unfortunate for clang to inconsistently sometimes allow paths as the argument to -fuse-ld= and sometimes not...

Not sure I follow these pieces. Re: PS4, it's good to know it's not an
issue/wouldn't be affected by this change. But that doesn't mean other
users out there aren't benefiting from/relying on the ld. prefix -
it's functionality LLVM has shipped, users can use, etc, and we would
be changing that out from under them which I think is always something
that should be considered carefully as to what we gain by changing
user-facing behavior.

If they use a relative path, they can't be benefiting from ld. prefix

With the magic ld. prefix,

   -fuse-ld=path1/path2/foo -> ld.path1/path2/foo

If they use an absolute path, they are unaffected.

If they use a bare word, e.g. -fuse-ld=meow , they need to change it to
-fuse-ld=ld.meow now.

(1) GCC does not support arbitrary -fuse-ld=bare-word
(2) absolute/relative paths are not affected
(3) bare word can easily be adapted

=> conclusion: if this (ever) affects any user, the troublesome should be small enough.

The RFC serves the purpose seeking for any such (proprietary linker)
user. Such objection hasn't emerged yet. If they don't speak up, I am
not sure we should wait indefinitely making -fuse-ld= sane and
consistent.

>>
>>
>>
>>>
>>> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of -fuse-ld=
>>> Some context (from my archaeology) about the option
>>>
>>> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>>> * rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
>>> * D78290 changed the prefix to ld64. on Darwin.
>>> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
>>> behavior change here will not be incompatible with GCC).
>>>
>>> Our existing behavior is cumbersome:
>>>
>>> * Inconsistency between an absolute
>>> path (which does not get a ld. prefix) and a relative path (which gets
>>> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
>>> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>>> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>>> * lld's new Mach-O port needs to pick an executable name. It would be
>>> nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by `ld64.`
>>>
>>> I suggest we just hard code the currently used values which intend to get
>>> a prefix: bfd, gold, lld. For all other values, don't add a prefix.
>>> (PS4 seems to use -fuse-ld=ps4 but it overrides ToolChain::GetLinkerPath, thus unaffected)
>>>
>>> Thoughts?
>>
>>
>> It concerns me that we currently claim to accept arbitrary file paths for -fuse-ld=, yet, we sometimes hardcode special behavior for certain argument values. (Special behavior beyond simply changing which executable name to search for, that is). If we're going to have such conditionals, GCC's behavior of allowing only certain enumerated values would seem to me to be better.
>>
>> However, currently, all of the cases in llvm appear to be either related to Windows toolchains or PS4 toolchain. E.g. here https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890 and https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424 (just search for OPT_fuse_ld_EQ to find others.)
>
>Sounds like fixing those is independent of the change you're
>proposing, though? I'm not sure I see it as related to this proposal.

It is independent.

>> And the PS4 toolchain driver code already prohibits any -fuse-ld= other than the exact strings "ps4" or "gold", so there's no weird behavior changes, only a clear (yet perhaps unexpected) error message. So, maybe we should do the same for windows? But it's rather unfortunate for clang to inconsistently sometimes allow paths as the argument to -fuse-ld= and sometimes not...
>
>Not sure I follow these pieces. Re: PS4, it's good to know it's not an
>issue/wouldn't be affected by this change. But that doesn't mean other
>users out there aren't benefiting from/relying on the ld. prefix -
>it's functionality LLVM has shipped, users can use, etc, and we would
>be changing that out from under them which I think is always something
>that should be considered carefully as to what we gain by changing
>user-facing behavior.

If they use a relative path, they can't be benefiting from ld. prefix

With the magic ld. prefix,

   -fuse-ld=path1/path2/foo -> ld.path1/path2/foo

OK, I'm confused - it seems we've gone back and forth on this, or at
least I have, in understanding what happens with relative paths.

You're saying, currently relative paths always get the prefix? Or does
clang test if "ld.path1/path2/foo" exists, and if it doesn't exist it
falls back to "path1/path2/foo"? I'd say that's "fairly supported",
but not ideal - we could test if the path has any path separators in
it, then never use the prefix if it does. That way it doesn't
accidentally pick up a weird path if it does exist. Yes, this is a
break in functionality for some potential users - but I think the
possibility of a user intentionally using "ld.path1/path2/foo" is
small compared to the possibility of a user using "ld.foo"
intentionally (since that's already shown as intentional with bfd,
gold, and lld - so a user might see that and add their own in a
similar way).

If they use an absolute path, they are unaffected.

If they use a bare word, e.g. -fuse-ld=meow , they need to change it to
-fuse-ld=ld.meow now.

(1) GCC does not support arbitrary -fuse-ld=bare-word
(2) absolute/relative paths are not affected

^ this confuses me. You said above that " -fuse-ld=path1/path2/foo ->
ld.path1/path2/foo" is happening, so removing the prefix seems like it
would affect relative paths. But you're saying it won't?

(3) bare word can easily be adapted

=> conclusion: if this (ever) affects any user, the troublesome should be small enough.

My concern would be mostly for users where it might silently pick a
different linker or get different linker semantics. Yes, that's
probably a fairly small set of users, if any. (a user with "ld.foo"
being a different version than "foo" would be surprising, and a user
with "foo" providing different semantics than "ld.foo" might be more
likely but still small (a busyboxed linker that doesn't default to the
semantics of the system they're on - eg: imagine if "lld2" did COFF By
default, but did ELF when using "ld.lld2" - changing "-fuse-ld=lld2"
from running "ld.lld2" to running COFF lld would be pretty
surprising/confusing/etc)) But, yes, if you got the wrong linker
semantics, it'd probably be broken enough you wouldn't get far/would
have to go figure out what went wrong and add the "ld." to your
-fuse-ld command.

The RFC serves the purpose seeking for any such (proprietary linker)
user. Such objection hasn't emerged yet. If they don't speak up, I am
not sure we should wait indefinitely making -fuse-ld= sane and
consistent.

I don't think it's not sane, really.

And there are many more LLVM users than read these mailing lists,
unfortunately - some amount of "this feature is in the wild and may be
used by any number of users in interesting ways" is something we
should consider both when creating features and changing them.

- David

>>
>>>
>>> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of -fuse-ld=
>>> Some context (from my archaeology) about the option
>>>
>>> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>>> * rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
>>> * D78290 changed the prefix to ld64. on Darwin.
>>> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
>>> behavior change here will not be incompatible with GCC).
>>>
>>> Our existing behavior is cumbersome:
>>>
>>> * Inconsistency between an absolute
>>> path (which does not get a ld. prefix) and a relative path (which gets
>>> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
>>> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>>> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>>> * lld's new Mach-O port needs to pick an executable name. It would be
>>> nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by `ld64.`
>>>
>>> I suggest we just hard code the currently used values which intend to get
>>> a prefix: bfd, gold, lld. For all other values, don't add a prefix.
>>> (PS4 seems to use -fuse-ld=ps4 but it overrides ToolChain::GetLinkerPath, thus unaffected)
>>>
>>> Thoughts?
>>
>> It concerns me that we currently claim to accept arbitrary file paths for -fuse-ld=, yet, we sometimes hardcode special behavior for certain argument values. (Special behavior beyond simply changing which executable name to search for, that is). If we're going to have such conditionals, GCC's behavior of allowing only certain enumerated values would seem to me to be better.
>>
>> However, currently, all of the cases in llvm appear to be either related to Windows toolchains or PS4 toolchain. E.g. here https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890 and https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424 (just search for OPT_fuse_ld_EQ to find others.)
>
>Sounds like fixing those is independent of the change you're
>proposing, though? I'm not sure I see it as related to this proposal.

It is independent.

>> And the PS4 toolchain driver code already prohibits any -fuse-ld= other than the exact strings "ps4" or "gold", so there's no weird behavior changes, only a clear (yet perhaps unexpected) error message. So, maybe we should do the same for windows? But it's rather unfortunate for clang to inconsistently sometimes allow paths as the argument to -fuse-ld= and sometimes not...
>
>Not sure I follow these pieces. Re: PS4, it's good to know it's not an
>issue/wouldn't be affected by this change. But that doesn't mean other
>users out there aren't benefiting from/relying on the ld. prefix -
>it's functionality LLVM has shipped, users can use, etc, and we would
>be changing that out from under them which I think is always something
>that should be considered carefully as to what we gain by changing
>user-facing behavior.

If they use a relative path, they can't be benefiting from ld. prefix

With the magic ld. prefix,

   -fuse-ld=path1/path2/foo -> ld.path1/path2/foo

OK, I'm confused - it seems we've gone back and forth on this, or at
least I have, in understanding what happens with relative paths.

You're saying, currently relative paths always get the prefix? Or does
clang test if "ld.path1/path2/foo" exists, and if it doesn't exist it
falls back to "path1/path2/foo"?

No fallback. A relative path does not really work. See below.

I'd say that's "fairly supported",
but not ideal - we could test if the path has any path separators in
it, then never use the prefix if it does. That way it doesn't
accidentally pick up a weird path if it does exist. Yes, this is a
break in functionality for some potential users - but I think the
possibility of a user intentionally using "ld.path1/path2/foo" is
small compared to the possibility of a user using "ld.foo"
intentionally (since that's already shown as intentional with bfd,
gold, and lld - so a user might see that and add their own in a
similar way).

-fuse-ld=/abs/path only stats the absolute path /abs/path

-fuse-ld=word stats x86_64-unknown-linux-gnu-ld.word , then
ld.word .

-fuse-ld=dir/path stats x86_64-unknown-linux-gnu-ld.dir/path

x86_64-unknown-linux-gnu-ld.dir is unlikely to be a directory.

If they use an absolute path, they are unaffected.

If they use a bare word, e.g. -fuse-ld=meow , they need to change it to
-fuse-ld=ld.meow now.

(1) GCC does not support arbitrary -fuse-ld=bare-word
(2) absolute/relative paths are not affected

^ this confuses me. You said above that " -fuse-ld=path1/path2/foo ->
ld.path1/path2/foo" is happening, so removing the prefix seems like it
would affect relative paths. But you're saying it won't?

An absolute path is unaffected by the patch.

A relative path does not reall work (diverged a lot from users' expection). Changing its behavior should not matter.

(3) bare word can easily be adapted

=> conclusion: if this (ever) affects any user, the troublesome should be small enough.

My concern would be mostly for users where it might silently pick a
different linker or get different linker semantics. Yes, that's
probably a fairly small set of users, if any. (a user with "ld.foo"
being a different version than "foo" would be surprising, and a user
with "foo" providing different semantics than "ld.foo" might be more
likely but still small (a busyboxed linker that doesn't default to the
semantics of the system they're on - eg: imagine if "lld2" did COFF By
default, but did ELF when using "ld.lld2" - changing "-fuse-ld=lld2"
from running "ld.lld2" to running COFF lld would be pretty
surprising/confusing/etc)) But, yes, if you got the wrong linker
semantics, it'd probably be broken enough you wouldn't get far/would
have to go figure out what went wrong and add the "ld." to your
-fuse-ld command.

Getting different linker semantics will fail immediately. The magic is
incorrect -> the linker should immediately report an error.
It is difficult to cause the linker to silently accept invalid input.

The RFC serves the purpose seeking for any such (proprietary linker)
user. Such objection hasn't emerged yet. If they don't speak up, I am
not sure we should wait indefinitely making -fuse-ld= sane and
consistent.

I don't think it's not sane, really.

And there are many more LLVM users than read these mailing lists,
unfortunately - some amount of "this feature is in the wild and may be
used by any number of users in interesting ways" is something we
should consider both when creating features and changing them.

- David

I think I have done anything I can to make the impact as small as
possible. We can't indefinitely support hypothetical use cases because
we simply can't.

Last but not the least, ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values does not deprive
their rights to contribute their own -fuse-ld=word . They can still make
the change

http://clang.llvm.org/get_involved.html

>>
>> >>
>> >>
>> >>
>> >>>
>> >>> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics of -fuse-ld=
>> >>> Some context (from my archaeology) about the option
>> >>>
>> >>> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>> >>> * rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
>> >>> * D78290 changed the prefix to ld64. on Darwin.
>> >>> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
>> >>> behavior change here will not be incompatible with GCC).
>> >>>
>> >>> Our existing behavior is cumbersome:
>> >>>
>> >>> * Inconsistency between an absolute
>> >>> path (which does not get a ld. prefix) and a relative path (which gets
>> >>> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
>> >>> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>> >>> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>> >>> * lld's new Mach-O port needs to pick an executable name. It would be
>> >>> nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by `ld64.`
>> >>>
>> >>> I suggest we just hard code the currently used values which intend to get
>> >>> a prefix: bfd, gold, lld. For all other values, don't add a prefix.
>> >>> (PS4 seems to use -fuse-ld=ps4 but it overrides ToolChain::GetLinkerPath, thus unaffected)
>> >>>
>> >>> Thoughts?
>> >>
>> >>
>> >> It concerns me that we currently claim to accept arbitrary file paths for -fuse-ld=, yet, we sometimes hardcode special behavior for certain argument values. (Special behavior beyond simply changing which executable name to search for, that is). If we're going to have such conditionals, GCC's behavior of allowing only certain enumerated values would seem to me to be better.
>> >>
>> >> However, currently, all of the cases in llvm appear to be either related to Windows toolchains or PS4 toolchain. E.g. here https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890 and https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424 (just search for OPT_fuse_ld_EQ to find others.)
>> >
>> >Sounds like fixing those is independent of the change you're
>> >proposing, though? I'm not sure I see it as related to this proposal.
>>
>> It is independent.
>>
>> >> And the PS4 toolchain driver code already prohibits any -fuse-ld= other than the exact strings "ps4" or "gold", so there's no weird behavior changes, only a clear (yet perhaps unexpected) error message. So, maybe we should do the same for windows? But it's rather unfortunate for clang to inconsistently sometimes allow paths as the argument to -fuse-ld= and sometimes not...
>> >
>> >Not sure I follow these pieces. Re: PS4, it's good to know it's not an
>> >issue/wouldn't be affected by this change. But that doesn't mean other
>> >users out there aren't benefiting from/relying on the ld. prefix -
>> >it's functionality LLVM has shipped, users can use, etc, and we would
>> >be changing that out from under them which I think is always something
>> >that should be considered carefully as to what we gain by changing
>> >user-facing behavior.
>>
>> If they use a relative path, they can't be benefiting from ld. prefix
>>
>> With the magic ld. prefix,
>>
>> -fuse-ld=path1/path2/foo -> ld.path1/path2/foo
>
>OK, I'm confused - it seems we've gone back and forth on this, or at
>least I have, in understanding what happens with relative paths.
>
>You're saying, currently relative paths always get the prefix? Or does
>clang test if "ld.path1/path2/foo" exists, and if it doesn't exist it
>falls back to "path1/path2/foo"?

No fallback. A relative path does not really work. See below.

Ah, OK - thanks for the clarification.

>I'd say that's "fairly supported",
>but not ideal - we could test if the path has any path separators in
>it, then never use the prefix if it does. That way it doesn't
>accidentally pick up a weird path if it does exist. Yes, this is a
>break in functionality for some potential users - but I think the
>possibility of a user intentionally using "ld.path1/path2/foo" is
>small compared to the possibility of a user using "ld.foo"
>intentionally (since that's already shown as intentional with bfd,
>gold, and lld - so a user might see that and add their own in a
>similar way).

-fuse-ld=/abs/path only stats the absolute path /abs/path

-fuse-ld=word stats x86_64-unknown-linux-gnu-ld.word , then
ld.word .

-fuse-ld=dir/path stats x86_64-unknown-linux-gnu-ld.dir/path

x86_64-unknown-linux-gnu-ld.dir is unlikely to be a directory.

More interested in which binary it runs - but I guess when you say
"stats" you mean "Checks if it exists, and if it does, runs it as the
linker"?

>> If they use an absolute path, they are unaffected.
>>
>> If they use a bare word, e.g. -fuse-ld=meow , they need to change it to
>> -fuse-ld=ld.meow now.
>>
>> (1) GCC does not support arbitrary -fuse-ld=bare-word
>> (2) absolute/relative paths are not affected
>
>^ this confuses me. You said above that " -fuse-ld=path1/path2/foo ->
>ld.path1/path2/foo" is happening, so removing the prefix seems like it
>would affect relative paths. But you're saying it won't?

An absolute path is unaffected by the patch.

A relative path does not reall work (diverged a lot from users' expection). Changing its behavior should not matter.

>> (3) bare word can easily be adapted
>>
>> => conclusion: if this (ever) affects any user, the troublesome should be small enough.
>
>My concern would be mostly for users where it might silently pick a
>different linker or get different linker semantics. Yes, that's
>probably a fairly small set of users, if any. (a user with "ld.foo"
>being a different version than "foo" would be surprising, and a user
>with "foo" providing different semantics than "ld.foo" might be more
>likely but still small (a busyboxed linker that doesn't default to the
>semantics of the system they're on - eg: imagine if "lld2" did COFF By
>default, but did ELF when using "ld.lld2" - changing "-fuse-ld=lld2"
>from running "ld.lld2" to running COFF lld would be pretty
>surprising/confusing/etc)) But, yes, if you got the wrong linker
>semantics, it'd probably be broken enough you wouldn't get far/would
>have to go figure out what went wrong and add the "ld." to your
>-fuse-ld command.

Getting different linker semantics will fail immediately. The magic is
incorrect -> the linker should immediately report an error.
It is difficult to cause the linker to silently accept invalid input.

Fair point - though still seems like a somewhat confusing thing for
users, but probably a fairly small slice of users.

>> The RFC serves the purpose seeking for any such (proprietary linker)
>> user. Such objection hasn't emerged yet. If they don't speak up, I am
>> not sure we should wait indefinitely making -fuse-ld= sane and
>> consistent.
>
>I don't think it's not sane, really.
>
>And there are many more LLVM users than read these mailing lists,
>unfortunately - some amount of "this feature is in the wild and may be
>used by any number of users in interesting ways" is something we
>should consider both when creating features and changing them.
>
>- David

I think I have done anything I can to make the impact as small as
possible. We can't indefinitely support hypothetical use cases because
we simply can't.

This feature doesn't seem particularly costly to support, though?

If the original goal was to support relative paths - I think a
narrower fix of testing whether the path contains path separators &
only adding the prefix if it doesn't have path separators would be
workable. (but, sure, possibly not worth bothering with)
If one of the goals is to get consistency with GCC and GCC does not
want to implement the generalized prefix support (was that proposed to
and rejected by GCC?), yeah - I'd be OK with breaking the
small/hypothetical use case to gain compatibility with GCC, for
instance.

Perhaps you could summarize the motivations/goals/path/tradeoffs?
(where'd you start, what proposals did you make to what communities &
how did that inform/change the direction, if at all, etc)

Last but not the least, ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values does not deprive
their rights to contribute their own -fuse-ld=word . They can still make
the change

Clang - Get Involved

My concern would be that would be another potential break/change for
users that could make things more confusing. It'd be good to make
choices we can be fairly committed to going forward. (which, I don't
think what your proposing is impossible to commit to - but I have some
doubts/want to check what the alternatives might be)

This depends on how they specify the path. If they specify a path for the linker via -fuse-ld, they don't get the prefix. Most cross-compile toolchains that I've seen don't do that. Instead, they use -B to specify a path to find *all* of the tools and then use -fuse-ld={lld,gold,bfd,proprietary linker's name}.

That is the use case I'm concerned about breaking. The behaviour of -B is to prepend a search path, but still fall back to the default paths if a tool is not found there. For example, if I have:

/usr/bin/lld
/my/custom/sdk/bin/ld.lld

If I run:

$ clang -B /my/custom/sdk/bin/ -fuse-ld=lld ...

Today, I will run my cross-compile toolchain's linker. With your proposed change, I will run my system LLD install. As a user, I will see very confusing error messages from the system linker if it can't handle my cross-compile target. Worse, I may see a binary linked with a linker that defaults to a different linker script, but still produces some output.

David

This is a good point. The change should take -B into account I think.

-eric

-fuse-ld={bfd,gold,lld} is compatible with the previous behavior.

Do you propose that:

* -fuse-ld={bfd,gold.lld} should find "ld.{bfd,gold,lld}" in COMPILER_PATH
* all -fuse-ld=word should find "word" in COMPILER_PATH
* -fuse-ld=relative/path should find "relative/path" in PATH?

In https://reviews.llvm.org/D80225 I intended to update the semantics of -fuse-ld=
Some context (from my archaeology) about the option

  • https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55470 added support for -fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
  • rL194328 ported the feature and made -fuse-ld= available for other values (with the ld. prefix)
  • D78290 changed the prefix to ld64. on Darwin.
  • GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld is allowed (so our
    behavior change here will not be incompatible with GCC).

Our existing behavior is cumbersome:

  • Inconsistency between an absolute
    path (which does not get a ld. prefix) and a relative path (which gets
    a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries to
    access x86_64-unknown-linux-gnu-ld.dir/foo (if
    LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
  • lld’s new Mach-O port needs to pick an executable name. It would be
    nice if -fuse-ld= simply accepts a program name, not necessarily prefixed by ld64.

I suggest we just hard code the currently used values which intend to get
a prefix: bfd, gold, lld. For all other values, don’t add a prefix.
(PS4 seems to use -fuse-ld=ps4 but it overrides ToolChain::GetLinkerPath, thus unaffected)

Thoughts?

It concerns me that we currently claim to accept arbitrary file paths for -fuse-ld=, yet, we sometimes hardcode special behavior for certain argument values. (Special behavior beyond simply changing which executable name to search for, that is). If we’re going to have such conditionals, GCC’s behavior of allowing only certain enumerated values would seem to me to be better.

However, currently, all of the cases in llvm appear to be either related to Windows toolchains or PS4 toolchain. E.g. here https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890 and https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424 (just search for OPT_fuse_ld_EQ to find others.)

Sounds like fixing those is independent of the change you’re
proposing, though? I’m not sure I see it as related to this proposal.

Yes, this is somewhat independent. I bring it up because this proposal is to change the existing behavior of -fuse-ld=word, in order to make it more consistent with -fuse-ld=path. And also there was a suggestion that GCC ought to adopt our behavior.

But, if there’s a requirement to understand which kind of linker the driver is invoking – which apparently there is – this seems a wrong direction to go in. And maybe we should instead be deprecating -fuse-ld=PATH. It doesn’t seem clear to me that there is really a fix for this. If the driver needs to know what kind of linker it’s invoking, then I don’t see how it can ever work properly to allow arbitrary values.

And the PS4 toolchain driver code already prohibits any -fuse-ld= other than the exact strings “ps4” or “gold”, so there’s no weird behavior changes, only a clear (yet perhaps unexpected) error message. So, maybe we should do the same for windows? But it’s rather unfortunate for clang to inconsistently sometimes allow paths as the argument to -fuse-ld= and sometimes not…

Not sure I follow these pieces. Re: PS4, it’s good to know it’s not an
issue/wouldn’t be affected by this change. But that doesn’t mean other
users out there aren’t benefiting from/relying on the ld. prefix -
it’s functionality LLVM has shipped, users can use, etc, and we would
be changing that out from under them which I think is always something
that should be considered carefully as to what we gain by changing
user-facing behavior.

Windows: Not sure what part of this discussion would motivate changing
windows to disallow the arbitrary linker path compared to allowing it
on Linux.

The code in LLVM for Windows and PS4 currently needs to know what linker it’s invoking, for correct behavior. Thus, users who specify -fuse-ld=PATH will get incorrect behavior today, even if PATH represents the same binary as -fuse-ld=lld would choose.

>
>>
>> In ⚙ D80225 [Driver] Recognize -fuse-ld={bfd,gold,lld} but don't prepend "ld." or "ld64." for other values I intended to update the semantics
of -fuse-ld=
>> Some context (from my archaeology) about the option
>>
>> * 55470 – Support -fuse-ld=bfd and -fuse-ld=gold added support for
-fuse-ld=bfd to mean ld.bfd and -fuse-ld=gold to mean ld.gold
>> * rL194328 ported the feature and made -fuse-ld= available for other
values (with the ld. prefix)
>> * D78290 changed the prefix to ld64. on Darwin.
>> * GCC added support for -fuse-ld=lld. No other value than bfd,gold,lld
is allowed (so our
>> behavior change here will not be incompatible with GCC).
>>
>> Our existing behavior is cumbersome:
>>
>> * Inconsistency between an absolute
>> path (which does not get a ld. prefix) and a relative path (which
gets
>> a ld. prefix) For a relative path, -fuse-ld=dir/foo currently tries
to
>> access x86_64-unknown-linux-gnu-ld.dir/foo (if
>> LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu).
>> * lld's new Mach-O port needs to pick an executable name. It would be
>> nice if -fuse-ld= simply accepts a program name, not necessarily
prefixed by `ld64.`
>>
>> I suggest we just hard code the currently used values which intend to
get
>> a prefix: bfd, gold, lld. For all other values, don't add a prefix.
>> (PS4 seems to use -fuse-ld=ps4 but it overrides
ToolChain::GetLinkerPath, thus unaffected)
>>
>> Thoughts?
>
> It concerns me that we currently claim to accept arbitrary file paths
for -fuse-ld=, yet, we sometimes hardcode special behavior for certain
argument values. (Special behavior beyond simply changing which executable
name to search for, that is). If we're going to have such conditionals,
GCC's behavior of allowing only certain enumerated values would seem to me
to be better.
>
> However, currently, all of the cases in llvm appear to be either related
to Windows toolchains or PS4 toolchain. E.g. here
https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/Driver.cpp#L4890
and
https://github.com/llvm/llvm-project/blob/3bb0d95fdc2fffd193d39d14f2ef421d4b468617/clang/lib/Driver/ToolChains/MinGW.cpp#L424
(just search for OPT_fuse_ld_EQ to find others.)

Sounds like fixing those is independent of the change you're
proposing, though? I'm not sure I see it as related to this proposal.

Yes, this is somewhat independent. I bring it up because this proposal is
to change the existing behavior of -fuse-ld=word, in order to make it more
consistent with -fuse-ld=path. And also there was a suggestion that GCC
ought to adopt our behavior.

But, if there's a requirement to understand which kind of linker the driver
is invoking -- which apparently there is -- this seems a wrong direction to
go in. And maybe we should instead be deprecating -fuse-ld=PATH. It doesn't
seem clear to me that there *is* really a fix for this. If the driver needs
to know what kind of linker it's invoking, then I don't see how it can ever
work properly to allow arbitrary values.

-fuse-ld= is currently overridden to mean both "linker flavor" and "linker path".

How about this?

* -fuse-ld=bfd => find ld.bfd in COMPILER_PATH
* -fuse-ld=gold => find ld.gold in COMPILER_PATH
* -fuse-ld=lld => find ld.lld in COMPILER_PATH
* -fuse-ld=other => warning: -fuse-ld=.... is deprecated

The need of specifying a relative/absolute path exists (recent request
⚙ D80660 clang: Add support for relative linker paths with -fuse-ld ), so we need to add an alternative option, for example

* -fld-path=word # find word in PATH
* -fld-path=relative/path # $PWD/relative/path
* -fld-path=/absolute/path # /absolute/path