LLD: Can we make --warn-backrefs the default?

Hi all, LLD's --warn-backrefs is a tool to identify potential
incompatible archive selection semantics with traditional Unix linkers.
I have improved it (via D77522,D77630 and D77512) to a state where a
--warn-backrefs diagnostic almost assuredly means that the link will
fail with GNU ld, or the symbol will get different resolution in GNU ld and LLD.

My conclusion is that --warn-backrefs is a very useful layering check tool.
I just wrote a documentation about the advantage (of GNU ld's archive
selection semantics..... But we can do better with --warn-backrefs!
GNU ld just reports "undefined reference" with no actionable feedback
about the offending archive)

https://reviews.llvm.org/D86762

I am wondering whether in the next release we can make --warn-backrefs
the default. I have added many known users to the review.
(There is no need for --no-warn-backrefs because --warn-backrefs-exclude='*' does the same job)

Would you like to conduct the conversation here, or on the review thread? (I lean towards having them here, but don’t mind if folks feel like it keeps the noise down & want to more post a notice saying “hey, here’s this thing, if you’re interested, go discuss it over there” - more an optional opt-in rather than requiring people to opt-out via muting the thread, etc)

Would you like to conduct the conversation here, or on the review thread? (I lean towards having them here, but don't mind if folks feel like it keeps the noise down & want to more post a notice saying "hey, here's this thing, if you're interested, go discuss it over there" - more an optional opt-in rather than requiring people to opt-out via muting the thread, etc)

Yes, we can conduct the "should we enable --warn-backrefs by default"
conversation here. Since the semantics --warn-backrefs of are a bit
complex, we need a documentation. ⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection) is
put up to get wording suggestions. Explicitly adding the people to the
CC list...

FWIW for many code bases, --warn-backrefs should produce no warnings
(error if --fatal-warnings). For some code bases, GNU ld may error
"undefined reference". --warn-backrefs can catch such problems.

Would you like to conduct the conversation here, or on the review thread? (I lean towards having them here, but don’t mind if folks feel like it keeps the noise down & want to more post a notice saying “hey, here’s this thing, if you’re interested, go discuss it over there” - more an optional opt-in rather than requiring people to opt-out via muting the thread, etc)

Yes, we can conduct the “should we enable --warn-backrefs by default”
conversation here. Since the semantics --warn-backrefs of are a bit
complex, we need a documentation. https://reviews.llvm.org/D86762 is
put up to get wording suggestions. Explicitly adding the people to the
CC list…

FWIW for many code bases, --warn-backrefs should produce no warnings
(error if --fatal-warnings). For some code bases, GNU ld may error
“undefined reference”. --warn-backrefs can catch such problems.

One of the questions raised on the thread there was about different linker semantics. I assume the “–warn-backrefs by default” we’re discussing is only related to the ld.lld frontend? Not the Windows linker lld behavior (or ld64 (old or new) lld behavior)?

>
> Would you like to conduct the conversation here, or on the review thread? (I lean towards having them here, but don't mind if folks feel like it keeps the noise down & want to more post a notice saying "hey, here's this thing, if you're interested, go discuss it over there" - more an optional opt-in rather than requiring people to opt-out via muting the thread, etc)

Yes, we can conduct the "should we enable --warn-backrefs by default"
conversation here. Since the semantics --warn-backrefs of are a bit
complex, we need a documentation. ⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection) is
put up to get wording suggestions. Explicitly adding the people to the
CC list...

FWIW for many code bases, --warn-backrefs should produce no warnings
(error if --fatal-warnings). For some code bases, GNU ld may error
"undefined reference". --warn-backrefs can catch such problems.

One of the questions raised on the thread there was about different linker semantics. I assume the "--warn-backrefs by default" we're discussing is only related to the ld.lld frontend? Not the Windows linker lld behavior (or ld64 (old or new) lld behavior)?

Yes, the ELF port (lld/ELF). No other port has implemented
--warn-backrefs. The intion of --warn-backrefs is to capture behavior
differences with GNU ld's ELF ports. If traditional ELF linkers don't
behave like that, I can replace "traditional ELF linkers" in
⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection) with "GNU linkers".

I see the GNU ld behavior as a limitation, not as a feature, as Peter Smith also pointed out in https://reviews.llvm.org/D86762. While it can be argued that there are certain cases where it can help detect layering violations as you mentioned in your change, I’m not sure how valuable that is in practice. Every case I’ve encountered so far either in Chrome or in Fuchsia was a valid use case, most commonly interceptors. The solution has always been the same, wrap all libraries in --start-group/–stop-group and it’s what most projects do by default to avoid dealing with these issues, see for example Chromium. In our case, compatibility with linkers on other platforms is more important than compatibility with GNU ld, so I’d prefer to keep the current behavior. Projects that care about compatibility with GNU ld can use --warn-backrefs.

I see the GNU ld behavior as a limitation, not as a feature, as Peter Smith
also pointed out in ⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection). While it can be argued
that there are certain cases where it can help detect layering
violations as you mentioned in your change, I'm not sure how valuable that
is in practice. Every case I've encountered so far either in Chrome or in
Fuchsia was a valid use case, most commonly interceptors. The solution
has always been the same, wrap all libraries in --start-group/--stop-group
and it's what most projects do by default to avoid dealing with these
issues, see for example [Chromium](
https://source.chromium.org/chromium/chromium/src/+/master:build/toolchain/gcc_toolchain.gni;l=409).
In our case, compatibility with linkers on other platforms is more
important than compatibility with GNU ld, so I'd prefer to keep the current
behavior. Projects that care about compatibility with GNU ld can use
--warn-backrefs.

I totally understand that some users may not want to deal with GNU ld
compatibility:) I'll then question about Chromium's addition of -z defs:
Issue 843583006: GN: Add -Wl,-z defs on linux and fix errors - Code Review :slight_smile:

-z defs is like a layering checking tool for shared objects while
--warn-backrefs is for archives. For performance, ABI concerns and ease
of deployment, many projects tend to build their own components as
archives instead of shared objects. In this sense --warn-backrefs will
probably be more useful than -z defs.

(
TIL lorder and tsort were created to define an order of archives in
early versions of Unix. tsort background (GNU Coreutils 9.0)
It seems that the article missed the point that proper library layering is still useful
)

I see the GNU ld behavior as a limitation, not as a feature, as Peter Smith
also pointed out in https://reviews.llvm.org/D86762. While it can be argued
that there are certain cases where it can help detect layering
violations as you mentioned in your change, I’m not sure how valuable that
is in practice. Every case I’ve encountered so far either in Chrome or in
Fuchsia was a valid use case, most commonly interceptors. The solution
has always been the same, wrap all libraries in --start-group/–stop-group
and it’s what most projects do by default to avoid dealing with these
issues, see for example Chromium.
In our case, compatibility with linkers on other platforms is more
important than compatibility with GNU ld, so I’d prefer to keep the current
behavior. Projects that care about compatibility with GNU ld can use
–warn-backrefs.

I totally understand that some users may not want to deal with GNU ld
compatibility:) I’ll then question about Chromium’s addition of -z defs:
https://crrev.com/843583006 :slight_smile:

-z defs is like a layering checking tool for shared objects while
–warn-backrefs is for archives. For performance, ABI concerns and ease
of deployment, many projects tend to build their own components as
archives instead of shared objects. In this sense --warn-backrefs will
probably be more useful than -z defs.

(
TIL lorder and tsort were created to define an order of archives in
early versions of Unix. https://www.gnu.org/software/coreutils/manual/html_node/tsort-background.html
It seems that the article missed the point that proper library layering is still useful
)

I’m not a fan of this idea of reframing GNU ld behavior as a “layering checking tool”. It is an incomplete layering checking tool because it does not detect the scenario where, for example, you have the intended dependency graph:

A → B
A → C
B → D
C → D

(resulting in -la -lb -lc -ld) and you have an unexpected dependency B → C. There is already a way to detect layering problems, that detects practical layering problems and not just theoretical ones, which is to link programs that use subsets of the libraries. For example, linking a program that depends only on B would result in detecting the invalid B → C dependency. It’s also worth noting that even that would only detect the layering problem if the program depends on the part of B that depends on C.

A better way to go about achieving layering checking would IMHO be to implement a separate tool (not part of the linker) that is capable of a complete layering check. Such a tool would only depend on symbol table features common to all object formats, so it could probably be implemented generically.

Peter

>I see the GNU ld behavior as a limitation, not as a feature, as Peter
Smith
>also pointed out in ⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection). While it can be
argued
>that there are certain cases where it can help detect layering
>violations as you mentioned in your change, I'm not sure how valuable that
>is in practice. Every case I've encountered so far either in Chrome or in
>Fuchsia was a valid use case, most commonly interceptors. The solution
>has always been the same, wrap all libraries in --start-group/--stop-group
>and it's what most projects do by default to avoid dealing with these
>issues, see for example [Chromium](
>
https://source.chromium.org/chromium/chromium/src/+/master:build/toolchain/gcc_toolchain.gni;l=409
).
>In our case, compatibility with linkers on other platforms is more
>important than compatibility with GNU ld, so I'd prefer to keep the
current
>behavior. Projects that care about compatibility with GNU ld can use
>--warn-backrefs.

I totally understand that some users may not want to deal with GNU ld
compatibility:) I'll then question about Chromium's addition of -z defs:
Issue 843583006: GN: Add -Wl,-z defs on linux and fix errors - Code Review :slight_smile:

-z defs is like a layering checking tool for shared objects while
--warn-backrefs is for archives. For performance, ABI concerns and ease
of deployment, many projects tend to build their own components as
archives instead of shared objects. In this sense --warn-backrefs will
probably be more useful than -z defs.

(
TIL lorder and tsort were created to define an order of archives in
early versions of Unix.
tsort background (GNU Coreutils 9.0)
It seems that the article missed the point that proper library layering is
still useful
)

I'm not a fan of this idea of reframing GNU ld behavior as a "layering
checking tool". It is an incomplete layering checking tool because it does
not detect the scenario where, for example, you have the intended
dependency graph:

A -> B
A -> C
B -> D
C -> D

(resulting in -la -lb -lc -ld) and you have an unexpected dependency B ->
C.

Yes, the GNU ld layering checking behavior is incomplete (yet important
and sufficient if we aim for compatibility).

The build system pick two orders:
-la -lb -lc -ld
-la -lc -lb -ld

There is already a way to detect layering problems, that detects
practical layering problems and not just theoretical ones, which is to link
programs that use subsets of the libraries. For example, linking a program
that depends only on B would result in detecting the invalid B -> C
dependency.

This is actually cumbersome and is explicitly described in ⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection)

If B -> C is not specified,

* If people write B_test ("linking a program that depends only on B"),
   they will notice the dependency issue immediately via the "undefined symbol" diagnostic.
* If such a B_test does not exist. The user may work on a large
   application which depends on B (and transitively on D) but not on A.
   OK, they will get an undefined symbol from C. However, it is not
   appropriate for them to add a dependency on C because they don't use C
   directly. See the "If adding the dependency does not form a cycle:"
   paragraph in D86762.

   If their application actually depends on A (thus they will get the
   dependency on C), their link may or may not succeed with GNU ld,
   depending on whether the build system picks
   -la -lb -lc -ld or -la -lc -lb -ld, and the diagnostic can be very
   confusing "undefined reference to" while C is actually linked.

It's also worth noting that even that would only detect the
layering problem if the program depends on the part of B that depends on C.

A better way to go about achieving layering checking would IMHO be to
implement a separate tool (not part of the linker) that is capable of
a complete layering check. Such a tool would only depend on symbol table
features common to all object formats, so it could probably be implemented
generically.

Peter

A standalone tool will not achieve sufficient ergonomics.
It will read all input files and duplicate the symbol resolution work done by the linker,
which can be slow. (ld.lld --warn-backrefs only imposes negligible
overhead when a lazy object is fetched.)

Supplement: say the application is E, we have the following dependency
set and the link (lD -lB -lD) fail due to an undefined symbol
defined by C (because of the unspecified dependency from B to C):

E -> B
B -> D
C -> D

After adding E -> C, there is chance that --warn-backrefs will still
warn, because of a backward reference from B to C (-lD -lC -lB -lD). The
order topological order (-lD -lB -lC -lD) is good and accepted by GNU ld.
In the warning case, it will give the user actionable feedback on adding
the dependency at the appropriate level (B -> C).
My idea is that this is good enough and works well in practice
(compatible with GNU linkers).

I see the GNU ld behavior as a limitation, not as a feature, as Peter
Smith
also pointed out in https://reviews.llvm.org/D86762. While it can be
argued
that there are certain cases where it can help detect layering
violations as you mentioned in your change, I’m not sure how valuable that
is in practice. Every case I’ve encountered so far either in Chrome or in
Fuchsia was a valid use case, most commonly interceptors. The solution
has always been the same, wrap all libraries in --start-group/–stop-group
and it’s what most projects do by default to avoid dealing with these
issues, see for example [Chromium](

https://source.chromium.org/chromium/chromium/src/+/master:build/toolchain/gcc_toolchain.gni;l=409
).

In our case, compatibility with linkers on other platforms is more
important than compatibility with GNU ld, so I’d prefer to keep the
current
behavior. Projects that care about compatibility with GNU ld can use
–warn-backrefs.

I totally understand that some users may not want to deal with GNU ld
compatibility:) I’ll then question about Chromium’s addition of -z defs:
https://crrev.com/843583006 :slight_smile:

-z defs is like a layering checking tool for shared objects while
–warn-backrefs is for archives. For performance, ABI concerns and ease
of deployment, many projects tend to build their own components as
archives instead of shared objects. In this sense --warn-backrefs will
probably be more useful than -z defs.

(
TIL lorder and tsort were created to define an order of archives in
early versions of Unix.
https://www.gnu.org/software/coreutils/manual/html_node/tsort-background.html
It seems that the article missed the point that proper library layering is
still useful
)

I’m not a fan of this idea of reframing GNU ld behavior as a “layering
checking tool”. It is an incomplete layering checking tool because it does
not detect the scenario where, for example, you have the intended
dependency graph:

A → B
A → C
B → D
C → D

(resulting in -la -lb -lc -ld) and you have an unexpected dependency B →
C.

Yes, the GNU ld layering checking behavior is incomplete (yet important
and sufficient if we aim for compatibility).

As Petr mentioned, only certain users care about this aspect of GNU ld compatibility, and those users can just turn this feature on.

The build system pick two orders:
-la -lb -lc -ld
-la -lc -lb -ld

Unless you have multiple programs linking against A, you’ve just introduced non-determinism in your build tool, which is generally considered to be a Bad Thing.

There is already a way to detect layering problems, that detects
practical layering problems and not just theoretical ones, which is to link
programs that use subsets of the libraries. For example, linking a program
that depends only on B would result in detecting the invalid B → C
dependency.

This is actually cumbersome and is explicitly described in https://reviews.llvm.org/D86762

Right, and as I mention below even that doesn’t catch all cases. My point is that there is already a way to detect “practical” layering problems, defined as “layering problems that cause an undefined symbol error in programs that are linked at the same time as the library is built”.

  • Users who don’t care about GNU ld can link their libraries normally.
  • Users who care about GNU ld can pass --warn-backrefs.

Users who care more about “theoretical” layering problems, such as users who ship prebuilt archive files to customers, will not be satisfied by --warn-backrefs, as this will not catch every possible layering problem before shipping the library, as they will not have a copy of every customer’s program. Instead, they will be better served by a separate tool.

If B → C is not specified,

  • If people write B_test (“linking a program that depends only on B”),
    they will notice the dependency issue immediately via the “undefined symbol” diagnostic.
  • If such a B_test does not exist. The user may work on a large
    application which depends on B (and transitively on D) but not on A.
    OK, they will get an undefined symbol from C. However, it is not
    appropriate for them to add a dependency on C because they don’t use C
    directly. See the “If adding the dependency does not form a cycle:”
    paragraph in D86762.

If their application actually depends on A (thus they will get the
dependency on C), their link may or may not succeed with GNU ld,
depending on whether the build system picks
-la -lb -lc -ld or -la -lc -lb -ld, and the diagnostic can be very
confusing “undefined reference to” while C is actually linked.

It’s also worth noting that even that would only detect the
layering problem if the program depends on the part of B that depends on C.

A better way to go about achieving layering checking would IMHO be to
implement a separate tool (not part of the linker) that is capable of
a complete layering check. Such a tool would only depend on symbol table
features common to all object formats, so it could probably be implemented
generically.

Peter

A standalone tool will not achieve sufficient ergonomics.
It will read all input files and duplicate the symbol resolution work done by the linker,
which can be slow. (ld.lld --warn-backrefs only imposes negligible
overhead when a lazy object is fetched.)

As I mentioned, it would do additional work that the linker is not currently doing, which is necessary to implement a complete layering checking tool, such as reading all archive members out of each archive (whereas linkers only need to read depended-on archive members), and which linkers should not be doing by default exactly for performance reasons. Furthermore, the additional tool can be moved out of the critical edit-build-run path, unlike a linker based tool, which should improve performance as well.

An additional tool would give the flexibility of allowing the interface to specify the actual dependencies instead of just giving us only what we can achieve as a result of historical design decisions. It would free build tools like cmake or gn from needing to topologically sort libraries in order to implement layering checking; instead, they can simply dump their dependency graph.

IMHO, selling this feature as a layering checker is worse than having no layering checker at all, because it will mislead users into thinking “oh, I’m using lld’s layering checker, that must mean that my program is properly layered”.

Peter

>
>> >I see the GNU ld behavior as a limitation, not as a feature, as Peter
>> Smith
>> >also pointed out in ⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection). While it can be
>> argued
>> >that there are certain cases where it can help detect layering
>> >violations as you mentioned in your change, I'm not sure how valuable
that
>> >is in practice. Every case I've encountered so far either in Chrome or
in
>> >Fuchsia was a valid use case, most commonly interceptors. The solution
>> >has always been the same, wrap all libraries in
--start-group/--stop-group
>> >and it's what most projects do by default to avoid dealing with these
>> >issues, see for example [Chromium](
>> >
>>
https://source.chromium.org/chromium/chromium/src/+/master:build/toolchain/gcc_toolchain.gni;l=409
>> ).
>> >In our case, compatibility with linkers on other platforms is more
>> >important than compatibility with GNU ld, so I'd prefer to keep the
>> current
>> >behavior. Projects that care about compatibility with GNU ld can use
>> >--warn-backrefs.
>>
>> I totally understand that some users may not want to deal with GNU ld
>> compatibility:) I'll then question about Chromium's addition of -z defs:
>> Issue 843583006: GN: Add -Wl,-z defs on linux and fix errors - Code Review :slight_smile:
>>
>> -z defs is like a layering checking tool for shared objects while
>> --warn-backrefs is for archives. For performance, ABI concerns and ease
>> of deployment, many projects tend to build their own components as
>> archives instead of shared objects. In this sense --warn-backrefs will
>> probably be more useful than -z defs.
>>
>> (
>> TIL lorder and tsort were created to define an order of archives in
>> early versions of Unix.
>>
tsort background (GNU Coreutils 9.0)
>> It seems that the article missed the point that proper library layering
is
>> still useful
>> )
>>
>
>I'm not a fan of this idea of reframing GNU ld behavior as a "layering
>checking tool". It is an incomplete layering checking tool because it does
>not detect the scenario where, for example, you have the intended
>dependency graph:
>
>A -> B
>A -> C
>B -> D
>C -> D
>
>(resulting in -la -lb -lc -ld) and you have an unexpected dependency B ->
>C.

Yes, the GNU ld layering checking behavior is incomplete (yet important
and sufficient if we aim for compatibility).

As Petr mentioned, only certain users care about this aspect of GNU ld
compatibility, and those users can just turn this feature on.

OK, I will consider you and Petr are two users not caring about this aspect of GNU ld.

If the other side turns out to be overwhelming: the users who don't want this
feature can turn it off (--no-warn-backrefs).

The build system pick two orders:

-la -lb -lc -ld
-la -lc -lb -ld

Unless you have multiple programs linking against A, you've just introduced
non-determinism in your build tool, which is generally considered to be a
Bad Thing.

>There is already a way to detect layering problems, that detects
>practical layering problems and not just theoretical ones, which is to
link
>programs that use subsets of the libraries. For example, linking a program
>that depends only on B would result in detecting the invalid B -> C
>dependency.

This is actually cumbersome and is explicitly described in
⚙ D86762 [ELF] Add documentation for --warn-backrefs: a GNU ld compatibility checking tool (and lesser of layering detection)

Right, and as I mention below even that doesn't catch all cases. My point
is that there is already a way to detect "practical" layering problems,
defined as "layering problems that cause an undefined symbol error in
programs that are linked at the same time as the library is built".

- Users who don't care about GNU ld can link their libraries normally.
- Users who care about GNU ld can pass --warn-backrefs.

Users who care more about "theoretical" layering problems, such as users
who ship prebuilt archive files to customers, will not be satisfied by
--warn-backrefs, as this will not catch every possible layering problem
before shipping the library, as they will not have a copy of every
customer's program. Instead, they will be better served by a separate tool.

If B -> C is not specified,

* If people write B_test ("linking a program that depends only on B"),
   they will notice the dependency issue immediately via the "undefined
symbol" diagnostic.
* If such a B_test does not exist. The user may work on a large
   application which depends on B (and transitively on D) but not on A.
   OK, they will get an undefined symbol from C. However, it is not
   appropriate for them to add a dependency on C because they don't use C
   directly. See the "If adding the dependency does not form a cycle:"
   paragraph in D86762.

   If their application actually depends on A (thus they will get the
   dependency on C), their link may or may not succeed with GNU ld,
   depending on whether the build system picks
   -la -lb -lc -ld or -la -lc -lb -ld, and the diagnostic can be very
   confusing "undefined reference to" while C is actually linked.

>It's also worth noting that even that would only detect the
>layering problem if the program depends on the part of B that depends on
C.

>A better way to go about achieving layering checking would IMHO be to
>implement a separate tool (not part of the linker) that is capable of
>a complete layering check. Such a tool would only depend on symbol table
>features common to all object formats, so it could probably be implemented
>generically.
>
>Peter

A standalone tool will not achieve sufficient ergonomics.
It will read all input files and duplicate the symbol resolution work done
by the linker,
which can be slow. (ld.lld --warn-backrefs only imposes negligible
overhead when a lazy object is fetched.)

As I mentioned, it would do additional work that the linker is not
currently doing, which is necessary to implement a complete layering
checking tool, such as reading all archive members out of each archive
(whereas linkers only need to read depended-on archive members), and which
linkers should not be doing by default exactly for performance reasons.
Furthermore, the additional tool can be moved out of the critical
edit-build-run path, unlike a linker based tool, which should improve
performance as well.

If the build system uses --start-lib more than archives, making the linker do
the additional check may have little additional overhead. Lazy object files
don't have the archive symbol table, which was much useful in the old days but
limited in today's viewpoint. LLD already has quite a bit logic iterating the
symbol table, doing one or two more things may not affect overall performance.

An additional tool would give the flexibility of allowing the interface to
specify the actual dependencies instead of just giving us only what we can
achieve as a result of historical design decisions. It would free build
tools like cmake or gn from needing to topologically sort libraries in
order to implement layering checking; instead, they can simply dump their
dependency graph.

IMHO, selling this feature as a layering checker is worse than having no
layering checker at all, because it will mislead users into thinking "oh,
I'm using lld's layering checker, that must mean that my program is
properly layered".

Peter

The option name is still "--warn-backrefs", not "--check-layering". The selling
point is more about the compatibility checker. While it does incomplete layering
checking, it is an almost full GNU ld compatibility checking tool. I have
personally checked dozens of executables with --warn-backrefs problems. Most
don't link with GNU ld and the few remaining cases are whether symbol resolution
will pick different definitions with LLD and GNU ld.

I wanted to chime in and say that I think we should keep the current default too, for three reasons:

  1. The current default is more user friendly. Users shouldn’t have to worry about if they pass -lpthread before or after their .o files (…or other libraries. I know I know for pthread it’s -pthread and the driver will figure it out). https://stackoverflow.com/questions/45135/why-does-the-order-in-which-libraries-are-linked-sometimes-cause-errors-in-gcc is a huge FAQ and a cause of confusion for beginners. LLD’s current defaults are much friendlier.

  2. The current default is more self-consistent: It’s consistent with ld64, link.exe, and hence lld-link and (new) lld.ld64.

  3. As Peter says correctly, using this for layering isn’t a great fit since it only checks a linearized topological sort of the build graph nodes. A good check needs to work on a higher abstraction level.

You seem to focus on the use case where compiler and linker commands are generated by some meta build system. I think humans and bash scripts also do a lot of building, and for that case ease-of-use matters much more. Experts who use meta build systems will have a much easier time opting in to this than new folks will have to opting out of it. (And even then, given 2, I think opting in to this is only attractive to non-win non-mac shops, and even then, given 3, I’m not sure it’s all that useful there either.)

Nico

I wanted to chime in and say that I think we should keep the current default too, for three reasons:

  1. The current default is more user friendly. Users shouldn’t have to worry about if they pass -lpthread before or after their .o files (…or other libraries. I know I know for pthread it’s -pthread and the driver will figure it out). https://stackoverflow.com/questions/45135/why-does-the-order-in-which-libraries-are-linked-sometimes-cause-errors-in-gcc is a huge FAQ and a cause of confusion for beginners. LLD’s current defaults are much friendlier.

  2. The current default is more self-consistent: It’s consistent with ld64, link.exe, and hence lld-link and (new) lld.ld64.

I think this misses the aspect of each of those linkers being consistent with the counterparts they’re intended to emulate, right?

  1. As Peter says correctly, using this for layering isn’t a great fit since it only checks a linearized topological sort of the build graph nodes. A good check needs to work on a higher abstraction level.

You seem to focus on the use case where compiler and linker commands are generated by some meta build system. I think humans and bash scripts also do a lot of building, and for that case ease-of-use matters much more. Experts who use meta build systems will have a much easier time opting in to this than new folks will have to opting out of it. (And even then, given 2, I think opting in to this is only attractive to non-win non-mac shops, and even then, given 3, I’m not sure it’s all that useful there either.)

Not sure there - projects like LLVM having lld provide bfd-ld-esque behavior gives a better chance the project won’t regress the ability to link with bfd or gold (be a bit unfortunate if LLVM could only be linked with lld, or putting the cost on developers using bfd/gold to cleanup regressions other folks introduce). At least open source projects tend not to (so far as I know - probably fairly limited guess, admittedly) be “can only compile with this compile and only link with this linker” - probably have somewhat broad compatibility goals. Admittedly, for LLVM, maybe we’d want to opt-in to this behavior if using lld on any platform, not just ld.lld.

I’m not massively invested in the default behavior one way or the other here - and can appreciate that users with no interest in linker portability with bfd.ld/gold might find the semantics to be an unnecessary annoyance. I’d have thought most projects - anymore than a scratch program to be thrown away, is probably intended to be compilable/linkable with platform defaults/not restricted to one linker.

  • Dave

I wanted to chime in and say that I think we should keep the current default too, for three reasons:

  1. The current default is more user friendly. Users shouldn’t have to worry about if they pass -lpthread before or after their .o files (…or other libraries. I know I know for pthread it’s -pthread and the driver will figure it out). https://stackoverflow.com/questions/45135/why-does-the-order-in-which-libraries-are-linked-sometimes-cause-errors-in-gcc is a huge FAQ and a cause of confusion for beginners. LLD’s current defaults are much friendlier.

  2. The current default is more self-consistent: It’s consistent with ld64, link.exe, and hence lld-link and (new) lld.ld64.

I think this misses the aspect of each of those linkers being consistent with the counterparts they’re intended to emulate, right?

  1. As Peter says correctly, using this for layering isn’t a great fit since it only checks a linearized topological sort of the build graph nodes. A good check needs to work on a higher abstraction level.

You seem to focus on the use case where compiler and linker commands are generated by some meta build system. I think humans and bash scripts also do a lot of building, and for that case ease-of-use matters much more. Experts who use meta build systems will have a much easier time opting in to this than new folks will have to opting out of it. (And even then, given 2, I think opting in to this is only attractive to non-win non-mac shops, and even then, given 3, I’m not sure it’s all that useful there either.)

Not sure there - projects like LLVM having lld provide bfd-ld-esque behavior gives a better chance the project won’t regress the ability to link with bfd or gold (be a bit unfortunate if LLVM could only be linked with lld, or putting the cost on developers using bfd/gold to cleanup regressions other folks introduce). At least open source projects tend not to (so far as I know - probably fairly limited guess, admittedly) be “can only compile with this compile and only link with this linker” - probably have somewhat broad compatibility goals. Admittedly, for LLVM, maybe we’d want to opt-in to this behavior if using lld on any platform, not just ld.lld.

I’m not massively invested in the default behavior one way or the other here - and can appreciate that users with no interest in linker portability with bfd.ld/gold might find the semantics to be an unnecessary annoyance. I’d have thought most projects - anymore than a scratch program to be thrown away, is probably intended to be compilable/linkable with platform defaults/not restricted to one linker.

I’ve not chimed in yet with any point of view on the overall discussion yet, because our team hasn’t had a chance to look at what the impact on us would be just yet. However, I did want to point out that many (most?) of those who use our downstream version of LLD probably don’t care about any platform other than ours and possibly one or two others, where GNU ld.bfd/gold compatibility isn’t a concern.

It looks like the conversation has died, but I just wanted to post my own investigation based on our internal code base. The code base itself is quite a sprawling mass, involving multiple different build systems, some bits CMake based, some hand-curated and so on, and I don’t fully comprehend it all. I do know that trying to change it is hard at best, and more likely impossible to do so safely. I ran the build using an LLD with the --warn-backrefs option always enabled and saw several dozen distinct instances of the warning being emitted. I can’t say how many instances of this warning are genuine library layering violations that can be easily fixed, but I can say that getting any changes to fix things (even if they are genuine and bad violations) will be non-trivial, and unnecessary turbulence in my opinion.

We could of course always have a private patch to disable the --warn-backrefs option, but this is not really ideal. I also have no reason to believe that our code base is the only one in this situation, so I think it would be better to not enable the option by default.

James

Does your codebase build correctly with gold or bfd linkers?

By which I mean to distinguish:

  • is the warning triggering on something which is empirically not a problem, or
  • is it telling you about a problem which you happen to not care about, because you only support linking with lld.

As mentioned previously, our code base is only intended to work with our downstream LLD port. It won’t work with other linkers because they don’t contain the private patches required to link our code. I haven’t got a way of testing with another linker (BFD or gold) because that simply won’t work, unrelated to the backrefs issue. I suspect, but have no way to easily verify, that the warning is telling me about both varieties in different places.

As mentioned previously, our code base is only intended to work with our downstream LLD port. It won't work with other linkers because they don't contain the private patches required to link our code. I haven't got a way of testing with another linker (BFD or gold) because that simply won't work, unrelated to the backrefs issue. I suspect, but have no way to easily verify, that the warning is telling me about both varieties in different places.

Does your codebase build correctly with gold or bfd linkers?

By which I mean to distinguish:
- is the warning triggering on something which is empirically not a problem, or
- is it telling you about a problem which you happen to not care about, because you only support linking with lld.

I have identified/fixed more than 100 problems in our code base now.
Many were likely ODR violations. After ⚙ D77522 [ELF] --warn-backrefs: don't warn for linking sandwich problems
I'll say in 99% detected cases GNU ld/gold will fail to link
("undefined reference to"), the remaining 1% cases are A.a B A2.so
(variant of ODR violation; see the last paragraph of
http://lld.llvm.org/ELF/warn_backrefs.html ) where LLD has a different
symbol resolution.

I have not seen any false positive.

I did ask binutils about the "memoryless" archive processing
https://sourceware.org/pipermail/binutils/2020-September/113194.html
but I think they will unlikely adopt a change.

It looks like the conversation has died, but I just wanted to post my own investigation based on our internal code base. The code base itself is quite a sprawling mass, involving multiple different build systems, some bits CMake based, some hand-curated and so on, and I don't fully comprehend it all. I do know that trying to change it is hard at best, and more likely impossible to do so safely. I ran the build using an LLD with the --warn-backrefs option always enabled and saw several dozen distinct instances of the warning being emitted. I can't say how many instances of this warning are genuine library layering violations that can be easily fixed, but I can say that getting any changes to fix things (even if they are genuine and bad violations) will be non-trivial, and unnecessary turbulence in my opinion.

Yeah I consider the conversation has died. Many(most?) users
subscribing to the list use one linker and the portability with GNU
ld/gold isn't a concern, unlikely the distribution world :wink: