I see the GNU ld behavior as a limitation, not a feature, as Peter Smith
also pointed out in https://reviews.llvm.org/D86762. While it can be argued
that there are certain cases where it can help detect layering
violations, as you mentioned in your change, I’m not sure how valuable that
is in practice. Every case I’ve encountered so far, either in Chrome or in
Fuchsia, was a valid use case, most commonly interceptors. The solution
has always been the same: wrap all libraries in --start-group/--end-group.
That is what most projects do by default to avoid dealing with these
issues; see for example [Chromium](https://source.chromium.org/chromium/chromium/src/+/master:build/toolchain/gcc_toolchain.gni;l=409).
In our case, compatibility with linkers on other platforms is more
important than compatibility with GNU ld, so I’d prefer to keep the
current behavior. Projects that care about compatibility with GNU ld can use
--warn-backrefs.
I totally understand that some users may not want to deal with GNU ld
compatibility :) I’d then question Chromium’s addition of -z defs:
https://crrev.com/843583006
-z defs is like a layering checking tool for shared objects, while
--warn-backrefs is for archives. For performance, ABI concerns and ease
of deployment, many projects tend to build their own components as
archives instead of shared objects. In this sense --warn-backrefs will
probably be more useful than -z defs.
(TIL lorder and tsort were created to define an order of archives in
early versions of Unix:
https://www.gnu.org/software/coreutils/manual/html_node/tsort-background.html
It seems that the article missed the point that proper library layering is
still useful.)
I’m not a fan of this idea of reframing GNU ld behavior as a “layering
checking tool”. It is an incomplete layering checking tool because it does
not detect the scenario where, for example, you have the intended
dependency graph:
A → B
A → C
B → D
C → D
(resulting in -la -lb -lc -ld) and you have an unexpected dependency B → C.
Yes, the GNU ld layering checking behavior is incomplete (yet important
and sufficient if we aim for compatibility).
As Petr mentioned, only certain users care about this aspect of GNU ld compatibility, and those users can just turn this feature on.
The build system can pick either of two orders:
-la -lb -lc -ld
-la -lc -lb -ld
Unless you have multiple programs linking against A, you’ve just introduced non-determinism in your build tool, which is generally considered to be a Bad Thing.
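To make the order sensitivity concrete, here is a minimal Python sketch of single-pass, left-to-right archive resolution. It is a simplified model, not GNU ld's actual implementation: the `Member` type, the symbol names, and the idea that C has a second member `c_extra` referenced only by B are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """One archive member: symbols it defines and symbols it references."""
    defs: frozenset
    refs: frozenset


def link(order, archives, roots):
    """Visit archives once, left to right; return symbols left undefined.

    A member is extracted only if it defines a symbol that is undefined
    at the moment its archive is visited (simplified model).
    """
    defined, undefined = set(), set(roots)
    for name in order:
        members = list(archives[name])
        progress = True
        while progress:  # rescan this archive until nothing new is extracted
            progress = False
            for m in members[:]:
                if m.defs & undefined:
                    members.remove(m)
                    defined |= m.defs
                    undefined = (undefined | m.refs) - defined
                    progress = True
    return undefined


def member(defs, refs=()):
    return Member(frozenset(defs), frozenset(refs))


# Hypothetical archives for the graph A -> B, A -> C, B -> D, C -> D,
# plus the unexpected dependency B -> C (via the symbol "c_extra").
ARCHIVES = {
    "a": [member({"a"}, {"b", "c"})],
    "b": [member({"b"}, {"d", "c_extra"})],
    "c": [member({"c"}, {"d"}), member({"c_extra"})],
    "d": [member({"d"})],
}

print(link(["a", "b", "c", "d"], ARCHIVES, {"a"}))  # nothing left undefined
print(link(["a", "c", "b", "d"], ARCHIVES, {"a"}))  # "c_extra" left undefined
```

In this model the first order hides the B → C dependency completely, while the second order reports `c_extra` as undefined even though C is on the command line.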
There is already a way to detect layering problems, one that detects
practical layering problems and not just theoretical ones: link
programs that use subsets of the libraries. For example, linking a program
that depends only on B would result in detecting the invalid B → C
dependency.
This is actually cumbersome and is explicitly described in https://reviews.llvm.org/D86762
Right, and as I mention below even that doesn’t catch all cases. My point is that there is already a way to detect “practical” layering problems, defined as “layering problems that cause an undefined symbol error in programs that are linked at the same time as the library is built”.
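The subset-link idea can be sketched in Python as follows. This is an illustrative model only: the archives, the symbol names (`b_core`, `b_extra`), and the order-independent resolution loop are assumptions chosen so that any leftover undefined symbol reflects a missing dependency rather than archive ordering.

```python
def unresolved(roots, archives):
    """Resolve `roots` against `archives`, ignoring archive order.

    Each archive is a list of (defs, refs) pairs, one per member.
    Returns the set of symbols that no extracted member defines.
    """
    defined, undefined = set(), set(roots)
    pending = [m for members in archives for m in members]
    progress = True
    while progress:  # iterate to a fixed point
        progress = False
        for m in list(pending):
            defs, refs = m
            if defs & undefined:
                pending.remove(m)
                defined |= defs
                undefined = (undefined | refs) - defined
                progress = True
    return undefined


# Hypothetical archives: B declares a dependency on D only, but one of
# its members ("b_extra") also references a symbol from C.
LIB_B = [({"b_core"}, {"d"}), ({"b_extra"}, {"c"})]
LIB_D = [({"d"}, set())]

# A B_test that links only B and its declared dependency D.
print(unresolved({"b_extra"}, [LIB_B, LIB_D]))  # "c" is left undefined
print(unresolved({"b_core"}, [LIB_B, LIB_D]))   # nothing left undefined
```

The second call shows the limitation: a test that only pulls in `b_core` links cleanly, so the undeclared B → C dependency goes unnoticed.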
- Users who don’t care about GNU ld can link their libraries normally.
- Users who care about GNU ld can pass --warn-backrefs.
Users who care more about “theoretical” layering problems, such as users who ship prebuilt archive files to customers, will not be satisfied by --warn-backrefs, as this will not catch every possible layering problem before shipping the library, as they will not have a copy of every customer’s program. Instead, they will be better served by a separate tool.
If B → C is not specified,
- If people write B_test (“linking a program that depends only on B”),
they will notice the dependency issue immediately via the “undefined symbol” diagnostic.
- If such a B_test does not exist, the user may be working on a large
application which depends on B (and transitively on D) but not on A.
They will then get an undefined symbol from C. However, it is not
appropriate for them to add a dependency on C because they don’t use C
directly. See the “If adding the dependency does not form a cycle:”
paragraph in D86762.
If their application actually depends on A (and thus gets the
dependency on C), their link may or may not succeed with GNU ld,
depending on whether the build system picks
-la -lb -lc -ld or -la -lc -lb -ld, and the diagnostic can be very
confusing: “undefined reference to” is reported even though C is actually linked.
It’s also worth noting that even that would only detect the
layering problem if the program depends on the part of B that depends on C.
A better way to go about achieving layering checking would IMHO be to
implement a separate tool (not part of the linker) that is capable of
a complete layering check. Such a tool would only depend on symbol table
features common to all object formats, so it could probably be implemented
generically.
Peter
A standalone tool will not achieve sufficient ergonomics.
It will read all input files and duplicate the symbol resolution work done by the linker,
which can be slow. (ld.lld --warn-backrefs only imposes negligible
overhead when a lazy object is fetched.)
As I mentioned, it would do additional work that the linker is not currently doing and that is necessary to implement a complete layering checking tool, such as reading all archive members out of each archive (whereas linkers only need to read depended-on archive members). Linkers should not be doing that work by default, exactly for performance reasons. Furthermore, the additional tool can be moved out of the critical edit-build-run path, unlike a linker-based tool, which should improve performance as well.
An additional tool would give the flexibility of allowing the interface to specify the actual dependencies instead of just giving us only what we can achieve as a result of historical design decisions. It would free build tools like cmake or gn from needing to topologically sort libraries in order to implement layering checking; instead, they can simply dump their dependency graph.
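A rough Python sketch of what such a standalone checker might look like is given below. It is not an existing tool: the `check_layering` function, the input shapes, and the rule that a library may only reference symbols from itself or its directly declared dependencies are all assumptions. On real inputs the symbol sets would have to come from a symbol-table dump of every archive member.

```python
def check_layering(archives, deps):
    """Report references that the declared dependency graph does not allow.

    archives: {lib: [(defs, refs), ...]} with one pair per archive member.
    deps:     {lib: set of libs it is declared to depend on directly}.
    Returns a list of (lib, symbol, providing_lib) violations.
    """
    # Map every defined symbol to the library that provides it.
    provider = {
        sym: lib
        for lib, members in archives.items()
        for defs, _ in members
        for sym in defs
    }
    violations = []
    for lib, members in archives.items():
        allowed = deps.get(lib, set()) | {lib}
        for _, refs in members:  # every member, not only extracted ones
            for sym in sorted(refs):
                owner = provider.get(sym)
                # Symbols defined by none of the archives are ignored here.
                if owner is not None and owner not in allowed:
                    violations.append((lib, sym, owner))
    return violations


# Hypothetical inputs: the build tool dumps its graph as-is, unsorted.
DEPS = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}
ARCHIVES = {
    "a": [({"a"}, {"b", "c"})],
    "b": [({"b"}, {"d", "c_extra"})],
    "c": [({"c"}, {"d"}), ({"c_extra"}, set())],
    "d": [({"d"}, set())],
}

print(check_layering(ARCHIVES, DEPS))  # [('b', 'c_extra', 'c')]
```

Because every member is inspected against the declared graph, the result does not depend on link order or on which members a particular program happens to pull in.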
IMHO, selling this feature as a layering checker is worse than having no layering checker at all, because it will mislead users into thinking “oh, I’m using lld’s layering checker, that must mean that my program is properly layered”.
Peter