[RFC] better link error messages

Folks,

I’d like propose a new error message format for LLD so that error message for undefined or duplicated symbols are more informative and easy to read.

Below are examples of the current error messages (note that characters in red are actually red on terminal):

Undefined symbols

/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/Writer.cpp:207: undefined symbol ‘lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld::elf::InputSectionBase*)’

Conflicting symbols

/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/Writer.cpp:38: duplicate symbol ‘lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&, long, lld::elf::RelExpr)’

/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/SyntheticSections.cpp:673: previous definition was here

For each error, we want to print out information about 1) symbol name, 2) source file name(s) and source location(s) if available and 3) source object file name(s) and archive file name(s) if available. Currently, these messages lack object file names, and I think the above error messages are a bit hard to grasp because too much information is packed into each line.

I’m thinking of changing the format to something like the following:

Undefined symbols

/ssd/clang/bin/ld.lld: error: undefined symbol: lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld::elf::InputSectionBase*)

Source: /ssd/llvm-project/lld/ELF/Writer.cpp:207

Object: lib/liblldELF.a(Writer.cpp.o)

Conflicting symbols

/ssd/clang/bin/ld.lld: error: duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&, long, lld::elf::RelExpr)

Source 1: /ssd/llvm-project/lld/ELF/Writer.cpp:38

Source 2: /ssd/llvm-project/lld/ELF/SyntheticSections.cpp:673

Object 1: lib/liblldELF.a(Writer.cpp.o)

Object 2 : lib/liblldELF.a(SyntheticSections.cpp.o)

The new error messages contain complete information that the linker is able to gather, and it uses more vertical space to improve readability.

Thoughts?

Folks,

I’d like propose a new error message format for LLD so that error message for undefined or duplicated symbols are more informative and easy to read.

Below are examples of the current error messages (note that characters in red are actually red on terminal):

Undefined symbols

/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/Writer.cpp:207: undefined symbol ‘lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld::elf::InputSectionBase*)’

Conflicting symbols

/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/Writer.cpp:38: duplicate symbol ‘lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&, long, lld::elf::RelExpr)’

/ssd/clang/bin/ld.lld: error: /ssd/llvm-project/lld/ELF/SyntheticSections.cpp:673: previous definition was here

For each error, we want to print out information about 1) symbol name, 2) source file name(s) and source location(s) if available and 3) source object file name(s) and archive file name(s) if available. Currently, these messages lack object file names, and I think the above error messages are a bit hard to grasp because too much information is packed into each line.

I’m thinking of changing the format to something like the following:

Undefined symbols

/ssd/clang/bin/ld.lld: error: undefined symbol: lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld::elf::InputSectionBase*)

Source: /ssd/llvm-project/lld/ELF/Writer.cpp:207

Object: lib/liblldELF.a(Writer.cpp.o)

As a point of comparison, here is what ld64 displays:

Undefined symbols for architecture x86_64:
“lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld::elf::InputSectionBase*)”, referenced from:
foo() in lib/liblldELF.a(Writer.cpp.o)

Being able to display source location when available is a really nice idea!!

Conflicting symbols

/ssd/clang/bin/ld.lld: error: duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&, long, lld::elf::RelExpr)

Source 1: /ssd/llvm-project/lld/ELF/Writer.cpp:38

Source 2: /ssd/llvm-project/lld/ELF/SyntheticSections.cpp:673

Object 1: lib/liblldELF.a(Writer.cpp.o)

Object 2 : lib/liblldELF.a(SyntheticSections.cpp.o)

The new error messages contain complete information that the linker is able to gather, and it uses more vertical space to improve readability.

I feel it is using too much vertical space, and the matching between source and object is lost. What about something like:

/ssd/clang/bin/ld.lld: error: duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&, long, lld::elf::RelExpr)

  • Defined in lib/liblldELF.a(Writer.cpp.o) (from /ssd/llvm-project/lld/ELF/Writer.cpp:38)
  • Defined in lib/liblldELF.a(SyntheticSections.cpp.o) (from /ssd/llvm-project/lld/ELF/SyntheticSections.cpp:673)

What would you print out for undefined symbols? `- Undefined in foo.o (from
...)` looks odd.

Haven’t thought about it, what about something like:

/ssd/clang/bin/ld.lld: error: undefined symbol: lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true> >::addSection(lld::elf::InputSectionBase*)
referenced from foo() at /ssd/llvm-project/lld/ELF/Writer.cpp:207 in lib/liblldELF.a(Writer.cpp.o)

—Mehdi

There's a chance users can be confused about "source" here: it's not
the source code for the symbol, it's the source of the reference.
Maybe "Referenced by" would be better a wording.

I lile the idea of having it more structured and I think your suggested format is the right direction.

I think one principle should be that we assume that file names and symbol names are “really long” (possibly wrapped by the terminal etc.).

So I would modify your suggested format to use the “note” color for the strings like “Object 1” so that even if those lines are wrapped by the terminal then they can still be easily visually parsed. We may also want to do something like “>>> Object 1:” so that places without color (like google’s internal web ui for build logs, or if users are piping the build system’s output into less or tee) are still a bit easier to parse in the presence of wrapped lines.

This principle also suggests (and your suggested format does this) that to avoid having to parse past a symbol/file name to get to other information, there should be at most one symbol/file name on a given line and there should never be anything after it on the line. The ld64 format violates this, for example.

– Sean Silva

+1

I figured you might consider moving the basenames of the filename earlier in the diagnostic, something like:

bin/ld.lld: error: duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&, long, lld::elf::RelExpr)
>>> defined at Writer.cpp:38 in /home/buildslave/buildslave/clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/
>>> Writer.cpp.o in archive lib/liblldELF.a
>>> defined at SyntheticSections.cpp:673 in /home/buildslave/buildslave/clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/
>>> SyntheticSections.cpp.o in archive lib/liblldELF.a

Oftentimes filenames are unique enough that it tells you where to go immediately. Maybe it’s a bad idea for the source location info, though. You want that to be copy-pastable or clickable in an IDE.

I figured you might consider moving the basenames of the filename earlier
in the diagnostic, something like:

bin/ld.lld: *error:* duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&,
long, lld::elf::RelExpr)
*>>> defined at* Writer.cpp:38 in /home/buildslave/buildslave/clang-cmake-
aarch64-39vma/llvm/tools/lld/ELF/
*>>>* Writer.cpp.o in archive lib/liblldELF.a
*>>> defined at* SyntheticSections.cpp:673 in /home/buildslave/buildslave/
clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/
*>>>* SyntheticSections.cpp.o in archive lib/liblldELF.a

Oftentimes filenames are unique enough that it tells you where to go
immediately. Maybe it's a bad idea for the source location info, though.
You want that to be copy-pastable or clickable in an IDE.

That's an interesting idea, and I think I like it.

I really like this one because it also allows to find wrong
debuginfo as an additional bonus.

I find this really nicely readable!

I really like this.

I’m on mobile right now which is a worst case, but this is pretty readable even there. I’ve attached a screenshot. One nit is that we want it to be clear that Writer.cpp.o is in an archive before reading “Writer.cpp.o”. Maybe something like “archive member Writer.cpp.o in …”.

I also think we might want some bold text for the input file lines to clarify better what they are. Maybe “>>> from input file: archive member …” instead of just a blank “>>> archive member …”

– Sean Silva

Please don’t do this (split the base name from the path). Tools that match file names, grep, etc. won’t be able to locate the files easily. I’ve worked with projects that, especially when all library dependencies are included, have lots of files with generic names (e.g. Writer.cpp). This is convenient for relatively self-contained projects where you can recognize the file names. We work on one such project. This is not the general case. If you’d like to make the base name easier to visually differentiate, I recommend just repeating it: I’d be happy with that. Thanks again, Hal

:

I figured you might consider moving the basenames of the filename earlier
in the diagnostic, something like:

bin/ld.lld: *error:* duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&,
long, lld::elf::RelExpr)
*>>> defined at* Writer.cpp:38 in /home/buildslave/buildslave/
clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/
*>>>* Writer.cpp.o in archive lib/liblldELF.a
*>>> defined at* SyntheticSections.cpp:673 in /home/buildslave/buildslave/
clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/
*>>>* SyntheticSections.cpp.o in archive lib/liblldELF.a

Please don't do this (split the base name from the path). Tools that match
file names, grep, etc. won't be able to locate the files easily. I've
worked with projects that, especially when all library dependencies are
included, have lots of files with generic names (e.g. Writer.cpp). This is
convenient for relatively self-contained projects where you can recognize
the file names. We work on one such project. This is not the general case.

If you'd like to make the base name easier to visually differentiate, I
recommend just repeating it:

bin/ld.lld: *error:* duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&,
long, lld::elf::RelExpr)
*>>> defined at* Writer.cpp:38 (/home/buildslave/buildslave/c
lang-cmake-aarch64-39vma/llvm/tools/lld/ELF/Writer.cpp:38)
*>>>* Writer.cpp.o in archive lib/liblldELF.a
*>>> defined at* SyntheticSections.cpp:673 (/home/buildslave/buildslave/c
lang-cmake-aarch64-39vma/llvm/tools/lld/ELF/SyntheticSections.cpp:673)
*>>>* SyntheticSections.cpp.o in archive lib/liblldELF.a

I'd be happy with that.

Thanks again,
Hal

I agree with Hal, repeating the source file name will make it just a

little bit longer, but it will help much with usability (copy pasting etc)

I hadn’t noticed that detail, but I totally agree that the full path should be present in the diagnostic.

I figured you might consider moving the basenames of the filename earlier
in the diagnostic, something like:

bin/ld.lld: *error:* duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&,
long, lld::elf::RelExpr)
*>>> defined at* Writer.cpp:38 in /home/buildslave/buildslave/
clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/
*>>>* Writer.cpp.o in archive lib/liblldELF.a
*>>> defined at* SyntheticSections.cpp:673 in /home/buildslave/buildslave/
clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/
*>>>* SyntheticSections.cpp.o in archive lib/liblldELF.a

Please don't do this (split the base name from the path). Tools that match
file names, grep, etc. won't be able to locate the files easily. I've
worked with projects that, especially when all library dependencies are
included, have lots of files with generic names (e.g. Writer.cpp). This is
convenient for relatively self-contained projects where you can recognize
the file names. We work on one such project. This is not the general case.

If you'd like to make the base name easier to visually differentiate, I
recommend just repeating it:

bin/ld.lld: *error:* duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&,
long, lld::elf::RelExpr)
*>>> defined at* Writer.cpp:38 (/home/buildslave/buildslave/c
lang-cmake-aarch64-39vma/llvm/tools/lld/ELF/Writer.cpp:38)
*>>>* Writer.cpp.o in archive lib/liblldELF.a
*>>> defined at* SyntheticSections.cpp:673 (/home/buildslave/buildslave/c
lang-cmake-aarch64-39vma/llvm/tools/lld/ELF/SyntheticSections.cpp:673)
*>>>* SyntheticSections.cpp.o in archive lib/liblldELF.a

I'd be happy with that.

Thanks again,
Hal

+1

- Michael Spencer

Put it all together, the following error messages should work for everybody. I’ll create a patch to make this change and send it for review. Thank you guys for the inputs!

Undefined symbol error:

bin/ld.lld: error: undefined symbol: lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0, true>
>>> defined at Writer.cpp:207 (/ssd/llvm-project/lld/ELF/Writer.cpp:207)
>>> Writer.cpp.o in archive lib/liblldELF.a

Duplicate symbol error:

bin/ld.lld: error: duplicate symbol: lld::elf::MipsGotSection::addEntry(lld::elf::SymbolBody&, long, lld::elf::RelExpr)
>>> defined at Writer.cpp:38 (/home/buildslave/buildslave/clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/Writer.cpp:38)
>>> Writer.cpp.o in archive lib/liblldELF.a
>>> defined at SyntheticSections.cpp:673 (/home/buildslave/buildslave/clang-cmake-aarch64-39vma/llvm/tools/lld/ELF/SyntheticSections.cpp:673)
>>> SyntheticSections.cpp.o in archive lib/liblldELF.a

Put it all together, the following error messages should work for
everybody. I'll create a patch to make this change and send it for
review. Thank you guys for the inputs!

Undefined symbol error:

bin/ld.lld: error: undefined symbol:
lld::elf::EhFrameSection<llvm::object::ELFType<(llvm::support::endianness)0,
>
*>>> defined at* Writer.cpp:207 (/ssd/llvm-project/lld/ELF/Writer.cpp:207)
*>>>* Writer.cpp.o in archive lib/liblldELF.a

The wording of this one is mildly self contradictory. Undefined symbols aren't defined, they're... used? referenced?

Jon

Something to maybe keep in mind is that we might “want” to emit the errors/notes in Visual Studio format (sorry :-P)

If a line starts with:
Directory/File.cpp(42): error, blabla!

It will be picked up by the error list window and also be double-clickable in the output window.

I’m not asking for a feature request, but it would be nice to have this in mind when designing the API for error reporting.

Thank you,

Filipe