[LLD] Linking static library does not resolve symbols as gold/ld

Compilers don't know the addresses of functions that are not defined in
the same compilation unit, so they leave the call instruction operands as
zero (because they can compute neither an absolute nor a relative address
for the destination) and let the linker fix up the address by binary
patching.

So what you are seeing is likely a bug in LLD: it fails to fix up the
address for some reason.

Can you dump that function with `objdump -d -r that-file.o`? With the -r
option, objdump prints out relocation records. Relocation records are the
information that linkers use to fix addresses.
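
To make the "binary patching" step concrete, here is a minimal sketch of how
a 32-bit PC-relative relocation could be applied to a call operand. It uses
the psABI formula S + A - P, which is the calculation for R_X86_64_PC32 and
also what R_X86_64_PLT32 reduces to when the call is bound directly to the
function instead of a PLT entry. The function name `apply_pcrel32` and its
parameters are hypothetical and chosen for illustration; this is not LLD's
actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not taken from any real linker).
// Patches the 4 bytes at `offset` inside `section` with S + A - P, where
//   S = sym_addr                 (final address of the target symbol)
//   A = addend                   (from the relocation record, e.g. -4)
//   P = section_addr + offset    (final address of the bytes being patched)
inline void apply_pcrel32(std::vector<std::uint8_t>& section,
                          std::uint64_t section_addr,
                          std::uint64_t offset,
                          std::uint64_t sym_addr,
                          std::int64_t addend) {
    assert(offset + 4 <= section.size());
    const std::uint64_t place = section_addr + offset;  // P
    // Unsigned arithmetic wraps, so negative displacements work as well.
    const std::uint64_t value =
        sym_addr + static_cast<std::uint64_t>(addend) - place;
    const std::uint32_t bits = static_cast<std::uint32_t>(value);
    // x86-64 stores the operand in little-endian byte order.
    for (std::size_t i = 0; i < 4; ++i) {
        section[offset + i] = static_cast<std::uint8_t>(bits >> (8 * i));
    }
}
```

The addend of -4 that typically accompanies a call relocation exists because
the CPU computes the displacement relative to the end of the 4-byte operand,
while P points at its start. If a linker skips this step, the operand stays
zero and the call simply falls through to the next instruction.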

Here is the relevant output:

0000000000013832 <func()>:
   13832: 55 push %rbp
   13833: 48 89 e5 mov %rsp,%rbp
   13836: 53 push %rbx
   13837: 48 83 ec 18 sub $0x18,%rsp
   1383b: 48 89 7d e8 mov %rdi,-0x18(%rbp)
   1383f: 48 8b 45 e8 mov -0x18(%rbp),%rax
   13843: 48 89 c7 mov %rax,%rdi
   13846: e8 00 00 00 00 callq 1384b <func()+0x19>
                        13847: R_X86_64_PLT32 std::vector<record, std::allocator<record> >::vector()-0x4
   ....

Let me know if more is needed.

I recall that this object file is created in a somewhat unusual way,
something like partially linking several other object files together into
this one, but I will have to dig deeper to say for sure.

Best regards
Martin

Rui Ueyama wrote:

This seems a bit odd. You have type `record` and instantiate std::vector
with `record`. Usually the instantiated template function is in the same
compilation unit, and the relocation type is R_X86_64_PC32, not
R_X86_64_PLT32.

Yes, it looks like the object file is created in an unusual way, and that
revealed a subtle difference between ld.gold and ld.lld. I want to know
more about that.

Hi Rui,

fyi I'm still working on a reproducer I can share.

It seems to me R_X86_64_PLT32 is not so unusual in this case, e.g. -fPIC
already produces this relocation:

$ cat example.cpp
#include <vector>
#include <string>

class PropertyReader
{
public:
    struct record
    {
      std::string a;
      std::string b;
    };
    PropertyReader();
private:
    std::vector<record> records;
};

PropertyReader::PropertyReader() : records()
{
}

$ g++ -fPIC -c example.cpp -o example.o
$ objdump -d -r -C example.o
...
0000000000000000 <PropertyReader::PropertyReader()>:
   0: 55 push %rbp
   1: 48 89 e5 mov %rsp,%rbp
   4: 48 83 ec 10 sub $0x10,%rsp
   8: 48 89 7d f8 mov %rdi,-0x8(%rbp)
   c: 48 8b 45 f8 mov -0x8(%rbp),%rax
  10: 48 89 c7 mov %rax,%rdi
  13: e8 00 00 00 00 callq 18 <PropertyReader::PropertyReader()+0x18>
                        14: R_X86_64_PLT32 std::vector<PropertyReader::record, std::allocator<PropertyReader::record> >::vector()-0x4
  18: 90 nop
  19: c9 leaveq
  1a: c3 retq
...

But linking such an object file with lld does not produce the original
error, so something else is going on.

Hi Martin,

It’s hard to tell what is wrong with only this information. If this is an open-source program, can you give me a link to it so that I can try it myself? If it’s proprietary software that you cannot share with me, you might want to produce a small reproducible test case.

Hi Rui,

I finally managed to come up with a reduced example, please find it
attached. You need to have GOLDPATH and LLDPATH set to point to the
respective linkers.

What happens in build.sh is that an object file is first partially linked
("-r") with gold, and the result is then linked with lld against another
object file to produce the final executable. The resulting executable
'repro' then crashes during static initialization.

Either of the following changes makes it work:
1) Using ld instead of gold for the first step
2) Using ld or gold for the second step

Point 2) makes me think that those linkers must be doing something that
lld is not, and that this makes the whole thing work. But note that the
crash happens in a constructor. I found this for the "-r" option in the ld
manpage:

"When linking C++ programs, this option will not resolve references to
constructors; to do that, use -Ur."

However, gold does not know that option (and ld already works without it).

Any idea what is going wrong here?

Thanks and best regards
Martin

lld_repro.tar (10 KB)

Hi Martin,

Thank you for sending the script. I can reproduce the issue with it. It looks like the program crashes when it tries to call std::vector's ctor from a static initializer. I don’t fully understand what is causing the issue yet, but here are my observations.

  • Since you are creating a temporary object file using ld.gold -r, your object file contains multiple weak definitions with the same name, because two or more input files for ld.gold -r contain the same template instantiations. This is not immediately an error, and LLD should pick one of them for each unique name, but this might not be working well.

  • If you create a temporary object file using ld.lld -r, it should work. I don’t know why, though.
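
The "pick one of them for each unique name" rule from the first observation
can be sketched roughly as follows. This is a simplified illustration of
weak-symbol resolution in general, not LLD's implementation; the types
`Binding` and `Definition` and the function `resolve` are hypothetical names
introduced only for this example.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified model of a symbol definition seen by a linker.
enum class Binding { Weak, Strong };

struct Definition {
    std::string name;   // symbol name (mangled name in a real linker)
    Binding binding;    // weak (e.g. template instantiation) or strong
    std::string origin; // label for where the definition came from
};

// Keeps exactly one definition per name:
//  - the first weak definition wins over later weak ones,
//  - a strong definition replaces a previously chosen weak one.
// A real linker would report duplicate strong definitions as an error;
// this sketch silently keeps the first one.
inline std::unordered_map<std::string, Definition>
resolve(const std::vector<Definition>& defs) {
    std::unordered_map<std::string, Definition> table;
    for (const Definition& d : defs) {
        auto result = table.emplace(d.name, d);
        if (result.second) {
            continue;  // first definition seen for this name
        }
        Definition& chosen = result.first->second;
        if (chosen.binding == Binding::Weak && d.binding == Binding::Strong) {
            chosen = d;
        }
    }
    return table;
}
```

Every relocation that refers to a given name then has to be applied against
the one chosen definition. If a reference to a discarded duplicate is left
unpatched, the symptom would look like the zero call operand seen earlier in
this thread, although whether that is what happens here is still open.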

I’ll continue investigating.

Rui Ueyama via llvm-dev <llvm-dev@lists.llvm.org> writes:

I'll continue investigating.

I reduced this to just