[lld] Questions regarding linker scripts

Hi,

I have two questions regarding lld behavior related to linker/version scripts.

1. It seems that lld does not strictly obey linker scripts, specifically regarding section collapsing. For example, a directive like this:

  SECTIONS
  {
    [...]
    .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) }
    [...]

I see no collapsing of these sections in the resulting object, with many .rela.text.<mangled_function_name> sections present. Is this expected behavior?

2. Regarding version scripts specifically, they are currently ignored for relocatable objects:

  https://github.com/llvm/llvm-project/blob/main/lld/ELF/Driver.cpp
  [...]
  // Apply version scripts.
  //
  // For a relocatable output, version scripts don't make sense, and
  // parsing a symbol version string (e.g. dropping "@ver1" from a symbol
  // name "foo@ver1") rather do harm, so we don't call this if -r is given.
  if (!config->relocatable) {
  [...]

I understand the motivation here from the perspective of actual symbol versioning. But an anonymous version script, for example:

{
  global: foo; bar;
  local: *;
}

...would still be useful for relocatable objects, I think. Without this capability, the only way I'm aware of to broadly apply symbol binding is to run objcopy after linking is complete, which can be prohibitively slow. Even so, that is the approach FreeBSD takes:

  https://github.com/freebsd/freebsd-src/blob/main/sys/conf/kmod.mk#L262
  https://github.com/freebsd/freebsd-src/blob/main/sys/conf/kmod_syms.awk

Is it reasonable to suggest allowing version scripts for relocatable objects, or providing an option to do so? (FYI I also tried using a VERSION section inside a larger linker script to no avail.)

Both of these behaviors of lld were surprising to me, and since they both relate to linker script input I decided to bundle my questions together.

Thank you for any insight,

Justin

Hi,

I have two questions regarding lld behavior related to linker/version scripts.

1. It seems that lld does not strictly obey linker scripts, specifically regarding section collapsing. For example, a directive like this:

SECTIONS
{
   [...]
   .rela.text : { *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*) }
   [...]

I see no collapsing of these sections in the resulting object, with many .rela.text.<mangled_function_name> sections present. Is this expected behavior?

Are you using --emit-relocs? It is expected behavior that input section descriptions do not match
the relocation sections copied by --emit-relocs. GNU linkers behave this way, too.

If you are using -r, input section descriptions match .text* but do not work on relocation sections.
I think GNU ld can actually match the first relocation section but cannot combine multiple .rela* sections.
Combining multiple .rela* sections does not make sense because they have different sh_info values (the indexes of the sections they relocate).
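
To illustrate the point about relocation sections, here is a minimal Python sketch (not part of lld or GNU ld) that reads a raw ELF64 little-endian section header table and groups SHT_REL/SHT_RELA sections by the section they relocate. In the ELF specification that target index is stored in sh_info, while sh_link holds the index of the associated symbol table. The function name `relocation_targets` is hypothetical, and the sketch assumes the caller has already extracted the section header table bytes.

```python
import struct
from collections import defaultdict

SHT_RELA = 4
SHT_REL = 9

# Elf64_Shdr, little-endian: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize (64 bytes in total).
SHDR64 = struct.Struct("<IIQQQQIIQQ")


def relocation_targets(shdr_table: bytes) -> dict:
    """Map each relocated section index (sh_info) to the indexes of the
    REL/RELA sections that apply to it."""
    targets = defaultdict(list)
    for index in range(len(shdr_table) // SHDR64.size):
        fields = SHDR64.unpack_from(shdr_table, index * SHDR64.size)
        sh_type, sh_info = fields[1], fields[7]
        if sh_type in (SHT_REL, SHT_RELA):
            targets[sh_info].append(index)
    return dict(targets)
```

In a -r output built with function sections, each .rela.text.<name> section would normally appear under a different key, which is why merging them into a single .rela.text would lose the association with the section each one relocates.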

2. Regarding version scripts specifically, they are currently ignored for relocatable objects:

https://github.com/llvm/llvm-project/blob/main/lld/ELF/Driver.cpp
[...]
// Apply version scripts.
//
// For a relocatable output, version scripts don't make sense, and
// parsing a symbol version string (e.g. dropping "@ver1" from a symbol
// name "foo@ver1") rather do harm, so we don't call this if -r is given.
if (!config->relocatable) {
[...]

I understand the motivation here from the perspective of actual symbol versioning. But an anonymous version script, for example:

{
global: foo; bar;
local: *;
}

...would still be useful for relocatable objects, I think. Without this capability, the only way I'm aware of to broadly apply symbol binding is to run objcopy after linking is complete, which can be prohibitively slow. Even so, that is the approach FreeBSD takes:

I have checked ld.bfd -r --version-script=a.ver a.o.
It doesn't localize symbols either (it likely ignores --version-script).

https://github.com/freebsd/freebsd-src/blob/main/sys/conf/kmod.mk#L262
https://github.com/freebsd/freebsd-src/blob/main/sys/conf/kmod_syms.awk

Is it reasonable to suggest allowing version scripts for relocatable objects, or providing an option to do so? (FYI I also tried using a VERSION section inside a larger linker script to no avail.)

Note that version IDs other than VER_NDX_LOCAL/VER_NDX_GLOBAL do not make sense for ET_REL files. So
the question is whether it makes sense for `local:` to apply to -r links. Given that (1) our
behavior matches GNU ld, (2) the --version-script usage can be easily emulated with {,llvm-}objcopy
-G, and (3) this can add some code complexity to LLD, I am inclined not to make the extension.
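
As a rough sketch of the objcopy -G emulation mentioned above, the following Python helper turns a simple anonymous version script into an objcopy command line. It is only an illustration under stated assumptions: the function names `global_symbols` and `objcopy_command` are hypothetical, the parser handles only exact names listed under `global:` (no wildcards, comments, or `extern` blocks), and it builds the argument list without running anything. The -G option (--keep-global-symbol) keeps the named symbols global and makes all other symbols local.

```python
import re


def global_symbols(version_script: str) -> list:
    """Return the exact symbol names listed under `global:` in a simple
    anonymous version script. Glob patterns are rejected (assumption:
    only exact names are supported by this sketch)."""
    match = re.search(r"global\s*:(.*?)(?:local\s*:|\})", version_script, flags=re.S)
    if match is None:
        return []
    names = [name.strip() for name in match.group(1).split(";")]
    names = [name for name in names if name]
    for name in names:
        if any(ch in name for ch in "*?["):
            raise ValueError(f"glob pattern not supported in this sketch: {name}")
    return names


def objcopy_command(version_script: str, obj_path: str, objcopy: str = "llvm-objcopy") -> list:
    """Build an objcopy argument list that keeps only the listed symbols global."""
    cmd = [objcopy]
    for name in global_symbols(version_script):
        cmd += ["-G", name]
    cmd.append(obj_path)
    return cmd
```

For the example script from the question, this yields the equivalent of `llvm-objcopy -G foo -G bar a.o`, where a.o stands in for the relocatable output. For long symbol lists, the --keep-global-symbols option, which reads names from a file, may be more practical.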

Both of these behaviors of lld were surprising to me, and since they both relate to linker script input I decided to bundle my questions together.

There are awkward points but in general the behaviors look reasonable to me.

Are you using --emit-relocs? It is expected behavior that input section descriptions do not match
the relocation sections copied by --emit-relocs. GNU linkers behave this way, too.

I am not using --emit-relocs. I am using -r, so your explanation about that clarifies things for me.

Given that (1) our
behavior matches GNU ld, (2) the --version-script usage can be easily emulated with {,llvm-}objcopy
-G, and (3) this can add some code complexity to LLD, I am inclined not to make the extension.

Yes, thanks for the explanation, and I agree with your reasoning. It would be a nice convenience for lld to allow the anonymous version script to work in this case, but diverging from generally expected linker behavior probably is not worth it.

I may need to redirect my investigation to why llvm-objcopy is so slow in performing this operation.

There are awkward points but in general the behaviors look reasonable to me.

I agree, thanks to your detailed response. I appreciate the help.

Thanks,

Justin