Conversion of text symbols from global to local

Hello,

I’ve noticed that the symbol map generated as part of the Xen build process is not correct when using LLD. I’ve tracked this down to LLD changing the visibility of almost all the text symbols from global to local in the last linking step, which uses a linker script to generate the Xen binary.

This is different from GNU ld, which keeps those symbols global, and it causes issues with the post-processing that’s done on the resulting binary. The rune used to generate the binary is:

$ ld -melf_x86_64_fbsd -T arch/x86/xen.lds -N prelink.o -o xen-syms

prelink.o has the expected symbol visibility, but xen-syms does not. Is there any way to prevent this behaviour and preserve the symbol visibility from the input binaries?

One way I’ve found to work around this is to create a list of global text symbols from prelink.o and then use objcopy’s --globalize-symbols option to restore the symbol visibility in xen-syms, but that’s kind of clumsy and I would like to avoid it if there’s any other option.
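The workaround described above could be scripted roughly as follows. This is only a sketch in Python, not what the Xen build actually does: it assumes nm’s default three-column output (address, type letter, name), and the function name and sample text are made up for illustration. The resulting names would be written one per line to a file and passed to objcopy --globalize-symbols=<file>.

```python
def global_text_symbols(nm_output):
    """Return names of global text symbols ('T') from default `nm` output."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields: address, type letter, name.
        # Undefined symbols have no address and are skipped here.
        if len(fields) == 3 and fields[1] == "T":
            names.append(fields[2])
    return names


# Hypothetical nm output for prelink.o; addresses and most names are invented.
sample = """\
0000000000000000 T xmem_pool_alloc
0000000000000040 t local_helper
                 U undefined_sym
0000000000000080 D some_data
"""

# One symbol name per line, the format expected by --globalize-symbols=<file>.
symbol_list = "\n".join(global_text_symbols(sample)) + "\n"
print(symbol_list, end="")
```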

Thanks.

Just a few questions to see if I can narrow down when this might be happening.

I’m assuming symbol map is the static symbol table of the output ELF file?

I’m assuming you mean symbol binding rather than visibility? Visibility is something like STV_DEFAULT or STV_HIDDEN whereas binding is STB_GLOBAL and STB_LOCAL.
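To make the distinction concrete, here is a small Python sketch of how the two properties are encoded in an ELF symbol table entry: binding lives in the upper four bits of st_info, visibility in the low two bits of st_other. The helper name decode_symbol is made up for illustration; the constants are the ones from the ELF specification.

```python
STB_NAMES = {0: "STB_LOCAL", 1: "STB_GLOBAL", 2: "STB_WEAK"}
STV_NAMES = {0: "STV_DEFAULT", 1: "STV_INTERNAL", 2: "STV_HIDDEN", 3: "STV_PROTECTED"}


def decode_symbol(st_info, st_other):
    """Return (binding, visibility) names for one ELF symbol table entry."""
    binding = st_info >> 4        # ELF64_ST_BIND
    visibility = st_other & 0x3   # ELF64_ST_VISIBILITY
    return (STB_NAMES.get(binding, str(binding)),
            STV_NAMES.get(visibility, str(visibility)))


# st_info 0x12: binding 1 (global), type 2 (STT_FUNC); st_other 2: hidden.
print(decode_symbol(0x12, 0x02))
```

A symbol can therefore be STB_GLOBAL and STV_HIDDEN at the same time in a relocatable object; the two fields are independent.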

In the general case LLD shouldn’t change the symbol binding of a symbol that comes from an object file. One way I know of to do that is to use a symbol versioning script with a local command such as:
VER {
  local:
    *;
};

I can see that with --version-script=symver.script I can make the symbols matched by the local: command into STB_LOCAL, whereas GNU ld keeps the binding STB_GLOBAL.

Does your xen.lds file contain VERSION { version script contents } with a local: ?

The GNU documentation VERSION (LD) mentions local in a few places. It does mention that symbols are reduced to local scope, but doesn’t explicitly mention binding.

If this is the root cause then I’m not sure if there is an easy workaround assuming that you genuinely need the local: when creating the binary. In general I’d only expect symbol versioning to be used when creating a shared object though. One possibility could be to try and achieve what you need with local: using an alternative means such as using hidden visibility for the globals in the object file.

If it is something else then it may be worth posting a reproducer in a github issue.

Hello,

I would like to reply inline, but I’m afraid I don’t know how to do it with Discourse, sorry.

What I’ve referred to as the symbol map is the list of symbols printed by nm.

Yes, I was referring to binding rather than visibility. It’s indeed STB_GLOBAL vs STB_LOCAL AFAICT, which is what makes nm print the symbol type in upper or lower case.

No, my linker script doesn’t contain any VERSION command.

I’ve uploaded the linker script here:

https://xenbits.xen.org/people/royger/test.lds

And the input object file:

https://xenbits.xen.org/people/royger/test.o

The command I use is:

$ ld --version
LLD 13.0.0 (FreeBSD llvmorg-13.0.0-0-gd7b669b3a303-1400002) (compatible with GNU linkers)
$ ld -melf_x86_64_fbsd -T test.lds -N test.o -o test

I’ve tried to reduce this to a simpler object file and linker script, but then I can’t reproduce it. So I guess there’s some relation between the linker script and the object file that triggers the conversion from global to local. You can pick a random symbol like ‘xmem_pool_alloc’ and see that it’s ‘T’ in the input (test.o) and ‘t’ in the output (test).

The same doesn’t happen when ld is GNU ld.
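The check described above (a symbol that is ‘T’ in the input and ‘t’ in the output) can be done for all symbols at once. The following Python sketch compares two default nm listings; the function names and the sample data are hypothetical, and it assumes symbol names are unique, which is not guaranteed for local symbols.

```python
def nm_types(nm_output):
    """Map symbol name -> nm type letter, for defined symbols only."""
    types = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:  # address, type letter, name
            types[fields[2]] = fields[1]
    return types


def demoted_to_local(nm_input, nm_output):
    """Names whose type letter went from upper case to the same letter in lower case."""
    before = nm_types(nm_input)
    after = nm_types(nm_output)
    return sorted(
        name for name, kind in before.items()
        if kind.isupper() and after.get(name) == kind.lower()
    )


# Invented listings standing in for `nm test.o` and `nm test`.
nm_test_o = """\
0000000000000000 T xmem_pool_alloc
0000000000000040 T start
0000000000000080 t static_helper
"""
nm_test = """\
ffff82d040200000 t xmem_pool_alloc
ffff82d040200040 T start
ffff82d040200080 t static_helper
"""

print(demoted_to_local(nm_test_o, nm_test))
```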

Thanks, Roger.

Thanks very much for the test case. It looks like the symbols are being output with STB_LOCAL binding due to the symbol visibility being STV_HIDDEN.

I have been able to reproduce with the assembler:

.hidden foo
foo:
  .word 0

This comes from computeBinding in Symbols.cpp in the llvm/llvm-project repository on GitHub (main branch).

I’m not entirely sure what the justification for doing that is. It is true that a symbol with hidden visibility is not visible outside the shared library or executable, but changing the binding shouldn’t be necessary.

I can’t think of an easy workaround for you. In theory, if Xen isn’t a shared library you don’t need hidden visibility, so you could recompile with -fvisibility=default or use __attribute__((visibility("default"))) for the symbols that you need to post-process.

It would be worth raising a GitHub issue to see if you can get an option added to stop LLD from changing the binding. I’m guessing it was done for a reason, possibly expediency of implementation.

This behaviour is mandated by the ELF gABI. See the following discussion: https://groups.google.com/g/generic-abi/c/tSU1wY. Globals turned into locals can be identified by their hidden visibility.

The generic ABI says:

A hidden symbol contained in a relocatable object must be either removed or converted to STB_LOCAL binding by the link-editor when the relocatable object is included in an executable file or shared object.

Both gold and ld.lld change the binding of a STV_HIDDEN symbol to STB_LOCAL, conforming to the specification.
I know that GNU ld treats such symbols as local but does not actually emit STB_LOCAL binding in the output.
If Xen is relying on STB_GLOBAL STV_HIDDEN symbols in the output, I think it needs to be fixed.
The fix is probably straightforward.

Given how ld.lld’s code for computing binding is structured, it’s difficult to adjust Symbol::computeBinding() without causing a slowdown.
Given what the ELF specification says, I think ld.lld should stay with the current behavior.

In most cases, you may assume that an STB_LOCAL STV_HIDDEN symbol indicates a non-local symbol in the relocatable object file. (.hidden foo; foo: with an optional .local foo produces a local hidden symbol in the relocatable object file, but people are not encouraged to do this; the .hidden should just be removed.)

You can pick a random symbol like ‘xmem_pool_alloc’ and see that it’s ‘T’ in the input (test.o) and ‘t’ on the output (test).

From the description it seems that Xen does something non-trivial with the symbols in the output. In this case using readelf/llvm-readelf would probably be better than nm/llvm-nm.
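As a rough illustration of the readelf-based approach, the Python sketch below scans the text of a symbol table dump in the column layout printed by readelf -sW (Num, Value, Size, Type, Bind, Vis, Ndx, Name) and lists symbols that are LOCAL and HIDDEN, i.e. the ones that were most likely non-local in the relocatable object. The function name and the sample dump are invented for illustration; addresses and most symbol names are not taken from the real test binary.

```python
def hidden_locals(readelf_output):
    """Return names of symbols with LOCAL binding and HIDDEN visibility."""
    names = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows start with "<index>:" and have at least eight columns;
        # header lines and unnamed symbols are skipped.
        if len(fields) >= 8 and fields[0].endswith(":") and fields[0][:-1].isdigit():
            bind, vis, name = fields[4], fields[5], fields[7]
            if bind == "LOCAL" and vis == "HIDDEN":
                names.append(name)
    return names


# Invented sample in the layout of `readelf -sW`.
sample_dump = """\
Symbol table '.symtab' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: ffff82d040200000    64 FUNC    LOCAL  HIDDEN     1 xmem_pool_alloc
     2: ffff82d040200040    16 FUNC    LOCAL  DEFAULT    1 static_helper
     3: ffff82d040200050    32 FUNC    GLOBAL DEFAULT    1 start
"""

print(hidden_locals(sample_dump))
```

Unlike the nm type letter, this keeps binding and visibility as separate columns, so post-processing can tell a genuinely static symbol (LOCAL DEFAULT) from a hidden one that the linker converted to local.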