LTO and Symbol duplication


When compiling with GCC (-flto), it introduces new symbols into the binary to prevent collisions [1]. I was looking at the LLVM source code to find out whether LLVM has a similar mechanism. I noticed duplicate symbols like this:

# compiled with llvm-12 and (-flto)
$ readelf --wide --symbols executable | grep "FUNC" | grep -i "close_file"
  75: 000000000048a800    37 FUNC    LOCAL  DEFAULT   13 close_file
  76: 000000000048ad10    34 FUNC    LOCAL  DEFAULT   13 close_file.614
  77: 000000000048b320    34 FUNC    LOCAL  DEFAULT   13 close_file.622
  78: 000000000048b420    22 FUNC    LOCAL  DEFAULT   13 close_file.634
  79: 000000000048e120   713 FUNC    LOCAL  DEFAULT   13 close_file.641
  80: 0000000000490690   485 FUNC    LOCAL  DEFAULT   13 close_file.663

But I am not sure whether this is the same thing or something different. 1) Why does LLVM need to duplicate/rename symbols? 2) Where can I read more about this in the documentation or source code?

  1. gcc/ at 344e6f9f2abcff9b2bb4b26b693be4a599272f43 · gcc-mirror/gcc · GitHub

When LLVM merges two bitcode modules together, they can have internal symbols with conflicting names. The symbols are renamed so that they can all be emitted into the same ELF object file, as you guessed.

See llvm-project/IRMover.cpp at c79ab1065e89872668b8d43c747ff3e5974b0d96 · llvm/llvm-project · GitHub in the source code. This isn’t documented anywhere because the exact renaming scheme shouldn’t matter for users.
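
To make the idea concrete, here is a small toy model in Python of what "rename on collision" means when several modules with same-named local symbols are merged into one namespace. This is only a sketch, not LLVM's actual code: the function name `merge_local_symbols`, the module and symbol lists, and the single shared counter used for the `.N` suffix are all assumptions made for illustration. The shared counter is just one plausible reason why the suffixes in the readelf output above are not consecutive.

```python
from itertools import count


def merge_local_symbols(modules):
    """Merge per-module local symbol names into one flat namespace.

    `modules` is a list of (module_name, [symbol_name, ...]) pairs.
    The first symbol to claim a name keeps it; any later symbol with
    the same name gets a ".N" suffix. N comes from one counter shared
    by all names (an assumption of this sketch, not a documented rule).
    """
    taken = set()
    counter = count(1)
    merged = []
    for module_name, symbols in modules:
        for sym in symbols:
            unique = sym
            # Keep trying new suffixes until the name is free.
            while unique in taken:
                unique = f"{sym}.{next(counter)}"
            taken.add(unique)
            merged.append((module_name, sym, unique))
    return merged


if __name__ == "__main__":
    # Hypothetical translation units, each defining its own static helpers.
    example = [
        ("a.c", ["close_file", "open_file"]),
        ("b.c", ["close_file"]),
        ("c.c", ["close_file", "open_file"]),
    ]
    for module_name, original, unique in merge_local_symbols(example):
        print(f"{module_name}: {original} -> {unique}")
```

With this input, the three `close_file` definitions end up as `close_file`, `close_file.1` and `close_file.2`, and the second `open_file` becomes `open_file.3`. Every definition is kept as a separate local symbol; only the names change so that they no longer clash.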
