I got a debugger around lld tombstoning a .text
reference from .debug_info
and the .debug_info
from .debug_names
using @ayermolo’s .debug_names
+type units patch.
Things seem to diverge around the tombstone
/symbolicRel
/R_DTPREL
test (923 in InputSection.cpp
- breaking earlier, around 895 so as not to break on each relocation individually)
.text
from .debug_info
(run the debugger, maybe add a command
to the breakpoint at 895 to print name
so it’s easy to continue until you reach the breakpoints for .debug_addr
, specifically the second instance (for inl2.cpp
- the one that’ll have a tombstone, specifically the second of 2 addresses in the .debug_addr
will be tombstoned))
Step down through the for (const RelTy &rel : rels)
loop, pass through it once to get past the f2
relocation that won’t be tombstoned, then on the second pass through this loop…
we reach the tombstone
condition at 923 with these values:
(gdb) p tombstone
... false/disengaged
(gdb) p isDebug
$36 = true
(gdb) p type
$37 = 1
(gdb) p target.symbolicRel
$38 = 1
(gdb) p expr
$39 = lld::elf::R_ABS
The 1
values for type
and target.symbolicRel
, I believe, are R_X86_64_64
, consistent with llvm-readelf -r
output for the object file.
Then we reach the condition at 954, sym.getOutputSection()
is false, so we go into the condition and emit a tombstone value per whatever the tombstone logic is.
.debug_info
from .debug_names
Break at the same place, print the section name, step through until the second or third .debug_names
section (depending on whether you built null.cpp
with -gpubnames
- not necessary for the repro, but relevant to how to examine the behavior):
b.cpp
’s .debug_names
will have 5 relocations and the one we are interested in is the second. So step through the first loop and look at the second.
This time the type == target.symbolicRel
condition fails:
(gdb) p type
$81 = 10
(gdb) p target.symbolicRel
$82 = 1
(type
of 10
is R_X86_64_32
)
This is probably the main takeaway:
So for some reason because this is 64 bit code but we only need a 32 bit offset here - the tombstoning logic is skipped.
If we could fix that (not sure if this condition is needed, or could be broadened) - then the follow-on fix would be to add a hardcoded case like the one for isDebugLocOrRanges
(which uses 1
in the absence of a tombstone mapping specified) that hardcodes .debug_ranges
to tombstone to 0xffffffff
. (maybe this would be encoded as a default value in the dedaRelocInNonAlloc
flag)
Repro details
.text
from .debug_info
For the inline example:
inl.cpp
// this introduces an ODR violation, but is the simplest way
// to make the inline function definitions differ, and require
// a tombstone. (this can come up in non-violating cases too)
inline int f1() { return 1; }
int main() {
f1();
}
inl2.cpp
inline void f1() { }
void f2() {
f1();
}
Two files, similar to the type unit case, to demonstrate the inline function being deduplicated. In this case zero is a valid tombstone (well, valid-ish - we’re not dealing with a target that has zero as a valid address - but we’ve discussed/standardized moving towards int-max as a tombstone in DWARFv6).
The tombstoning can be observed here:
$ clang++-tot inl.cpp inl2.cpp -g -fuse-ld=lld && llvm-dwarfdump-tot a.out -debug-addr
a.out: file format elf64-x86-64
.debug_addr contents:
Address table header: length = 0x00000014, format = DWARF32, version = 0x0005, addr_size = 0x08, seg_size = 0x00
Addrs: [
0x0000000000001730
0x0000000000001740
]
Address table header: length = 0x00000014, format = DWARF32, version = 0x0005, addr_size = 0x08, seg_size = 0x00
Addrs: [
0x0000000000001750
0x0000000000000000
]
.debug_info
from .debug_names
null.cpp
void null() { }
a.cpp
struct t1 { };
void f1(t1) { }
b.cpp
struct t1 { };
int main() {
t1 v1;
}
null.cpp
is just used to force the type units to not be at the start of the section, so that a valid TU offset can be distinguished from the currently buggy zero-valued TU offset used when a TU is discarded.
a.cpp
and b.cpp
are used to show a type unit deduplication, where one needs to be tombstoned but is currently set to zero.
To see the buggy .debug_names
output, check this:
$ clang++-tot null.cpp a.cpp b.cpp -g -fdebug-types-section -fuse-ld=lld -gpubnames && llvm-dwarfdump-tot a.out -debug-names | grep LocalTU
LocalTU[0]: 0x00000030
LocalTU[0]: 0x00000000
The second LocalTU
should be some kind of explicitly invalid value (the zero value is ambiguous/could be a valid TU offset for a TU that is placed at the start of the .debug_info
section), such as 0xffffffff
.
And to reproduce the debugging sessions, run the compilations with -c first, then run clang with -### to get the linker command line, add --threads=1
(easier to debug single threaded mode) and run that under a debugger, for me I ran (but probably want to do this yourself/not copy these command lines as they have local machine-specific paths that won’t be portable to another machine):
Type units:
$ cgdb -args ~/dev/llvm/build/default/bin/ld.lld "-pie" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "a.out" "/lib/x86_64-linux-gnu/Scrt1.o" "/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/13/crtbeginS.o" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/bin/../lib/x86_64-unknown-linux-gnu" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/lib/clang/18/lib/x86_64-unknown-linux-gnu" "-L/usr/lib/gcc/x86_64-linux-gnu/13" "-L/usr/lib/gcc/x86_64-linux-gnu/13/../../../../lib64" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/../lib64" "-L/lib" "-L/usr/lib" "null.o" "a.o" "b.o" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/13/crtendS.o" "/lib/x86_64-linux-gnu/crtn.o" --threads=1
Inline functions:
$ cgdb -args ~/dev/llvm/build/default/bin/ld.lld "-pie" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "inl" "/lib/x86_64-linux-gnu/Scrt1.o" "/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/13/crtbeginS.o" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/bin/../lib/x86_64-unknown-linux-gnu" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/lib/clang/18/lib/x86_64-unknown-linux-gnu" "-L/usr/lib/gcc/x86_64-linux-gnu/13" "-L/usr/lib/gcc/x86_64-linux-gnu/13/../../../../lib64" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/../lib64" "-L/lib" "-L/usr/lib" "inl2.o" inl.o "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/13/crtendS.o" "/lib/x86_64-linux-gnu/crtn.o" --threads=1