.debug_names tombstoning for local type units

@ayermolo @cmtice @MaskRay

From some previous discussions ( [DebugInfo][DWARFv5][LLD] .debug_names with fdebug-type-sections - #8 by ayermolo ) there’s an issue with tombstoning type units in .debug_names.

Sounded like @ayermolo wasn’t able to find code in lld that does this, and that it’s not the same code that deals with the existing DWARF tombstoning issues involving relocations in .debug_* sections that refer to .text sections (I guess non-ALLOC to ALLOC is the relevant property, and the type unit case is non-ALLOC to non-ALLOC, so goes through some other codepath?).

So, basically, @MaskRay - could you give us any pointers on where this situation would be handled & any other advice on how you’d prefer this be handled?

1 Like

I got a debugger around lld tombstoning a .text reference from .debug_info and the .debug_info from .debug_names using @ayermolo’s .debug_names+type units patch.

Things seem to diverge around the tombstone/symbolicRel/R_DTPREL test (923 in InputSection.cpp - breaking earlier, around 895 so as not to break on each relocation individually)

.text from .debug_info

(run the debugger, maybe add a command to the breakpoint at 895 to print name so it’s easy to continue until you reach the breakpoints for .debug_addr, specifically the second instance (for inl2.cpp - the one that’ll have a tombstone, specifically the second of 2 addresses in the .debug_addr will be tombstoned))
Step down through the for (const RelTy &rel : rels) loop, pass through it once to get past the f2 relocation that won’t be tombstoned, then on the second pass through this loop…
we reach the tombstone condition at 923 with these values:

(gdb) p tombstone
... false/disengaged
(gdb) p isDebug
$36 = true
(gdb) p type
$37 = 1
(gdb) p target.symbolicRel
$38 = 1
(gdb) p expr
$39 = lld::elf::R_ABS

The 1 values for type and target.symbolicRel, I believe, are R_X86_64_64, consistent with llvm-readelf -r output for the object file.

Then we reach the condition at 954, sym.getOutputSection() is false, so we go into the condition and emit a tombstone value per whatever the tombstone logic is.

.debug_info from .debug_names

Break at the same place, print the section name, step through until the second or third .debug_names section (depending on whether you built null.cpp with -gpubnames - not necessary for the repro, but relevant to how to examine the behavior):

b.cpp’s .debug_names will have 5 relocations and the one we are interested in is the second. So step through the first loop and look at the second.

This time the type == target.symbolicRel condition fails:

(gdb) p type
$81 = 10
(gdb) p target.symbolicRel
$82 = 1

(type of 10 is R_X86_64_32)

This is probably the main takeaway:

So for some reason because this is 64 bit code but we only need a 32 bit offset here - the tombstoning logic is skipped.

If we could fix that (not sure if this condition is needed, or could be broadened) - then the follow-on fix would be to add a hardcoded case like the one for isDebugLocOrRanges (which uses 1 in the absence of a tombstone mapping specified) that hardcodes .debug_ranges to tombstone to 0xffffffff. (maybe this would be encoded as a default value in the dedaRelocInNonAlloc flag)

Repro details

.text from .debug_info

For the inline example:
inl.cpp

// this introduces an ODR violation, but is the simplest way
// to make the inline function definitions differ, and require
// a tombstone. (this can come up in non-violating cases too)
inline int f1() { return 1; }
int main() {
  f1();
}

inl2.cpp

inline void f1() { }
void f2() {
  f1();
}

Two files, similar to the type unit case, to demonstrate the inline function being deduplicated. In this case zero is a valid tombstone (well, valid-ish - we’re not dealing with a target that has zero as a valid address - but we’ve discussed/standardized moving towards int-max as a tombstone in DWARFv6).
The tombstoning can be observed here:

$ clang++-tot inl.cpp inl2.cpp -g -fuse-ld=lld && llvm-dwarfdump-tot a.out -debug-addr
a.out:  file format elf64-x86-64

.debug_addr contents:
Address table header: length = 0x00000014, format = DWARF32, version = 0x0005, addr_size = 0x08, seg_size = 0x00
Addrs: [
0x0000000000001730
0x0000000000001740
]
Address table header: length = 0x00000014, format = DWARF32, version = 0x0005, addr_size = 0x08, seg_size = 0x00
Addrs: [
0x0000000000001750
0x0000000000000000
]

.debug_info from .debug_names

null.cpp

void null() { }

a.cpp

struct t1 { };
void f1(t1) { }

b.cpp

struct t1 { };
int main() {
  t1 v1;
}

null.cpp is just used to force the type units to not be at the start of the section, so that a valid TU offset can be distinguished from the currently buggy zero-valued TU offset used when a TU is discarded.
a.cpp and b.cpp are used to show a type unit deduplication, where one needs to be tombstoned but is currently set to zero.
To see the buggy .debug_names output, check this:

$ clang++-tot null.cpp a.cpp b.cpp -g -fdebug-types-section -fuse-ld=lld -gpubnames && llvm-dwarfdump-tot a.out -debug-names | grep LocalTU
    LocalTU[0]: 0x00000030
    LocalTU[0]: 0x00000000

The second LocalTU should be some kind of explicitly invalid value (the zero value is ambiguous/could be a valid TU offset for a TU that is placed at the start of the .debug_info section), such as 0xffffffff.

And to reproduce the debugging sessions, run the compilations with -c first, then run clang with -### to get the linker command line, add --threads=1 (easier to debug single threaded mode) and run that under a debugger, for me I ran (but probably want to do this yourself/not copy these command lines as they have local machine-specific paths that won’t be portable to another machine):
Type units:

$ cgdb -args ~/dev/llvm/build/default/bin/ld.lld "-pie" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "a.out" "/lib/x86_64-linux-gnu/Scrt1.o" "/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/13/crtbeginS.o" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/bin/../lib/x86_64-unknown-linux-gnu" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/lib/clang/18/lib/x86_64-unknown-linux-gnu" "-L/usr/lib/gcc/x86_64-linux-gnu/13" "-L/usr/lib/gcc/x86_64-linux-gnu/13/../../../../lib64" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/../lib64" "-L/lib" "-L/usr/lib" "null.o" "a.o" "b.o" "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/13/crtendS.o" "/lib/x86_64-linux-gnu/crtn.o" --threads=1

Inline functions:

$ cgdb -args ~/dev/llvm/build/default/bin/ld.lld "-pie" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "inl" "/lib/x86_64-linux-gnu/Scrt1.o" "/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/13/crtbeginS.o" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/bin/../lib/x86_64-unknown-linux-gnu" "-L/usr/local/google/home/blaikie/dev/llvm/build/default/lib/clang/18/lib/x86_64-unknown-linux-gnu" "-L/usr/lib/gcc/x86_64-linux-gnu/13" "-L/usr/lib/gcc/x86_64-linux-gnu/13/../../../../lib64" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/../lib64" "-L/lib" "-L/usr/lib" "inl2.o" inl.o "-lstdc++" "-lm" "-lgcc_s" "-lgcc" "-lc" "-lgcc_s" "-lgcc" "/usr/lib/gcc/x86_64-linux-gnu/13/crtendS.o" "/lib/x86_64-linux-gnu/crtn.o" --threads=1

David thanks for the post.

Yeah originally I was looking into void InputSection::relocateNonAlloc also.
So to add a bit more context is when section is COMDAT LLD creates Undefined for symbols in that comdat. I thought we had to do something special somewhere else, but the good news is that in that function it iterates over them also.

Doing simple check seems to do the trick. There is implication that only undefined symbols in .debug_names are for type units. Do we need anything more precise?

const bool isDebugNames = isDebug && name == ".debug_names";
In loop

auto *uds = dyn_cast<Undefined>(&sym);
if (isDebugNames && uds) {
        // 5 is for testing purposes only
        target.relocateNoSym(bufLoc, type, SignExtend64<bits>(5));
        continue;
}

I don’t think we need anything more precise - I’d be happy with any tombstoning in .debug_names to be -1/maxint.

But I don’t think we should be adding a special case for \.debug_names at the top level of this function - ideally we should be going down the tombstoning path and we should have a default value for deadRelocInNonalloc potentially? (or maybe a builtin down in/near the special case for .debug_loc and .debug_ranges - perhaps we can generalize those, not sure - if we can, it’d be good not to allow people to override this behavior, and start with it as -1 and keep it that way for .debug_names at least, even though we have legacy issues with other sections)

Sorry I probably wasn’t clear. It’s not at the top level of the function.
I put up [LLD] Tombstone LocalTU entry in .debug_names by ayermolo · Pull Request #70701 · llvm/llvm-project · GitHub

Kind of what I had in mind. It’s very similar to @cmtice patch.