[DWARF] Different TU with the same hash

Hello
While working on the new BOLT DWARF rewriter, I ran into a case where two Type Units were generated with the same hash but different lengths (small repro below). Looking closely, the only difference was in a DW_TAG_class_type (example below).

After doing some archeological digging through git blame I found this:

My understanding is that using the identifier instead of doing full hashing should be good enough, and that if it is not, the fix should be elsewhere.

@dblaikie My question: is this an example of a deficiency in the current approach? After the linker de-duplicates, we will end up with one or the other, but realistically speaking, does it matter?

main type unit

0x00000000: Type Unit: length = 0x0000003c, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08, name = 'Foo', type_signature = 0x675d23e4f33235f2, type_offset = 0x001e (next unit at 0x00000040)
..
0x0000003a:   DW_TAG_class_type
                DW_AT_name  ("Foo2")
                DW_AT_declaration (true)

helper type unit

0x00000000: Type Unit: length = 0x00000040, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08, name = 'Foo', type_signature = 0x675d23e4f33235f2, type_offset = 0x001e (next unit at 0x00000044)
..
0x0000003a:   DW_TAG_class_type
                DW_AT_declaration (true)
                DW_AT_signature (0x49dc260088be7e56)

repro: -gdwarf-4 -g2 -fdebug-types-section -c -o
header

class Foo2;
class Foo {
public:
 Foo2 *f;
};
class Foo2 {
public:
 char *c1;
};

main

#include "header.h"
extern int helper(Foo&);
int main() {
 Foo f;
 return helper(f);
}

helper

#include "header.h"
int helper(Foo &ff) {
 Foo f;
 Foo2 f2;
 return 0;
}

I was able to reproduce this.
In helper.o, there are two type units, for Foo and Foo2; the type unit for Foo records the signature for the Foo2 unit, presumably because it’s available.
In main.o, there’s only one type unit, for Foo; there’s no type unit for Foo2 (because of constructor homing, maybe?) so no signature on the forward declaration of Foo2.

It does seem a little odd, at first glance, but I think it’s functionally not a problem. The type unit from main.o has only a forward declaration of Foo2, while the type unit from helper.o has a real reference to it. But this situation comes up all the time, and if the main.o unit is selected, the debugger should still be able to find the description of Foo2 by name.

@dblaikie has done more with this than I have, but I’m pretty sure the answer is, this is an okay difference, with minimal functional consequences.

Yep, looks like you got it covered there @pogo59 - different but equivalent.

I’m pretty sure this different-but-equivalent result would occur even if we did do the spec-mandated type unit hashing (rather than relying on the linkage name of the type instead): the spec hash covers referenced types by (qualified/scoped) name, so it should produce the same hash regardless of whether you reference a declaration or a definition.
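To make that concrete, here is a toy sketch of the two signature strategies being compared. It is not the DWARF flattening algorithm and not LLVM's code: FNV-1a stands in for MD5, and `TypeDesc`, `MemberRef`, `signature_from_identifier` and `signature_from_contents` are hypothetical names invented for illustration. The only point is that neither strategy looks at whether a referenced type is a declaration or a definition.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a member that points at another named type.
struct MemberRef {
    std::string member_name;        // e.g. "f"
    std::string referenced_type;    // qualified name of the pointee, e.g. "Foo2"
    bool referenced_is_definition;  // true if the unit can see a full definition
};

// Hypothetical model of the type described by one type unit.
struct TypeDesc {
    std::string qualified_name;     // e.g. "Foo"
    std::vector<MemberRef> members;
};

// FNV-1a, used here only as a stand-in for the MD5 a real producer would use.
std::uint64_t fnv1a(const std::string& s,
                    std::uint64_t h = 14695981039346656037ULL) {
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Identifier-based signature: depends on the type's name and nothing else.
std::uint64_t signature_from_identifier(const TypeDesc& t) {
    return fnv1a(t.qualified_name);
}

// Content-based signature in the spirit of the spec: a referenced type
// contributes its name, never its declaration/definition status.
std::uint64_t signature_from_contents(const TypeDesc& t) {
    std::uint64_t h = fnv1a(t.qualified_name);
    for (const MemberRef& m : t.members) {
        h = fnv1a(m.member_name, h);
        h = fnv1a(m.referenced_type, h);
        // m.referenced_is_definition is deliberately left out of the hash.
    }
    return h;
}
```

Modelling the `Foo` unit from main.o as `{"Foo", {{"f", "Foo2", false}}}` and the one from helper.o as `{"Foo", {{"f", "Foo2", true}}}` yields equal signatures under both functions, which matches the "different but equivalent" outcome described above.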

Ah I see. Thank you for explaining it.
I thought that with full hashing the signature would be different, since one of the DW_TAG_class_type entries has DW_AT_name, which is part of the spec, but anyway, it's not super important.