llvm-dwarfdump offsets

llvm-dwarfdump currently outputs the offset of each DIE relative to the entire debug_info section. But type/DIE references within a unit are relative to that unit.

Should we emit unit-relative offsets instead?

I’ve prototyped this and end up with output something like this:

0x00000051: DW_TAG_base_type [24]
            DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000051] = "int")
            DW_AT_encoding [DW_FORM_data1] (0x05)
            DW_AT_byte_size [DW_FORM_data1] (0x04)

0x00000058: NULL
0x000001ec: Compile Unit: length = 0x0000002b version = 0x0004 abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000021b)

0x0000000b: DW_TAG_type_unit [25] *
            DW_AT_language [DW_FORM_data2] (0x0004)

0x0000000e: DW_TAG_namespace [9] *

It is a bit weird using the same syntax to print the section-relative offset for the unit header, then unit-relative offsets for the DIEs, but other than that this seems to me like a useful improvement.
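To make the relationship between the two kinds of offset concrete, here is a minimal sketch using the numbers from the dump above. The helper names are hypothetical (they are not part of llvm-dwarfdump), and it assumes 32-bit DWARF, where the unit length field itself occupies 4 bytes.

```python
# Sketch only: helper names are hypothetical, not llvm-dwarfdump code.
# Assumes 32-bit DWARF, where the unit_length field is 4 bytes wide.
UNIT_LENGTH_FIELD_SIZE = 4


def to_section_offset(unit_offset, die_unit_offset):
    """Section-relative offset of a DIE, given its unit-relative offset."""
    return unit_offset + die_unit_offset


def to_unit_offset(unit_offset, die_section_offset):
    """Unit-relative offset of a DIE, given its section-relative offset."""
    return die_section_offset - unit_offset


def next_unit_offset(unit_offset, unit_length):
    """Section-relative offset of the unit that follows this one."""
    return unit_offset + UNIT_LENGTH_FIELD_SIZE + unit_length


unit = 0x1EC  # section-relative offset of the unit header in the dump
print(f"0x{to_section_offset(unit, 0x0B):08x}")  # 0x000001f7
print(f"0x{to_section_offset(unit, 0x0E):08x}")  # 0x000001fa
print(f"0x{next_unit_offset(unit, 0x2B):08x}")   # 0x0000021b
```

The last line matches the "next unit at 0x0000021b" printed in the unit header, so the header fields and the unit-relative DIE offsets are consistent with each other.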

(I was going to use this change to help me print out the unit-relative offsets for types in type units to test pubnames + type units - obviously in the end the type units will be in their own section, and unit-relative and section-relative will be synonymous again, but the change seems like goodness regardless.)

yes/no/maybe?

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On Behalf Of David Blaikie
Sent: Monday, November 25, 2013 7:24 PM
To: LLVM Developers Mailing List; Eric Christopher
Subject: [LLVMdev] llvm-dwarfdump offsets

llvm-dwarfdump currently outputs the offset of each DIE relative to the
entire debug_info section. But type/DIE references within a unit are
relative to that unit.

Should we emit unit-relative offsets instead?

I've prototyped this and end up with output something like this:

0x00000051: DW_TAG_base_type [24]
            DW_AT_name [DW_FORM_strp] ( .debug_str[0x00000051] = "int")

Hopefully this second '51' is really the section-relative offset into
.debug_str and only coincidentally the same as the DIE offset.

            DW_AT_encoding [DW_FORM_data1] (0x05)
            DW_AT_byte_size [DW_FORM_data1] (0x04)

0x00000058: NULL
0x000001ec: Compile Unit: length = 0x0000002b version = 0x0004
abbr_offset = 0x0000 addr_size = 0x08 (next unit at 0x0000021b)

0x0000000b: DW_TAG_type_unit [25] *
            DW_AT_language [DW_FORM_data2] (0x0004)

0x0000000e: DW_TAG_namespace [9] *

It is a bit weird using the same syntax to print the section-relative
offset for the unit header, then unit-relative offsets for the DIEs,
but other than that this seems to me like a useful improvement.

You could use a different syntax; <0x0000000e> or something?
The unit-relative offsets are not really inherently interesting,
they're basically labels for the DIEs. There's some minor potential
for confusion if you have multiple units and happen to have DIEs
at the same unit-relative offset in different units, but that
doesn't feel like it's worth worrying about.
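As a rough illustration of the bracketed syntax and of the label collision mentioned above, here is a small sketch. The function names are hypothetical, and the first unit at offset 0x0 is assumed purely for the example; only the unit at 0x1ec comes from the dump.

```python
# Sketch only: function names and the first unit's offset are hypothetical.

def format_unit_header_offset(section_offset):
    # Unit headers keep the existing section-relative "0x...:" style.
    return f"0x{section_offset:08x}:"


def format_die_label(die_unit_offset):
    # DIEs get the bracketed unit-relative style suggested above.
    return f"<0x{die_unit_offset:08x}>"


# (unit offset in section, DIE offset within unit). 0x1ec is from the dump;
# the unit at 0x0 is assumed for illustration.
dies = [(0x000, 0x0E), (0x1EC, 0x0E)]

labels = [format_die_label(die) for _, die in dies]
section_offsets = [unit + die for unit, die in dies]

print(labels)                                       # same label twice
print([f"0x{off:08x}" for off in section_offsets])  # distinct offsets
```

Both DIEs get the label <0x0000000e>, while their section-relative offsets (0x0000000e and 0x000001fa) stay distinct, which is the minor ambiguity described above.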

(I was going to use this change to help me print out the unit-relative
offsets for types in type units to test pubnames + type units - obviously
in the end the type units will be in their own section, and unit-relative
and section-relative will be synonymous again, but the change seems like
goodness regardless.)

They'd only become synonymous in .o files, they'd still be different in
a linked executable.

yes/no/maybe?

Okay with me.
--paulr

Short answer: I'm uncertain.

Longer answer:

We pretty much do that at the moment, though it's not coded up that way.
The type units are (currently) supposed to be in their own section so
in a sense we're doing what you want. The problem is that you'd want
an offset for some things, but not others. I can see wanting it for
things that are references within the same unit, but for something that's
a section offset I'd definitely want to see the whole thing. It also
seems like it could somewhat complicate figuring out what's what.
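To illustrate the "some things, but not others" point: in DWARF, the DW_FORM_ref1/2/4/8 and DW_FORM_ref_udata forms are offsets from the start of the containing unit, while DW_FORM_ref_addr is an offset from the start of .debug_info. A minimal sketch of resolving both to a section offset follows; resolve_reference is a hypothetical helper, not an LLVM API, and the sample values are taken from the dump earlier in the thread.

```python
# Sketch only: resolve_reference is a hypothetical helper, not an LLVM API.

# Reference forms whose value is relative to the start of the containing unit.
UNIT_RELATIVE_FORMS = {
    "DW_FORM_ref1",
    "DW_FORM_ref2",
    "DW_FORM_ref4",
    "DW_FORM_ref8",
    "DW_FORM_ref_udata",
}

# Reference form whose value is relative to the start of .debug_info.
SECTION_RELATIVE_FORMS = {"DW_FORM_ref_addr"}


def resolve_reference(form, value, unit_offset):
    """Return the section-relative offset of the DIE a reference points at."""
    if form in UNIT_RELATIVE_FORMS:
        return unit_offset + value
    if form in SECTION_RELATIVE_FORMS:
        return value
    raise ValueError(f"not a DIE reference form: {form}")


unit = 0x1EC  # unit header offset from the dump earlier in the thread
print(hex(resolve_reference("DW_FORM_ref4", 0x0E, unit)))      # 0x1fa
print(hex(resolve_reference("DW_FORM_ref_addr", 0x51, unit)))  # 0x51
```

A dumper that prints unit-relative DIE labels would have to pick one presentation for each of these cases, which is where the risk of confusion comes in.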

Mostly I don't think it's worth the effort and could be confusing when
you're looking for something at a particular offset.

-eric