I am a DWARF newbie. We’re upgrading to a newer LLVM, and suddenly lldb can’t print correct values for variables and llvm-dwarfdump shows no debug info for variables.
After some poking around I came to the conclusion that the problem is rooted in changes to how LLVM computes DIE sizes. Previously it used MCAsmInfo’s CodePointerSize and now it uses the target pointer size from DataLayout. HOWEVER, it encodes the address size in the compilation unit using CodePointerSize. These happen to be different on our target. I imagine calculating sizes using one value and encoding a quite different value in the compilation unit’s address size is not going to end well.
Is my suspicion correct?
What is the meaning of CodePointerSize? There is no documentation on it that helps explain how it’s used. It seems at least somewhat related to the DWARF notion of address size, since that’s what’s encoded in the compilation unit. Is it used for anything else?
Why was DWARF generation changed to use the target pointer size to compute DIE sizes when before it used the CodePointerSize? Is there an implicit assumption that target pointer size == CodePointerSize?
DWARF has several places where the spec says that a value is the size of an address on the target. There’s an implicit assumption that code addresses and data addresses are the same size. Apparently that is not always true?
Sorry I can’t provide any immediate insight regarding CodePointerSize.
It turns out that when I changed things back to using CodePointerSize to compute DIE sizes, it resolved the issue and everything works. I suspect it would also work if the target address size were encoded in the DWARF compilation unit instead of CodePointerSize. In any case, it seems wrong to use two different values in these places.
Would the correct thing to do be to encode the target address size in the DWARF compilation unit DIE?
I noticed some other targets have a different CodePointerSize than target address size, for example NVPTX, Sparc in 32-bit mode and Mips in 32-bit mode. I wonder if DWARF is also broken on these targets. I’m not even sure NVPTX supports DWARF.
CodePointerSize should be the size of a pointer to code (whatever the native notion of a pointer to code is). That should normally be the same as the size of a pointer in the address space returned by DataLayout::getProgramAddressSpace(). I guess those could be different if your target uses some exotic representation for function pointers.
There shouldn’t be any hardcoded assumption that pointers in other address spaces have the size CodePointerSize. (I think GPU targets use DW_AT_address_class to describe address spaces for data pointers.)
In the DWARF that ends up in the object file, the compilation unit header (not the compilation unit DIE) does have a field containing “the size in bytes of an address on the target architecture.” I think one question here is consistency between the place where LLVM decides what to put in that field, and what it uses elsewhere to decide the sizes of things. LLVM internally needs to use the same value for all the places that are “the size of an address on the target architecture” in the DWARF spec.
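To make the consistency point concrete, here is a toy sketch (not LLVM code) of what goes wrong when a producer sizes an address with one value while the consumer trusts a different address size from the unit header. The names ToyDie, encodeToyDie and decodeToyDie are made up for illustration, and the sizes 8 and 4 are assumptions, not the values of any real target.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` as a little-endian integer occupying `size` bytes.
inline void emitLE(std::vector<uint8_t> &out, uint64_t value, unsigned size) {
  assert(size >= 1 && size <= 8);
  for (unsigned i = 0; i < size; ++i)
    out.push_back(static_cast<uint8_t>(value >> (8 * i)));
}

// Read a little-endian integer of `size` bytes at `offset`, advancing `offset`.
inline uint64_t readLE(const std::vector<uint8_t> &in, size_t &offset,
                       unsigned size) {
  assert(size >= 1 && size <= 8 && offset + size <= in.size());
  uint64_t value = 0;
  for (unsigned i = 0; i < size; ++i)
    value |= static_cast<uint64_t>(in[offset + i]) << (8 * i);
  offset += size;
  return value;
}

// Hypothetical two-attribute DIE: an address followed by a one-byte constant.
struct ToyDie {
  uint64_t lowPC;
  uint8_t byteSize;
};

// Producer side: the address is written with the producer's idea of its size.
inline std::vector<uint8_t> encodeToyDie(const ToyDie &die,
                                         unsigned producerAddrSize) {
  std::vector<uint8_t> out;
  emitLE(out, die.lowPC, producerAddrSize);
  out.push_back(die.byteSize);
  return out;
}

// Consumer side: the address is read with the size found in the unit header.
inline ToyDie decodeToyDie(const std::vector<uint8_t> &bytes,
                           unsigned headerAddrSize) {
  size_t offset = 0;
  ToyDie die;
  die.lowPC = readLE(bytes, offset, headerAddrSize);
  assert(offset < bytes.size());
  die.byteSize = bytes[offset]; // wrong byte if the two sizes disagree
  return die;
}
```

With a producer size of 8 and a header size of 4, the consumer picks up the fifth byte of the address as the next attribute, so everything after the first address is misread. That is in the same spirit as the symptom reported at the top of the thread, although the real failure path in LLVM may differ.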
If the size of a code address and the size of a data address can be different, it’s probably best to use the larger as the “target architecture” size.
Confusingly, the size of a pointer type is up to the compiler, and doesn’t have to be the same as the address size. I’ve worked with systems that had multiple pointer sizes (usually for backward compatibility), typically allowing both 32-bit and 64-bit pointer variables on a 64-bit system with roots in 32-bit hardware.
As for NVPTX, I’ve seen patches go by that pin it to DWARF v2, possibly with some other limitations, so there is at least some support.
> I think one question here is consistency between the place where LLVM decides what to put in that field, and what it uses elsewhere to decide the sizes of things. LLVM internally needs to use the same value for all the places that are “the size of an address on the target architecture” in the DWARF spec.
It’s definitely not consistent right now.
> If the size of a code address and the size of a data address can be different, it’s probably best to use the larger as the “target architecture” size.
That’s exactly what I did and it does indeed work. Doing this would also restore the old behavior of the DWARF generation.
What I ended up doing was making AsmPrinter::getDwarfFormParams virtual and overriding it for our target, so I could put the proper value in FormParams::AddrSize. Does this seem like a reasonable solution? It keeps the existing (possibly wrong) behavior intact but targets can override it.
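For readers who want to picture the override, here is a minimal self-contained sketch of its shape. It uses stand-in classes rather than the real LLVM ones: FormParams below is a cut-down imitation of the LLVM struct, AsmPrinterSketch and MyTargetAsmPrinter are hypothetical, and the sizes 4 and 8 are placeholders rather than the values of any actual target.

```cpp
#include <cassert>
#include <cstdint>

// Cut-down stand-in for the DWARF form parameters (illustrative only).
struct FormParams {
  uint16_t Version;
  uint8_t AddrSize;
};

// Stand-in for the generic printer: by default the DWARF address size
// follows the data pointer size, mirroring the behavior described above.
class AsmPrinterSketch {
public:
  virtual ~AsmPrinterSketch() = default;

  // Hypothetical data pointer size in bytes.
  virtual unsigned getPointerSize() const { return 4; }

  virtual FormParams getDwarfFormParams() const {
    unsigned Size = getPointerSize();
    // The unit header stores the address size in a single byte.
    assert(Size <= 0xFF && "address size must fit in one byte");
    return {5, static_cast<uint8_t>(Size)};
  }
};

// Hypothetical target whose code pointers are wider than its data pointers.
class MyTargetAsmPrinter : public AsmPrinterSketch {
public:
  FormParams getDwarfFormParams() const override {
    FormParams Params = AsmPrinterSketch::getDwarfFormParams();
    Params.AddrSize = 8; // assumed code pointer size for this target
    return Params;
  }
};
```

The default path is left alone, so targets that do not override the hook see no change in behavior.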
The other option would be to have an MCAsmInfo object available everywhere FormParams are created so that CodePointerSize could be used to fill in FormParams::AddrSize. This seems more correct to me but changes current behavior.
(actually looking at code for the first time)
getDwarfFormParams calls AsmPrinter::getPointerSize(), which defers to TargetMachine::getPointerSize(0). That function has a FIXME about relying on address space 0, and a comment about not relying on the Module’s DataLayout because there might not be a Module, although TargetMachine itself has a DataLayout.
Oh, but other places in DwarfDebug.cpp use getCodePointerSize(). Hmm. Those places are indeed all emitting code addresses, so it would naively seem like the right thing to do. Hmm hmm. Some of those places have their own address-size field, but not all of them; some depend on the address-size in the compile-unit header, and the compile-unit header’s value is what determines the sizes of certain other values.
This is an intriguingly tricky problem, and I congratulate you on tripping over it.
Let me spend some time surveying how different DWARF versions deal with address sizes, and hopefully we can come up with a path forward for you.
If we over-simplify the problem to “code addresses” and “data addresses” which might have different sizes, we can get a handle on the parts of DWARF that need one or the other or both, and when LLVM should use which size parameter.
Simplistically, we can take max(code address size, data address size) and just use it everywhere. We’d want a new API, probably on TargetMachine, to return that canonical/for-DWARF-purposes size, and make sure everywhere uses it. Making LLVM internally consistent is a definite benefit here; this is probably the minimum we should do.
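As a rough sketch of that idea, not tied to the real TargetMachine interface: TargetAddressSizes and getDwarfAddressSize are hypothetical names, and an actual patch would obtain the two sizes from MCAsmInfo and the DataLayout rather than from a plain struct.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical bundle of the two sizes (in bytes) that can disagree.
struct TargetAddressSizes {
  unsigned CodePointerSize; // what MCAsmInfo would report
  unsigned DataPointerSize; // what the DataLayout would report
};

// One canonical "address size for DWARF purposes": the larger of the two,
// so that any code or data address fits in an address-sized field.
inline unsigned getDwarfAddressSize(const TargetAddressSizes &Sizes) {
  assert(Sizes.CodePointerSize > 0 && Sizes.DataPointerSize > 0);
  return std::max(Sizes.CodePointerSize, Sizes.DataPointerSize);
}
```

If every DWARF-emitting site used a single function like this, the unit header and the DIE size computation could not drift apart, whichever of the two sizes happens to be larger.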
However, if the two sizes are different, there are places where this creates a size cost, and maybe we don’t want to pay that cost. This gets a little finicky…
So, which sections of DWARF need which size? The exact set of sections, and whether they are independent of .debug_info, varies with the DWARF version. My survey discovered the following.
Section            DWARF version   Needs
-----------------  --------------  --------------------------
.debug_info        all             max
.debug_loc         v2-v4           inherited from .debug_info
.debug_ranges      v2-v4           inherited from .debug_info
.debug_loclists    v5+             max (*)
.debug_rnglists    v5+             code
.debug_line        v2-v4           inherited from .debug_info
.debug_line        v5+             code
.debug_frame       v2-v3           implicit
.debug_frame       v4+             code
.debug_aranges     all             code

(*) could use code for ranges and data for expressions, but there is no way to specify two address sizes in one header
So, if the code-address size is smaller than the data-address size, it could save some space in some sections for v5 onward, if we picked the right places to use that smaller size, and got it right. Looking at the survey results, the benefit intuitively seems not huge, although code addresses in .debug_line might start to add up. I’m willing to be convinced by data.
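A back-of-the-envelope way to put a number on that, as a sketch: bytesSaved is a hypothetical helper, and the count of full-size code addresses in a section is something one would have to measure on real output, not something estimated here.

```cpp
#include <cassert>
#include <cstdint>

// Bytes saved in one section if its code addresses were encoded with the
// code-address size instead of max(code size, data size).
// `numCodeAddresses` counts only addresses stored at full width.
inline uint64_t bytesSaved(uint64_t numCodeAddresses, unsigned codeSize,
                           unsigned dataSize) {
  assert(codeSize > 0 && dataSize > 0);
  unsigned maxSize = codeSize > dataSize ? codeSize : dataSize;
  return numCodeAddresses * (maxSize - codeSize);
}
```

For instance, 1000 full-width code addresses with 2-byte code pointers and 4-byte data pointers would save 2000 bytes. When code addresses are the same size as data addresses, or larger, the saving is zero.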
> Oh, but other places in DwarfDebug.cpp use getCodePointerSize(). Hmm.
Yep, you got it. It’s a conundrum for sure. I am willing to create a patch implementing a new TargetMachine API as you describe and have it return the max by default. It sounds like eventually we want finer-grained APIs for the various different contexts in DWARF. That’s getting a bit away from my core job so not sure I could justify the time spent on it.
That would solve your problem and create better consistency within LLVM. Happy to review such a patch, although I’m about to be away for a week. The refinement to optimize when code addresses are smaller than data addresses can be done later if the size benefit seems to be worthwhile.