[RFC][Dwarf Library] Relocations for DWO sections

Hello

I observed that when a DWARF context is created for a DWO object (split DWARF single mode), relocations for .debug_info are processed and stored in a map. This adds quite a bit of memory overhead, and it doesn’t seem to be needed for a DWO context, i.e. a context created through the DWARFContext::getDWOContext API. Am I missing something?

Illustrative patch to fix this:
https://reviews.llvm.org/D106624

Thank you,
Alex

General premise sounds correct to me (that we shouldn’t be processing those sections, etc). I’ve replied to the patch - thanks for taking a look at this!

(out of curiosity: What are you using Split DWARF single mode for (if you can speak to the application)?)

Thanks for replying in the patch. Left my reply.
We are using it to deal with DWARF relocation overflows. We considered DWARF64, but split DWARF seems like a more traveled path. As for single vs. split, my understanding is that single plays nicer with our build system ATM.
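For readers unfamiliar with the trade-off mentioned here: in the 32-bit DWARF format, section offsets are 4 bytes wide, so a relocation against an offset of 4 GiB or more cannot be represented, while DWARF64 units announce themselves with an escape value in the initial length field and use 8-byte offsets. A minimal sketch of that distinction follows; the function names are hypothetical and are not part of any LLVM API.

```python
import struct

# Initial-length escape value that marks a 64-bit DWARF unit.
DWARF64_ESCAPE = 0xFFFFFFFF


def parse_initial_length(data, little_endian=True):
    """Return (format, unit_length, bytes_consumed) for a unit header."""
    prefix = "<" if little_endian else ">"
    (length32,) = struct.unpack_from(prefix + "I", data, 0)
    if length32 == DWARF64_ESCAPE:
        # DWARF64: the real length is the 8-byte value that follows.
        (length64,) = struct.unpack_from(prefix + "Q", data, 4)
        return ("DWARF64", length64, 12)
    if length32 >= 0xFFFFFFF0:
        # 0xfffffff0..0xfffffffe are reserved by the DWARF standard.
        raise ValueError("reserved initial length value")
    return ("DWARF32", length32, 4)


def fits_in_dwarf32(section_offset):
    """True if an offset can be stored in a 4-byte DWARF32 field."""
    return 0 <= section_offset < (1 << 32)
```

A section that grows past 4 GiB makes `fits_in_dwarf32` false for offsets near its end, which is the situation that either DWARF64 or splitting the debug info into .dwo files is meant to avoid.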

Thanks for replying in the patch. Left my reply.
We are using it to deal with DWARF relocation overflows.

Ah, that’s good to know. FWIW we’ve started to hit some overflows even in Split DWARF on larger binaries (and/or those making especially heavy use of expression templates, which create an exceptional amount of DWARF and long symbol names).

Which section(s) did you manage to overflow? We’re dealing with .debug_str[.dwo] overflow in particular. A couple of ideas I’m looking into to address that particular overflow are:
Simplified template names ( https://lists.llvm.org/pipermail/llvm-dev/2021-June/150903.html ) - emit only the base name (“foo”) of a template rather than the name with all its template parameters (e.g. “foo<int>”) - and then reconstruct the full name by using the DW_TAG_template_type_parameter DIEs, etc.
Reconstituted mangled names ( https://groups.google.com/g/llvm-dev/c/2jMqDjdChuQ/m/HpOpWy8pAwAJ ) - skip mangled names when they can be reconstituted from the DWARF structural representation (e.g. “void f1(int) { }” → “_Z2f1i”, but we could build the latter from DWARF’s representation that says f1 has one “int” parameter).
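To make the second idea concrete, here is a minimal sketch of rebuilding an Itanium-ABI mangled name from a structural description. It assumes a non-template function in the global namespace whose parameters are all builtin types; the function name and the type table are illustrative only and are not taken from the proposal.

```python
# Itanium C++ ABI codes for a few builtin types (illustrative subset).
BUILTIN_CODES = {
    "void": "v",
    "bool": "b",
    "char": "c",
    "short": "s",
    "int": "i",
    "unsigned int": "j",
    "long": "l",
    "float": "f",
    "double": "d",
}


def mangle_free_function(name, param_types):
    """Mangle a global, non-template function with builtin parameters.

    The return type is not encoded for non-template functions, and an
    empty parameter list is encoded as a single 'v'.
    """
    if param_types:
        params = "".join(BUILTIN_CODES[t] for t in param_types)
    else:
        params = "v"
    # "_Z" prefix, then <length><name>, then the parameter type codes.
    return f"_Z{len(name)}{name}{params}"
```

For example, `mangle_free_function("f1", ["int"])` yields `_Z2f1i`, matching the example above. Namespaces, templates, pointers, and substitutions would need considerably more logic than this sketch has.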

We considered DWARF64, but split DWARF seems like a more traveled path. As for single vs. split, my understanding is that single plays nicer with our build system ATM.

Ah, fair enough.

Haven’t seen overflows in Split DWARF yet, but thanks for letting me know, and the links to discussions. Is there a plan to productize either one or both?

For us, in the monolithic format, it was .debug_info that was growing too large, with relocations into or out of it failing. The .debug_aranges relocations into it failed, and I don’t quite remember off the top of my head which section the outgoing relocations pointed into. I think it was .debug_loc.

Alex

Haven’t seen overflows in Split DWARF yet

Careful, as they’re totally silent (at least with gold dwp, and probably also with llvm dwp) - the str_offsets get overflowed values, and then when the data is read by the DWARF consumer, the strings end up corrupted - because you’re reading from arbitrary/incorrect offsets.
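The failure mode described here can be illustrated with a small simulation. The sketch below assumes the overflowed value simply wraps modulo 2^bits (the thread does not say exactly how the tools truncate), and it uses an 8-bit offset width in the demo so the section stays tiny; all names are hypothetical.

```python
def truncate_offset(offset, bits=32):
    """Keep only the low `bits` bits, as a fixed-width field would."""
    return offset & ((1 << bits) - 1)


def read_cstring(section, offset):
    """Read a NUL-terminated string starting at `offset`."""
    end = section.index(b"\0", offset)
    return section[offset:end].decode()


# A toy string section: "first", a long filler string, then "wanted".
section = b"first\0" + b"x" * 250 + b"\0" + b"wanted\0"
true_offset = section.index(b"wanted")  # 257, which does not fit in 8 bits

# Pretend the offsets table is only 8 bits wide (assumption for the demo).
stored_offset = truncate_offset(true_offset, bits=8)

intended = read_cstring(section, true_offset)  # the string the producer meant
observed = read_cstring(section, stored_offset)  # what a consumer would read
```

Here `intended` is `wanted` but `observed` is `irst`: the wrapped offset lands in the middle of an unrelated string, and nothing in the data signals that anything went wrong.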

, but thanks for letting me know, and the links to discussions. Is there a plan to productize either one or both?

Yep, the plan on both counts is to upstream them. I have the simplified template names implementation on the go at the moment - adding a flag to clang that implements the functionality, but also implements a “mangled” mode, where if a name should be able to be simplified, it’s instead emitted in full with a special prefix (“_STN”) - and then the consumer can attempt to reconstitute that name and compare it against the name provided (& the llvm-dwarfdump --verify mode does this checking and fails if they don’t match). So I’m going through lots of cases, either adding the rebuilding logic that’s needed, or modifying the frontend not to simplify/mark certain names that can’t be rebuilt.
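The round-trip check described here could look roughly like the sketch below. Only the “_STN” prefix comes from the message above; the exact layout of the text after the prefix, the way the base name is extracted, and all function names are assumptions made for illustration.

```python
STN_PREFIX = "_STN"


def rebuild_template_name(base_name, template_args):
    """Rebuild a full name from a base name and template argument names."""
    if not template_args:
        return base_name
    return f"{base_name}<{', '.join(template_args)}>"


def verify_marked_name(die_name, template_args):
    """Check a prefixed name against the name rebuilt from its parameters.

    Returns None for names without the prefix (nothing to verify),
    otherwise True or False depending on whether the rebuilt name matches.
    Assumes (hypothetically) that the full name directly follows the prefix.
    """
    if not die_name.startswith(STN_PREFIX):
        return None
    full_name = die_name[len(STN_PREFIX):]
    base_name = full_name.split("<", 1)[0]
    return rebuild_template_name(base_name, template_args) == full_name
```

With this layout, `verify_marked_name("_STNfoo<int>", ["int"])` succeeds, while a mismatch between the recorded name and the template parameters found in the DWARF makes it return False, which is the kind of case a verifier would report.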

For us, in the monolithic format, it was .debug_info that was growing too large, with relocations into or out of it failing. The .debug_aranges relocations into it failed, and I don’t quite remember off the top of my head which section the outgoing relocations pointed into. I think it was .debug_loc.

Huh, fascinating. Good to know!

- Dave

Somewhat different topic.
Have you seen multiple skeleton CUs having the exact same DWO ID, but pointing to two different .dwo files with the exact same debug information? This is produced by ThinLTO. My understanding is that this is not legal.

P.S. I updated the patch, https://reviews.llvm.org/D106624, with all suggestions. I kept it as one patch for now, until all parts are in. Then I can break it up.

Somewhat different topic.
Have you seen multiple skeleton CUs having the exact same DWO ID, but pointing to two different .dwo files with the exact same debug information? This is produced by ThinLTO. My understanding is that this is not legal.

Yeah, we’ve seen that a few times. I have yet to see it due to a compiler bug - so far it has been due to “interesting” builds. The most recent was a case of using “gmlt”/g1 + split DWARF, where two files differed only by global variables. Since global variables don’t have any DWARF emission under -g1, the only thing remaining was a static function with the same name in both files, so the hashes were identical.
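The collision described here can be sketched as follows. The hash below is a stand-in (MD5 over a textual listing of the emitted entries), not LLVM’s actual DWO ID computation, and the unit representation and function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def emitted_under_g1(decls):
    """Drop global variables, which get no DWARF under -g1 per the thread."""
    return [f"{kind}:{name}" for kind, name in decls if kind != "variable"]


def dwo_id(emitted_entries):
    """Stand-in 64-bit ID derived from the emitted debug info content."""
    digest = hashlib.md5("\n".join(emitted_entries).encode()).digest()
    return int.from_bytes(digest[:8], "little")


def find_duplicate_ids(units):
    """Map each ID shared by more than one unit to the files that use it."""
    by_id = defaultdict(list)
    for path, entries in units.items():
        by_id[dwo_id(entries)].append(path)
    return {i: paths for i, paths in by_id.items() if len(paths) > 1}


# Two source files that differ only in their global variables.
a = [("function", "helper"), ("variable", "g_a")]
b = [("function", "helper"), ("variable", "g_b")]
units = {"a.dwo": emitted_under_g1(a), "b.dwo": emitted_under_g1(b)}
duplicates = find_duplicate_ids(units)
```

Because the only surviving entry is the identically named static function, both units hash to the same ID and `duplicates` contains a single entry listing `a.dwo` and `b.dwo`.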

I’ve not decided what to do about that yet (it’s not high on my list to think about, but rolling around there from time to time) - maybe to just allow this/no longer warn in the dwp tool if there are these cases where it’s reasonable/correct to have duplicate units…

There were some other cases I came across years ago. Both the old and new cases were in ffmpeg, FWIW: they build the same files multiple times with different defines (to emit wide and narrow versions of the interface, I think?), so it’s easier for the split unit/hash to become identical due to that idiom. I think the older cases did involve fixes to ffmpeg’s build so that it no longer built files when certain features weren’t enabled, since that had made the files basically empty (& thus their split units identical)…

Partly the issue is that the dwp tool doesn’t have the same behavior as the real linker - it doesn’t discard things that the real linker discards because they’re unreferenced. Arguably that build could be fixed to avoid the duplication or empty object files (e.g. an object file containing only a global ctor - if that’s in a library, it’ll never actually be used/linked in).

P.S. I updated the patch, https://reviews.llvm.org/D106624, with all suggestions. I kept it as one patch for now, until all parts are in. Then I can break it up.

Yeah, I’m looking at it, but getting myself a bit muddled up. See if I can get my head on straight.

Ah, I see. Thanks for the explanation. For me it’s in the context of BOLT patching debug information of split DWARF units. Let me see if this is one of the cases you have mentioned.

Thank You
Alex