Debug-info round table notes

Hi all,

Thanks to everyone who came to the debug-info round table, and/or the optimised debug-info round table. I’ve written up notes about the two discussions below for dissemination / discussion. For clarity, the former was for all topics related to debug-info, while the latter was specifically about how LLVM represents things and attempts to preserve information during optimisation.

Apologies that I didn’t get everyone’s names; I’ve filled in the ones that I know.

Debug-info round table topics:

  • Identical code folding – how should the DWARF for merged functions be represented, and how should it be interpreted? The consensus was that these functions can’t usually be represented in DWARF, and that there’s a proposal for the standard to explicitly identify these functions with a reserved address value: DWARF Issue . Consumers can, however, attempt to disambiguate the current code location once a merged function is encountered, by walking up the call stack and using call site information to determine which function the caller originally intended to call. One suggestion was made that another DWARF standard proposal, adding “location views”, could be used to describe multiple source locations per instruction in functions that have been merged, however this was not explored in detail.
  • (Vince?) from MediaTek highlighted an issue where line numbers were dropped even at -O0. We didn’t explore this at the time, as no-one was familiar with it, but on looking into it later, it appears that architectures without FastISel will use SelectionDAG, which deliberately drops the source location for constant materializations because they might be merged. IMO the problem is more that optimisations are happening at -O0, rather than a general debug-info issue.
  • Michael from Nintendo says .eh_frame has got bigger in recent releases, and asked if anyone else had observed an issue. While people agreed that the contents of .eh_frame were related to debug-info, no-one else had noticed a significant increase lately.
  • @dblaikie discussed the use of the .debug_aranges section. It is (as I understand it) purely a different way of accessing information that already exists, which David argues needlessly wastes space. Ideally, no-one would consume this section and LLVM would not bother to produce it. A few people mentioned they’d pass this on to their debugger teams, but it didn’t feel like anyone is in a great rush to eliminate uses of this section.
  • I then advertised llvm-debuginfo-analyzer a little – for those not familiar, it’s a tool (previously called ‘DVA’ or ‘DIVA’) for displaying the meaning of debugging information, rather than just the syntax of DWARF or CodeView. This allows developers to compare the difference in meaning between two binaries, without being distracted by differences in how the information is encoded – for example between CodeView, DWARF, and other compilers (Examples one two three). There’s a stack of patches here that are under review: D125776 [llvm-debuginfo-analyzer] 01 - Interval tree. There are some final outstanding issues to address; more review, especially of the CodeView patch, would be most welcome.
  • I brought up constructor homing for type information – this has delivered excellent reductions in debug-info size in various places, however there are some slightly iffy C++ situations where types are no longer generated; more detail here: Keeping vtable homing for debug-info type generation. I think everyone agreed that constructor homing was great, and there was appetite for additional attributes (or similar) that would make it more usable for not-completely-conformant C++.
  • Someone asked: Any plans for further reduction of debug-info size? @dblaikie had a number of suggestions, for example simplified symbol template names [DWARF] using simplified template names , which gdb and lldb are known to be OK loading. I believe development of this continues. Other suggestions include homing types during ThinLTO to achieve the space savings of type units, without the linker overhead. LLVM also recently gained support for zstd compression of DWARF (98% sure I heard it right, that it exists now, not “soon”).
  • Finally I demonstrated the output of a Dexter test suite that we have in-house, where some five hundred variables in a large C++ code base have been instrumented and are regularly tested to see how accurate they are. The overall theme is that variable location coverage increased a lot from llvm-8 to llvm-10, including more incorrect locations, and there have been smaller improvements since.
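
To make the identical code folding point concrete, here is a minimal sketch of the "walk up the call stack" idea as a toy model. Everything in it is hypothetical: DebugIndex, resolveFunction, and the two maps are invented for illustration and are not an API of any debugger or of LLVM. The assumption is that a consumer has already extracted, from the DWARF, which source functions share an address and which callee each call site's return address was meant to reach (for example from call site entries).

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified view of what a consumer might extract from DWARF.
struct DebugIndex {
  // Start address of a function -> every source function folded into it.
  std::map<std::uint64_t, std::vector<std::string>> FunctionsAtAddress;
  // Return address of a call site -> the function the source intended to call.
  std::map<std::uint64_t, std::string> CallSiteOrigin;
};

// Name the function executing at FuncAddr, using the return address found in
// the caller's frame to break ties. Returns an empty string when the answer
// cannot be determined from the index.
std::string resolveFunction(const DebugIndex &Index, std::uint64_t FuncAddr,
                            std::uint64_t ReturnAddr) {
  auto Funcs = Index.FunctionsAtAddress.find(FuncAddr);
  if (Funcs == Index.FunctionsAtAddress.end() || Funcs->second.empty())
    return "";
  // Not folded: only one candidate, nothing to disambiguate.
  if (Funcs->second.size() == 1)
    return Funcs->second.front();
  // Folded: ask the call site which callee the caller originally named.
  auto Site = Index.CallSiteOrigin.find(ReturnAddr);
  if (Site == Index.CallSiteOrigin.end())
    return "";
  for (const std::string &Name : Funcs->second)
    if (Name == Site->second)
      return Name;
  return "";
}
```

In this toy, a call made without a recorded call site (an indirect call, say) simply stays ambiguous, which matches the "attempt to disambiguate" wording above rather than promising an answer in every case.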

On to the optimised debug-info round table – more people turned up for this than I expected, probably because I advertised that it was about eliminating dbg.value instructions. While eliminating those instructions doesn’t have an effect on the debugging experience of developers, it does alter what information we can represent in the compiler, and has a large effect on how variable locations are tracked in the presence of optimisations, hence the relevance.

I worked through an example piece of LLVM-IR and what difficulties we would face if we moved dbg.value intrinsics to be somewhere else, see below, illustrating how debugging information might be stored differently (not the final design, and only to demonstrate the relationships between elements).

InstList                     DbgValueList
==============================================
   v              ----------> dbg.value(%arg0,
add %1, %2 ------/               v
   v                          dbg.value(%3,
   v                 -------> dbg.value(%4,
sub %3, %4 ---------/            v
   v                     ---> dbg.value(%5,
ret 0 ------------------/

All of the problems come from the fact that if the variable information does not have an instruction iterator, and thus its own position in the block, some operations become ambiguous or poorly defined. For example:

  • If some code calls Instruction::moveBefore or moveAfter on the sub above, should the attached variable location information travel with the moved instruction, or be left behind?
  • If I try to splice one block into the start of another block, should the new instructions go before or after the first dbg.value that comes before the add instruction at the start of the block?
  • If the terminator is deleted as an intermediate step of optimisation, and more instructions are inserted at the end of the block, do variable assignments at the end of the block come before or after those new instructions?

Whenever we have iterators for dbg.value intrinsics, these questions don’t come up, or at least aren’t ambiguous:

  • Code that wishes to move “all” the contents of a block will call moveBefore or moveAfter on the dbg.value intrinsics, as it steps through the contents of InstList,
  • getFirstInsertionPt will return the iterator for the first dbg.value if the caller intends on inserting at the start of the block,
  • Instructions inserted after the last instruction in a block by “moveAfter” would come before dbg.values, while inserting at end() would come after, because the dbg.values have their own position.

My solution to these problems would be to change the instruction movement / insertion API so that callers express the intention behind what they’re doing, a “disposition”. Exactly what needs to be expressed is still unclear, but for example moveBefore would want to know “Is this movement intended to alter the order in which instructions execute?”, as a proxy for whether the variable information should move. I haven’t nailed down exactly what changes should be made yet.
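
As a rough illustration of what a "disposition" could look like, here is a toy model. It is not LLVM's API and not the prototype's design: Inst, Block, MoveDisposition, and this moveBefore are all hypothetical names. It also assumes one particular policy, namely that a move which reorders execution leaves the variable location records at their old position, while a move that preserves order carries them along.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical toy model: each instruction owns the variable location records
// that sit immediately before it in the block.
struct Inst {
  std::string Name;
  std::vector<std::string> DbgRecords;
};

using Block = std::list<Inst>;

// Hypothetical "disposition": the caller states what the movement means.
enum class MoveDisposition {
  ReordersExecution, // e.g. hoisting or sinking a single instruction
  PreservesOrder     // e.g. relocating code without changing what runs when
};

// Move the instruction at From so that it sits before To, within block B.
// Assumed policy: when execution order changes, the records stay where they
// were by being handed to the instruction that used to follow From.
Block::iterator moveBefore(Block &B, Block::iterator From, Block::iterator To,
                           MoveDisposition D) {
  assert(From != B.end() && "cannot move the end iterator");
  if (From == To || std::next(From) == To)
    return From; // already in position, nothing to do
  if (D == MoveDisposition::ReordersExecution) {
    auto Next = std::next(From);
    // If From is the last instruction there is nowhere to leave the records;
    // this toy keeps them attached, which is one of the open questions.
    if (Next != B.end()) {
      Next->DbgRecords.insert(Next->DbgRecords.begin(),
                              From->DbgRecords.begin(),
                              From->DbgRecords.end());
      From->DbgRecords.clear();
    }
  }
  B.splice(To, B, From); // relinks the node; iterators stay valid
  return From;
}
```

With a three-instruction block, moving the middle instruction to the front under ReordersExecution leaves its records in front of the last instruction, whereas PreservesOrder takes them along. The point of the sketch is only that the caller has to say which of the two it means.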

The questions I asked everyone at the round table were “does this sound feasible?” and “are you sympathetic to this kind of direction?”. After this there was unstructured discussion, which I’ll write up as bullet points with questions / responses.

  • (Didn’t see who) said people are generally sympathetic to making debug-info better, but don’t know enough about it to improve it in their area of the compiler.
    • In my opinion, encoding information about movement dispositions should achieve that.
  • We should have good and reasonable defaults.
    • This is something I moderately disagree with (joined by @rnk) – if we’re going to consider debug-info as something very important, and it’s possible for the default behaviour to harm debug-info, then people should be obliged to consider the debug-info behaviour when they modify the IR. Obviously, making this not-too-invasive is a matter of art.
  • @arsenm points out there are other ways that debug-info can interfere with optimisations.
    • The specific example of debug-info appearing in Value Use lists should be avoidable by using ValueAsMetadata, but it’s something that will need consideration.
  • Jerome (didn’t get your surname sorry) asked about the performance of the prototype I’ve written, which goes a couple of percent slower.
    • I think everyone’s assumption is that LLVM will surely go faster with less pointer chasing, however that’s not the objective of the prototype we’ve worked on. Certainly it’s something to monitor.
  • @rnk pointed out that a lot of the performance benefits can be achieved immediately by combining hundred-long runs of dbg.values into a mega.dbg.value intrinsic or something.
  • I think we all agreed that how a new design might be represented textually is a bike-shed topic to be avoided.

I don’t think there were any concrete conclusions from this, but I found it very helpful to hear the kinds of topics that concern people. My plan at the moment is to come up with a mock design for what the required API changes might look like. I believe that this is very much a case of just “software engineering” and changing how some facts are expressed in the code, rather than actual “compiler engineering”. More news on this later.

Michael from Nintendo took a moment to ask whether global-isel for aarch64 produces better or worse debug-info than SelectionDAG. I don’t think there were any strong opinions on whether one was objectively better or worse than the other right now.

Discussion then moved on to how we can use is_stmt better – context is this thread: [DebugInfo] An idea for determining source-location orders after optimisation (aka: plumbing for is_stmt). I presented a scenario where we drop source locations unnecessarily today, and how we might avoid doing that in the future by using DWARF’s is_stmt flag to distinguish between attribution of instructions-to-source-line and the stepping behaviour of the unoptimised program. However this wasn’t well understood (and/or presented), so we didn’t make much progress.

There were however a lot of fruitful discussions about

  • Exactly what consumer expectations are when it comes to source locations (profiling is another special case),
  • The fact that you can’t perfectly represent the source program in the optimised program anyway,
  • Whether existing consumers like lldb and gdb interpret is_stmt in a meaningful way, or whether some new way of communicating information is needed.

None of which I really wrote any notes about, sorry. Either way, I think we all continue to understand that there are limitations to LLVM’s source location coverage at the moment that could be improved, but the path to do that isn’t clear.


Thanks,
Jeremy

Thanks for moderating/note-taking/etc!

One suggestion was made that another DWARF standard proposal, adding “location views”, could be used to describe multiple source locations per instruction in functions that have been merged, however this was not explored in detail.

My understanding is that GCC’s location views are for describing locations without instructions (eg: x = 1; x = 2; func(); you want to step and observe the x == 1 state even though it’s been eliminated as a dead store, before observing the x == 2 state - so there’s an artificial stop point inserted that doesn’t change the instruction/program counter, but changes the state the debugger uses to reconstitute types/points to a different line of source code), rather than instructions with multiple locations, so I don’t /believe/ it’s applicable here, but possibly I’ve misunderstood it and there’s more/different power available.
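
Spelled out as compilable code, the scenario I have in mind looks something like the sketch below. The names example, func, and Seen are made up for illustration, and the comments describe my understanding of what a location view would offer rather than what any particular compiler emits.

```cpp
// Records the value func() was called with, so the example has an
// observable effect.
int Seen = 0;

void func(int Value) { Seen = Value; }

int example() {
  int x;
  x = 1;   // Dead store: x is overwritten before it is read, so an optimiser
           // is free to delete it, leaving no instruction for this line. A
           // location view would give the debugger an artificial stop point
           // here, at the same program counter as the next line.
  x = 2;   // The surviving store; stepping would observe x == 1, then x == 2.
  func(x);
  return x;
}
```

Note that this is a location with no instruction of its own, which is the opposite shape of problem from one instruction carrying several locations.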

I think if people are interested in getting more source location information in cases of ambiguity (multiple locations for a single instruction), hoisting/sinking (locations moving across a BB boundary), etc - they really should get involved in the DWARF committee, at least peripherally - check out Cary Coutant’s Two Level Line Table proposal ( TwoLevelLineTables - wiki.dwarfstd.org ) and see if that addresses the needs, and if it doesn’t, provide feedback on how it could be modified to do so - it’s the current big line table potential project/major rework and it’d be good to factor in these use cases into that work.

LLVM also recently gained support for zstd compression of DWARF (98% sure I heard it right, that it exists now, not “soon”).

Correct, already available under -gz=zstd I believe.

I later realized my confusion came from the representation being different in codegen: debug values do appear on the use lists of virtual registers, even if they don’t for IR values.

DWARF Issue is the location view numbering proposal. I agree with your interpretation, it doesn’t seem like this proposal is going to help with handling instructions that have multiple locations.