[RFC] Heterogeneous Debug Info

I gave a talk at the 2022 LLVM Dev Conference titled “Heterogeneous Debug Metadata in LLVM” and as promised I’m following up with this RFC.

The name is a bit misleading, in that this proposed approach is intended to replace aspects of the existing LLVM debug metadata, not just supplement it for certain targets. This is aimed at improving debug support for all targets, not just AMDGPU!

The context and details of the proposed changes are already described in a document open for review at D138869 [Docs][RFC] Add AMDGPU LLVM Extensions for Heterogeneous Debugging. To avoid duplication, I will not expand on the details here. However, feel free to reply in either that Phabricator review or this Discourse topic; I will monitor both and keep both up-to-date.

I will also link to the recorded talk here once it is available and will begin uploading patches implementing aspects of this proposal in the coming days, to give everyone some concrete code to play around with.

I would greatly appreciate any and all feedback on the proposed changes, and I hope these ideas can make LLVM debug information even better!

Hi Scott, looks like really intriguing stuff. As mentioned on the review, it’s really too big a lump to digest and review all in one go. But the lifetime and fragment stuff from the talk looks hugely beneficial, just from a quick skim, and I’m looking forward to improvements in the optimized debugging experience as a result of this work!

I’d like to echo Paul’s comment. I think many of the goals of the proposal are great, but there are too many moving parts at once for this to be properly reviewed. Can you break up the proposal into smaller (ideally individually useful) proposals and perhaps explain where they fit into the big picture if necessary?

Hi Scott, I just noticed this topic existed so figured I’d reply here,

One of the additional things enabled by these proposals is the ability to write to variables that have been subject to optimisation. In the past I’ve always assumed that having that ability wasn’t especially important to debug-info consumers: for optimised code, writing into registers or memory to set the value of a variable might have unpredictable effects on other variables and unexpectedly affect the operation of the function. A potentially large number of other variables could have been merged with the one being set. As far as I’m aware, writing to optimised variables isn’t a feature that developers are currently requesting.

Is this a use case that other debug-info consumers out there are interested in? There’s a risk that we put effort into supporting this when it isn’t widely needed.

Obviously writing to variables is a useful feature for unoptimised code, but as far as I’m aware LLVM’s debug-info production at O0 works fine as it is.

I can try to source some more concrete requests for this at AMD, but in the meantime I would just argue that what “unoptimized” means varies between targets, and for AMDGPU we still need to do some work even at -O0 to get to workable code. In some embedded contexts there are also constraints like limited space for the code image or minimum performance guarantees required for realtime processing.

If writing variables is a useful feature with unoptimized code, I am not sure I see how that value dissipates at higher optimization levels. So long as the compiler can describe the constraints accurately, why shouldn’t it do so?

If the size of the resulting debug info is a concern, we can always support more levels of “reduced debug info” which reflect the current behavior.

If maintenance of the code in LLVM is a concern, I would argue the approach is much simpler in many respects and would actually result in a net improvement.

I’m also working to reconcile our proposed changes with the other improvements you pointed out are in-flight already, and I suspect there may be very little cost to supporting variable writing generally. For example, with the AssignID work we will already be implicitly tracking multiple locations; rather than flattening that to one “preferred” location after ISel it seems like we could just as easily determine the set of all valid locations, and determine whether it is read-only or read-write.
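As a rough illustration of that last idea, here is a minimal sketch in Python of classifying a variable from its set of valid locations. Everything in it is hypothetical: the `Location` type, the `classify` helper, and in particular the rule that a variable is writable only when all of its locations are storage and none is shared with another variable. That rule is an assumption made for the example, not something the proposal or the AssignID work specifies.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    REGISTER = "register"
    MEMORY = "memory"
    CONSTANT = "constant"  # implicit value: there is no storage to write to


@dataclass(frozen=True)
class Location:
    kind: Kind
    name: str  # hypothetical label, e.g. "v0" or "stack+8"


def classify(var, locations_by_var):
    """Return 'unavailable', 'read-only' or 'read-write' for `var`.

    Assumed rule (not taken from the proposal): a variable is writable only
    if every valid location is storage (register or memory) and none of
    those locations is also a valid location of another variable.
    """
    locs = locations_by_var.get(var, set())
    if not locs:
        return "unavailable"
    if any(loc.kind is Kind.CONSTANT for loc in locs):
        return "read-only"
    for other, other_locs in locations_by_var.items():
        if other != var and locs & other_locs:
            return "read-only"  # a write would also change `other`
    return "read-write"


# Hypothetical snapshot of valid locations at one program point.
live = {
    "x": {Location(Kind.REGISTER, "v0"), Location(Kind.MEMORY, "stack+8")},
    "y": {Location(Kind.CONSTANT, "42")},   # folded to a constant
    "a": {Location(Kind.REGISTER, "v1")},
    "b": {Location(Kind.REGISTER, "v1")},   # merged with "a"
}

for name in ("x", "y", "a", "b"):
    print(name, classify(name, live))
```

In this toy model a debugger writing to `x` would have to update both of its locations, while `a` and `b` stay read-only because a write to the shared register would change both.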

Also, the recorded talk went up at 2022 LLVM Dev Mtg: Heterogeneous Debug Metadata in LLVM - YouTube :slightly_smiling_face:

[Long Christmas-based break sorry],

I agree that being able to write to variable locations is worthwhile; it’s more a matter of prioritisation though, in that I think there’s less motivation to restructure metadata to write to locations if the benefits are small. That being said, if there are mandatory optimisations for AMDGPU that are getting in the way of debuggability at -O0 then that’s great motivation for improvements to support those optimisations.

One of the other areas of interest here is using def/kill intrinsics for modelling the lifetime-ranges of variables. Are these intended to completely (or largely) replace dbg.value intrinsics? I think separating the lifetime of an assignment (or definition, my language is fuzzy here sorry) to a variable from the value of that assignment/definition could get us a lot of benefits, as there can be many ways of expressing the same Value. However, it’s not obvious to me that the well-formedness rules can survive in the presence of optimisations to the CFG. Passes like jump threading can clone or move blocks (containing defs and/or kills) and place them in a parallel code path, and branch folding / tail duplication can produce additional paths through those intrinsics. If there need to be strong guarantees about dbg.defs dominating dbg.kills, for example, these are liable to be broken during optimisations.

One of the benefits of dbg.value intrinsics being coupled with the lifetime of variable assignments is that when blocks get shuffled around during optimisations, the dominance frontier of the source variable’s value doesn’t need maintenance: the last dominating dbg.value gives the variable’s value, and LiveDebugValues can often find locations for PHIs if there’s no dominating dbg.value. If we switched to manually tracking the lifetime of a variable’s definition, wouldn’t we be forced to potentially perform debug-info maintenance whenever an instruction moves? This risks a lot of extra compile-time burden.
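To make the dominance concern concrete, here is a small sketch in Python. It is not LLVM code: the CFGs are hand-built dictionaries, the block names are made up, and the "after" graph only imitates what tail duplication might do to a block holding a def. It checks whether the block containing a def dominates the block containing the matching kill.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {block: [successors]}."""
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def def_dominates_kill(cfg, entry, def_block, kill_block):
    """True if every path from entry to kill_block passes through def_block."""
    return def_block in dominators(cfg, entry)[kill_block]


# Before: both arms of a diamond rejoin in D (holds the def), then reach K (holds the kill).
before = {"entry": ["A", "B"], "A": ["D"], "B": ["D"], "D": ["K"], "K": []}

# After (hypothetical): D has been duplicated into each predecessor, so there
# are now two defs and neither of their blocks dominates K.
after = {
    "entry": ["A", "B"],
    "A": ["D_a"], "B": ["D_b"],
    "D_a": ["K"], "D_b": ["K"],
    "K": [],
}

print(def_dominates_kill(before, "entry", "D", "K"))
print(def_dominates_kill(after, "entry", "D_a", "K"))
print(def_dominates_kill(after, "entry", "D_b", "K"))
```

In the second graph every path to `K` still passes through some copy of the def, but no single copy dominates the kill, so a rule phrased as "the def dominates the kill" would need repair after such a transform.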

Sorry for the delay in responding, I’ve been off on some other things and still need to get back to understanding how we might line up with something like the assignment-based intrinsic approach.

It may well be that the approach we took will require too much work up-front during optimization. We tried to account for this somewhat by structuring things such that changes are very localized (i.e. the DIFragment approach means one should only ever need to concern themselves with the “first level” of debug information based on an LLVM Value) but we have only actually implemented our approach for -O0 and wanted to avoid working in a silo for longer (specifically so other eyes can find these kinds of issues before too much work goes into a dead-end :slight_smile: ).

I think the core piece, which we are more confident has strong benefits, is the set of changes to the expression language, which eliminate ambiguity and workarounds during compilation, and provide a clean programmatic interface with room for playing with space and time tradeoffs. If LLVM can talk in terms of only locations up until lowering to a debug format, then a lot of room for bugs related to idiosyncrasies in DWARF just evaporates. It also gives us a clean way to carry pointer address-space information through type information, which is one of the core requirements we actually started this work with.

Again, sorry for the delays on my side, I want to get my focus back on addressing the concerns that have been raised and getting code up to review, but have been sidetracked for a bit.