[RFC][DebugInfo] Proposed changes to the textual IR representation for debug values

StephenTozer · September 15, 2023, 9:49pm

tl;dr following from previous discussions about replacing debug intrinsics with non-instruction equivalents, we are seeking input on what this should look like in IR.

Background

We (Sony) are currently working on a prototype that changes the representation of debug values to no longer use instructions. Previous discussions have been raised on Discourse[0][1] and at LLVM conferences[2] on this topic; this RFC discusses the textual IR design for this model. To briefly summarize our goals in removing debug intrinsics, we are aiming to improve performance by reducing the number of instructions, reduce the amount of pointer chasing when searching for non-debug instructions, and reducing the surface area for -g to affect CodeGen. Since removing debug intrinsics involves making a substantial change to the developer-facing output of the compiler, I expect there to be many concerns regarding the technical limitations of the design, and many further concerns about its legibility, ease-of-editing, and other ergonomic issues; the design presented here is not final, and is intended to start a discussion about what this new debug info should look like, establish what different developers/vendors want from a new design, and hopefully reach a general consensus on the next steps. So without further ado:

Design

Here is an example IR function in current LLVM:

define dso_local i32 @f(i32 %a) !dbg !7 {
entry:
  call void @llvm.dbg.value(metadata i32 %a, metadata !10, metadata !DIExpression()), !dbg !20
  %b = alloca i32, !dbg !20, !DIAssignID !30
  call void @llvm.dbg.declare(metadata i32 %b, metadata !11, metadata !DIExpression()), !dbg !21
  %add = add i32 %a, 5, !dbg !21
  call void @llvm.dbg.value(metadata i32 %add, metadata !10, metadata !DIExpression()), !dbg !20
  call void @llvm.dbg.assign(metadata i32 %add, metadata !12, metadata !DIExpression(), metadata !30, metadata ptr %b, metadata !DIExpression()), !dbg !22
  store i32 %add, ptr %b, !dbg !22
  ret i32 %add, !dbg !23

}

And here is the same function with our proposed change:

define dso_local i32 @f(i32 %a) !dbg !7 {
entry:
    #dbgrecord value { i32 %a, !10, !DIExpression(), !20 }
  %b = alloca i32, !dbg !20, !DIAssignID !30
    #dbgrecord declare { i32 %b, !11, !DIExpression(), !21 }
  %add = add i32 %a, 5, !dbg !21
    #dbgrecord value { i32 %add, !10, !DIExpression(), !20 }
    #dbgrecord assign { i32 %add, !12, !DIExpression(), !30, ptr %b, !DIExpression(), !22 }
  store i32 %add, ptr %b, !dbg !22
  ret i32 %add, !dbg !23

}

In this design, debug variable intrinsics are replaced by debug records, which have a different syntax but identical meaning. Debug records are printed in the position at which they become live, as with current intrinsics, but with extra indentation and a different syntax to clearly distinguish them from instructions: each begins with #dbgrecord, and then a type corresponding to the equivalent debug variable intrinsic, e.g. “value” for “llvm.dbg.value”. The contents of the curly braces are the same as the arguments to a debug intrinsic, except for the absence of the metadata prefix and including as an extra argument the DILocation metadata that would normally be attached to the intrinsic. Besides those differences, nothing else fundamentally changes about how to read, write, or edit debug info from the existing design.

Design Goals

One of the main aims of this design is to make sure that debug records don’t look like instructions, so that the IR faithfully represents the API to minimize surprising behaviours. For example, if a developer is debugging an opt pass and they print the IR, it should be obvious to them that if they have a pointer to %inst1, calling next() on that pointer should yield a pointer to %inst2:

%inst1 = op1 %a, %b
  #dbgrecord value { %inst1, !10, !DIExpression(), !11 }
%inst2 = op2 %inst1, %c

If instead we used a syntax comparable to an instruction, or even just converted debug records back to instructions for IR, this becomes less clear:

%inst1 = op1 %a, %b
llvm.dbg.value(%inst1, !10, !DIExpression(), !11)
%inst2 = op2 %inst1, %c

In summary, debug records are not instructions, and so it should be made apparent even to developers unfamiliar with IR that they are a distinct language element. Aside from that change, this format retains most of the benefits of the current model, in that it is easy to read debug info, to see when a debug record becomes live, to manually edit, move, add, or delete debug values, and as an additional benefit the indentation should make it easier to skim over debug info when it isn’t of interest to the developer or to skim over regular instructions in the opposite case.

There is one disadvantage to this model however, which is that it does not make obvious the API relationship between instructions and debug values: in the instruction API, debug values are a child of the following instruction (i.e. the instruction at which its live range begins); in this representation, debug values only appear between instructions, so the relationship is unclear or actually misleading since the indent would normally imply that the debug value is a child of the instruction above it. Moving debug values outside of the instruction list altogether would resolve this issue, but would complicate human reading and editing of IR too much; good API design and documentation should be enough to clear up any developer confusion.

Next Steps

The prototype implementation for removing debug intrinsics is still in progress, but some patches that move us towards this goal have already landed, and more will be submitted with time. The textual IR representation for non-instruction debug values is likely the most visible developer-facing change and so demands a healthy amount of input before finalizing a design. All suggestions, comments, and critiques are welcome and desired, but two questions in particular that would be useful to know are: Is this design clear and legible to you? And also, does the design have any apparent limitations (whether present in current LLVM or not) that would interfere with the way you use LLVM?

A patch for LLVM to test out this change will be available soon.

[0] Prototyping a not-an-instruction dbg.value
[1] [RFC] Instruction API changes needed to eliminate debug intrinsics from IR
[2] Debug-info round table notes

jryans · September 18, 2023, 2:09pm

Overall, I think this seems like a good way to represent the new structure in textual IR.

The use of distinct indentation is a nice way to visually isolate debug info.

As a small optimisation, I think you could go for the shorter #dbg prefix instead of #dbgrecord. The word “record” here feels a bit redundant. You already allow for a second parameter that defines the “type” of each debug thing (currently value, declare, assign), and this should presumably be enough to distinguish future types of debug things we might cook up down the road.

rnk · September 20, 2023, 8:20pm

Thanks for working on this! I’m basically very supportive.

I like the indentation as a way to interleave non-instruction data into the instruction stream.

I would prefer #dbg over #dbgrecord. It is close to @llvm.dbg.value → #dbg value in the end.

If we feel that #dbg is ambiguous with !dbg source location data on instructions, I’d actually suggest that we consider replacing !dbg with loc(), as a step towards MLIR convergence. Source locations in LLVM are after all referred to with DebugLoc.

By keeping the IR close to the existing rep, I wonder if we will be able to textually migrate (with sed) most of the LLVM test suite.

I considered the issue of trailing dbg.values, but I guess they don’t exist because we don’t allow these intrinsics after terminators.

StephenTozer · January 24, 2024, 12:29pm

Updating this thread to let it be known that the first patch for supporting this new IR has been opened for review, with the remainder to follow shortly. We’ve made a minor change to the format as suggested, changing #dbgrecord value -> #dbg_value, which reads better and avoids keyword clash - thanks for the feedback!

StephenTozer · January 29, 2024, 2:47pm

And furthermore, the patch for the parsing is open as well, with a documentation update soon to follow; this and the printing patch will be the two patches that define the format, so now’s the best time for any further input into the design!

StephenTozer · February 5, 2024, 3:48pm

A further update to the design, changing the use of curly braces to parentheses, meaning the design is now:

Old:   call void @llvm.dbg.value(metadata i32 %a, metadata !12, metadata !DIExpression()), !dbg !14
New:     #dbg_value(i32 %a, !12, !DIExpression(), !14)

StephenTozer · February 29, 2024, 2:30pm

The patches for the RemoveDIs textual IR representation are in the process of landing, but as part of the verifier changes, some differences between debug intrinsics and debug records have been raised for handling invalid code; specifically, some inputs that are currently verifier failures will become parser failures. The difference between parser and verifier errors are important primarily because a verifier error can be recovered from by dropping debug info for the module, while a parser error is always unrecoverable.

The points of failure for debug intrinsics and records can be broadly categorized as:

Syntax
Arity
Types
Semantics

Debug intrinsics are generally quite permissive in the parsing phase, willing to accept any number of parameters of any valid type, and the verifier will later enforce the correctness of the number, type, and semantic expectations of those parameters; only bad syntax is a parser error.

Debug records are stricter; the parser has strict expectations for the parameter count of each type of record, and all parameters must be MDNodes, with the exception of value/address parameters which can be any type of metadata. Initially these checks were even stricter, as DIExpressions were forced to be inline (since they never need to be forward-referenced), but this requirement has been relaxed. All other semantic errors, including ensuring that the parameters are the correct type of MDNode/metadata, are still verifier errors.

This means that there is existing IR that currently parses correctly and will have debug info dropped by the verifier, that if directly converted to the new debug info format will fail in the parsing stage. I don’t believe this should be an issue - we continue to support parsing old IR, and no compiler that produces RemoveDI-IR will produce output that fails in the parser; the only point of friction is in automatically-updated tests that intentionally contain invalid debug info; as the intent of these is likely to specifically test verifier behaviour, it should be acceptable to either drop these tests or change them to expect a parser error.

Hopefully this makes the changes clear - if anyone has any thoughts or complaints about this change then I’d appreciate hearing them!

Topic		Replies	Views
[RFC] Instruction API changes needed to eliminate debug intrinsics from IR IR & Optimizations debuginfo	20	2959	March 5, 2024
[RFC] Less memory and greater maintainability for debug info IR LLVM Dev List Archives	35	108	October 20, 2014
Proposal for multi location debug info support in LLVM IR LLVM Dev List Archives debuginfo	9	132	January 6, 2016
Call for testing -- non-instruction debug-info LLVM Project debuginfo	6	430	December 16, 2023
Request for feedback on debug info issues IR & Optimizations debuginfo	4	644	April 10, 2023

[RFC][DebugInfo] Proposed changes to the textual IR representation for debug values

Background

Design

Design Goals

Next Steps

Related Topics