[RFC] Allowing debug intrinsics to reference multiple SSA Values

Currently, the debug intrinsic functions each have 3 arguments: an SSA value representing either the address or Value of a local variable, a DILocalVariable, and a complex expression. If the SSA value is an Instruction, and that Instruction is at some point deleted, we attempt to salvage the SSA value by recreating the instruction within the complex expression. If the instruction cannot be replicated by a complex expression, then the variable location is dropped causing a reduction in coverage. One of the key restrictions on this process at the moment is that each intrinsic can only reference a single SSA value; a numeric constant may be represented within the expression itself, allowing for binary operators with a constant operand to be salvaged:

%c = add i32 %a, 4

llvm.dbg.value(metadata i32 %c, DILocalVariable(“x”), DIExpression())

; Salvage…

llvm.dbg.value(metadata i32 %a, DILocalVariable(“x”), DIExpression(DW_OP_constu, 4, DW_OP_plus))

This proposal is to allow multiple SSA value references within a debug intrinsic, allowing binary operators with non-constant operands to be salvaged. This is a long-awaited feature, with an open bugzilla[0] and support from members of the community[1][2]. To implement this change, each debug intrinsic will contain a list of SSA values instead of just one, and a new operator will be added: DW_OP_LLVM_register, which takes as its only argument the index of an SSA value within the intrinsic’s list, and pushes that SSA value onto the expression stack. Two proposed syntaxes for the list of SSA values - though suitable alternatives may be worth considering - are to either replace the first argument of the intrinsic function with an MDNode containing the SSA values as operands, or to remove the first argument and make the intrinsic function variadic, passing the SSA value list as vargs:

%c = add i32 %a, %b

llvm.dbg.value(metadata i32 %c, DILocalVariable(“x”), DIExpression())

; Salvage…

llvm.dbg.value(!{metadata i32 %a, metadata i32 %b}, DILocalVariable(“x”), DIExpression(DW_OP_LLVM_register, 0, DW_OP_LLVM_register, 1, DW_OP_plus))

; Alternatively, the intrinsic function could be made variadic…

llvm.dbg.value(DILocalVariable(“x”), DIExpression(DW_OP_LLVM_register, 0, DW_OP_LLVM_register, 1, DW_OP_plus), metadata i32 %a, metadata i32 %b)

The new operator DW_OP_LLVM_register would exist in the IR and MIR, and further down the pipeline be replaced by the appropriate operator for the target debug output. For example, when producing DWARF this would be replaced by DW_OP_regval_type, which pushes the contents of a given register interpreted as a value of a given type onto the DWARF expression stack.

This has the potential to allow salvaging in a much greater number of cases than is currently possible. There are also potential follow-up tasks, such as allowing the salvaging of conditional values, that would further improve debug variable availability. The following table gives, for several of the multi-source application projects in the test suite, the number of successful invocations of SalvageDebugInfo, and the number of failed salvages for each type of unsalvageable instruction:

Success Variadic Binops Variadic GEPs Cmp Insts Select Insts Load Insts Phi Nodes Alloca Insts Call Insts

ALAC 261 29 61 0 0 1 12 0 0

Burg 50 1 9 0 1 95 6 0 0

hbd 514 16 1 0 3 45 10 0 4

Lua 270 12 54 0 12 46 32 1 0

minisat 458 10 10 3 1 35 4 0 0

sgefa 439 1 121 0 20 14 55 0 0

SIBsim4 153 15 6 0 3 40 3 0 0

siod 112 2 1 0 0 11 5 2 1

SPASS 1241 70 27 27 27 2114 156 0 7

spiff 39 0 15 0 0 7 2 1 0

sqlite3 2322 94 167 6 37 1143 136 4 10

treecc 127 1 0 0 1 350 37 0 1

viterbi 7 0 1 0 1 1 0 0 0

Of these categories, the first 3 will become salvageable after this work is implemented, and Select Insts will also be salvageable with some follow-up work to enable conditional branching in complex expressions. The remainder are not salvageable in general, although it’s possible that the specific passes that delete those instructions may be able to preserve the debug info as long as the code isn’t totally dead.

[0] 39141 – Extend llvm.dbg.value to take more than one LLVM SSA value.

[1] ⚙ D51976 [DebugInfo][Dexter] Speculated BB presents illegal variable value to debugger

[2] [llvm-dev] DW_OP_implicit_pointer design/implementation in general

Great! This seems like a crucial feature for building expressions to represent induction variables, which I am told is an important long standing problem. From your proposed solutions, I like the idea of making dbg.value variadic.

(+usual debug info folks)

I’m mostly staying out of discussions around optimized debug info/variable locations - other folks have more state on this than I do. But some casual thoughts from the peanut gallery…

Currently, the debug intrinsic functions each have 3 arguments: an SSA value representing either the address or Value of a local variable, a DILocalVariable, and a complex expression. If the SSA value is an Instruction, and that Instruction is at some point deleted, we attempt to salvage the SSA value by recreating the instruction within the complex expression. If the instruction cannot be replicated by a complex expression, then the variable location is dropped causing a reduction in coverage. One of the key restrictions on this process at the moment is that each intrinsic can only reference a single SSA value; a numeric constant may be represented within the expression itself, allowing for binary operators with a constant operand to be salvaged:

%c = add i32 %a, 4

llvm.dbg.value(metadata i32 %c, DILocalVariable(“x”), DIExpression())

; Salvage…

llvm.dbg.value(metadata i32 %a, DILocalVariable(“x”), DIExpression(DW_OP_constu, 4, DW_OP_plus))

This proposal is to allow multiple SSA value references within a debug intrinsic, allowing binary operators with non-constant operands to be salvaged. This is a long-awaited feature, with an open bugzilla[0] and support from members of the community[1][2]. To implement this change, each debug intrinsic will contain a list of SSA values instead of just one, and a new operator will be added: DW_OP_LLVM_register, which takes as its only argument the index of an SSA value within the intrinsic’s list, and pushes that SSA value onto the expression stack.

What would it look like without this extension? If we modeled it as if all the register values were already on the stack (an extension of the current way where the singular value is modeled as being already on the stack, if I understand it correctly?)?

If it’s decided that the best approach is to introduce something like DW_OP_LLVM_register - might be worth migrating to that first (basically adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it from unary to n-ary support.

What would it look like without this extension? If we modeled it as if all the register values were already on the stack (an extension of the current way where the singular value is modeled as being already on the stack, if I understand it correctly?)?

If it’s decided that the best approach is to introduce something like DW_OP_LLVM_register - might be worth migrating to that first (basically adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it from unary to n-ary support

Putting the register values initially on the stack reduces the verbosity, though it could complicate successive salvages of variadic DIExpressions - if any value other than the last needs salvaging, then you have to use DWARF stack operations to move it to the top of the stack. For example, if the elements are pushed in order so that the last element is on the top of the stack:

%c = mul 3, %a

%d = add 5, %b

dbg.value(!DILocalVariable(“x”), !DIExpression(DW_OP_plus), %c, %d)

; Salvage %d

dbg.value(!DILocalVariable(“x”), !DIExpression(DW_OP_plus_constu, 5, DW_OP_plus), %c, %b)

; Salvage %c needs to use DW_OP_swap

dbg.value(!DILocalVariable(“x”), !DIExpression(DW_OP_plus_constu, 5, DW_OP_swap, DW_OP_constu, 3, DW_OP_mul, DW_OP_plus), %a, %b)

Or, written out as the stack state:

[%a, %b] ; Initial state

[%a, %b + 5] ;DW_OP_plus_constu, 5
[%b + 5, %a] ;DW_OP_swap
[%b + 5, %a, 3] ;DW_OP_constu, 3
[%b + 5, %a * 3] ; DW_OP_mul
[(%b + 5) + (%a * 3)] ; DW_OP_plus

The simplest and most general solution would be to use DW_OP_pick, which duplicates the stack element at a given index to the top of the stack. This is more or less the same as using DW_OP_LLVM_register - both of them take an index corresponding to a register value - with a few differences: we get to maintain the default concise dbg.value with a single argument and empty DIExpression, but salvaging becomes more brittle. If we rely on elements being on the stack in their declared order, then we need to guarantee that nothing modifies those stack elements. The cleanest implementation of this would be for SalvageDebugInfo to salvage expressions normally when the last register is salvaged, and switch to using DW_OP_pick for every register whenever any other register is salvaged:

%c = add %a, 5

%e = div %c, %d

; x = %b * ((%a + 5) / %d)

dbg.value(!DILocalVariable(“x”), !DIExpression(DW_OP_mul), %b, %e)
; Salvage %e; last operator so salvage normally

dbg.value(!DILocalVariable(“x”), !DIExpression(DW_OP_mul, DW_OP_div), %b, %c, %d)
; Salvage %c; not last operator and expression isn’t already using pick, so add DW_OP_pick for registers

dbg.value(!DILocalVariable(“x”), !DIExpression(DW_OP_pick, 0, DW_OP_pick, 1, DW_OP_plus_constu, 5, DW_OP_pick, 2, DW_OP_div, DW_OP_mul), %b, %a, %d)

The major advantage of putting all registers on the stack is that it reduces verbosity and also doesn’t require us to implement the new operator or update existing DIExpressions to contain it, which is useful. Personally I lean towards keeping things simple and consistent, and so using the pick/register operator in every expression rather than only the ones that need it, but there’s a good case for either I think.

Hi Stephen,

This makes sense to me.

The Salvage Debug Info functionality is missing the chance to salvage a lot of cases where the optimized-out instructions manipulate on multiple SSA values. Currently, it supports only expressions with a single SSA value and a constant.
This could have good impact on debug optimized code and debug location coverage.

DW_OP_pick should be something like DW_OP_LLVM_pick, but I think it is an implementation detail, so it might be too early for such comment.

Best,
Djordje

As the person who has advocated for DW_OP_LLVM_arg(N) before, my main motivation was to resolve the ambiguity of constant DIExpressions: As a worst-case example:

dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42))

Is this undefined, or constant 42?

But if we make dbg.value fully variadic with all parameters pushed to the stack ahead of time, we can distinguish between

dbg.value(!DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; a constant
dbg.value(i32 42, !DILocalVariable(x), DIExpression()) ; the same constant
dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; undef + garbage leftover expression

If we do it this way and allow 0-n SSA values then I'm in support of the push-first scheme.

As a side note, updating dbg.values in salvageDebugInfo when we allow multiple SSA values per dbg.value will be interesting, but possible. We'll need to put the value we need to update at the top of the stack, prepend the salvage expression and then restore the order of arguments on the stack that the rest of the expression expects.

-- adrian

This ambiguity could also be resolved without the “push everything” approach, though, right?

If the format supported zero operands - then the expression would be evaluated with an empty stack (rather than the implicit first element stack - since there’s no element). It’d make the “treat an operand list of length 1 specially, by pushing it onto the stack before evaluating the DWARF expression” look even more special/less tidy, but not totally outside the realm of possibility.

(I don’t have super strong feelings here - I like consistency, so I like the idea of pushing all the operands or using an extension op to access them all - but I can see the value in special casing the 1 operand case to maintain the behavior/syntax it has today)

As the person who has advocated for DW_OP_LLVM_arg(N) before, my main motivation was to resolve the ambiguity of constant DIExpressions: As a worst-case example:

dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42))

Is this undefined, or constant 42?

But if we make dbg.value fully variadic with all parameters pushed to the stack ahead of time, we can distinguish between

dbg.value(!DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; a constant
dbg.value(i32 42, !DILocalVariable(x), DIExpression()) ; the same constant dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; undef + garbage leftover expression

If we do it this way and allow 0-n SSA values then I'm in support of the push-first scheme.

This seems like as good a case as any to go with the push-first approach, since it solves a meaningful problem that the alternative does not.

As a side note, updating dbg.values in salvageDebugInfo when we allow multiple SSA values per dbg.value will be interesting, but possible. We'll need to put the value we need to update at the top of the stack, prepend the salvage expression and then restore the order of arguments on the stack that the rest of the expression expects.

The difficulty here is that in DWARF5, there is no operator that can move an argument up the stack; there are specific operators that rotate the top 2 or 3 elements, but outside of that the only thing you can do is copy an element to the top with DW_OP_pick. Of course we don't have to stick to actual DWARF operators by any means, but I don't think there's any way to eventually express an indexed swap/move operator in DWARF either: you can duplicate an element from anywhere on the stack to the top, but you can't delete the original element or move the top element back down again. Because of this, it is necessary (I believe) to use DW_OP_pick in the general case even if all the arguments are initially pushed onto the stack, due to the limits on how we can manipulate the stack. Of course there are many expressions that don't require DW_OP_pick to evaluate, but for consistency's sake I think it's best to use it instead of the swap or rotate operators whenever any of them is needed.

Regardless of whether we push everything immediately or not, having an operator that pushes the SSA value at a given argument index onto the stack seems necessary - if we push elements initially then DW_OP_pick (or DW_OP_LLVM_pick), otherwise DW_OP_LLVM_register. In either of these cases, salvaging is simple: we simply append the salvage expression after the salvaged value's operator. It also translates cleanly to DWARF, since we can directly replace the operator in question with DW_OP_regval_type, using whichever register holds the value we're looking for as an argument, without needing to modify or even consider the rest of the expression at all.

When we have <= 3 arguments (I don't want to call them registers because they may turn into constants or memory locations later on) we can use DW_OP_rot to move the one we need to adjust to the top and the DW_OP_rot the stack back into the original order. For 4-255 arguments, we need to add extra operations both to the beginning and the end of the expression, here an example for salvaging the 4th element from the top:

r0 r1 r2 r3 ; stack before
DW_OP_pick 0

r0 r1 r2 r3 r0 ; stack after pick 0 ...
<adjustments> ; the salvage operations for r0

r0 r1 r2 r3 r0'
DW_OP_pick 1

r0 r1 r2 r3 r0' r1
DW_OP_pick 2

r0 r1 r2 r3 r0' r1 r2
DW_OP_pick 3

r0 r1 r2 r3 r0' r1 r2 r3
<rest of expression>

r0 r1 r2 r3 result
DW_OP_swap

r0 r1 r2 result r3
DW_OP_drop

r0 r1 r2 result
DW_OP_swap

r0 r1 result r2
DW_OP_drop
DW_OP_rot
DW_OP_drop
DW_OP_drop

result

It's possible to do with just DWARF expressions, but the resulting expression may get a little long. We could also just say we support up to 3 arguments and that would probably be fine.

-- adrian

When we have <= 3 arguments (I don’t want to call them registers because they may turn into constants or memory locations later on) we can use DW_OP_rot to move the one we need to adjust to the top and the DW_OP_rot the stack back into the original order. For 4-255 arguments, we need to add extra operations both to the beginning and the end of the expression, here an example for salvaging the 4th element from the top:

That example is how I’d imagine DW_OP_pick/DW_OP_LLVM_register to be used. Personally I’m of the opinion that although we can use DW_OP_rot for cases with arguments <= 3, it would be better to use DW_OP_pick in those cases. One reason is consistency; the other is that as you mentioned, the expressions can get rather long, and this gets worse if we start by using DW_OP_rot and then switch to DW_OP_pick. For example if we have (assuming the first arg is on top of the stack):

%c = add %e, %f

%b = div %d, 7

dv(!DIExpression(DW_OP_plus, DW_OP_mul), %a, %b, %c) ; Short and sweet

dv(!DIExpression(DW_OP_pick, 2, DW_OP_pick, 1, DW_OP_pick, 0, DW_OP_plus, DW_OP_mul), %a, %b, %c) ; Verbose

; Salvage %b

dv(!DIExpression(DW_OP_rot, DW_OP_constu, 7, DW_OP_div, DW_OP_rot, DW_OP_rot, DW_OP_plus, DW_OP_mul), %a, %d, %c) ; Suddenly a lot longer

dv(!DIExpression(DW_OP_pick, 2, DW_OP_pick, 1, DW_OP_constu, 7, DW_OP_div, DW_OP_pick, 0, DW_OP_plus, DW_OP_mul), %a, %d, %c) ; Still longer, but not by as much

; Salvage %c

dv(!DIExpression(DW_OP_pick, 2, DW_OP_pick, 3, DW_OP_plus, DW_OP_pick, 1, DW_OP_pick, 0, DW_OP_rot, DW_OP_constu, 7, DW_OP_div, DW_OP_rot, DW_OP_rot, DW_OP_plus, DW_OP_mul), %a, %d, %e, %f) ; Sudden conversion to use DW_OP_pick; now necessary to pick every argument, while still using DW_OP_rot

dv(!DIExpression(DW_OP_pick, 2, DW_OP_pick, 3, DW_OP_plus, DW_OP_pick, 1, DW_OP_constu, 7, DW_OP_div, DW_OP_pick, 0, DW_OP_plus, DW_OP_mul), %a, %d, %e, %f) ; Now shorter, due to not containing DW_OP_rot

We could avoid this sudden growth in expression length by having salvage debug info attempt to remove the use of DW_OP_rot, but this would make salvaging more complicated as it would need to cover the cases where we:

  1. Salvage a simple expression that doesn’t need rotating or picking: prepend salvage instructions.

  2. Salvage a simple expression that now needs rotation: insert 3x DW_OP_rot, insert salvage instructions at some point in the middle.

  3. Salvage an expression with rotation: determine where salvage instructions need to go and place them.

  4. Salvage an expression with rotation that now needs picking: insert DW_OP_pick for each arg, move salvages from the middle of DW_OP_rot to DW_OP_pick, remove DW_OP_rot.

  5. Salvage an expression with picking: append salvage instructions to the relevant DW_OP_pick operator.

I think the best solution is the one where we keep the “always on the stack” behaviour, so that we can avoid using DW_OP_pick where it doesn’t do anything useful (as in the first case in the example), while keeping salvageDebugInfo simple.