[RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR

Hi all,

The debug entry values feature introduces new DWARF symbols (tags, attributes, operations) on the caller (call site) side as well as on the callee side; the intention is to improve the debugging experience, especially in “optimized” code, by turning “<optimized_out>” values into real values. The call site information includes info about the call itself (described with DW_TAG_call_site) with corresponding children representing the function arguments at the call site (described with DW_TAG_call_site_parameter). The most interesting DWARF attribute for us here is DW_AT_call_value, which contains a DWARF expression representing the value of the parameter at the time of the call. For the context of this RFC, the more relevant part of the feature is the callee side, which uses the new DWARF operation DW_OP_entry_value to indicate that in some situations we can use a parameter’s entry value as its real value in the current frame. It relies on the call-site info being provided: the more DW_AT_call_value attributes we generate, the more debug locations using DW_OP_entry_value will be turned into real values.
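To make the two sides concrete, here is an informal sketch of the DWARF involved (llvm-dwarfdump-like notation; the register choices are purely illustrative):

    caller side, at a call such as callee(x):
      DW_TAG_call_site
        DW_TAG_call_site_parameter
          DW_AT_location    (DW_OP_reg5 RDI)      <- where the argument is passed
          DW_AT_call_value  (DW_OP_breg3 RBX+0)   <- how to recompute its value in the caller

    callee side, location list entry for the parameter once its register has
    been clobbered (but the parameter itself was never modified):
      DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value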

Current implementation in LLVM

Currently in LLVM, we generate DW_OP_entry_value only for unmodified parameters, during the LiveDebugValues pass, for the places where code generation truncated the parameters’ live ranges. The potential of the functionality goes beyond this: we should be able to use entry values even for modified parameters, as long as the modification can be expressed in terms of the entry value. In addition, there are cases where we can express the values of local variables in terms of some parameter’s entry value (e.g. int local = param + 2;).
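For reference, what LiveDebugValues emits today looks roughly like the following in MIR (a sketch; the register and metadata numbers are illustrative):

    ; the parameter arrived in $edi, which has since been clobbered;
    ; !13 is the parameter's DILocalVariable
    DBG_VALUE $edi, $noreg, !13, !DIExpression(DW_OP_LLVM_entry_value, 1), debug-location !19

which is lowered to a DWARF location roughly of the form DW_OP_entry_value(DW_OP_reg5 RDI), DW_OP_stack_value.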

Proposal

This RFC is intended to start a discussion about using DW_OP_entry_value not only at the end of the LLVM pipeline (within LiveDebugValues). There are cases where it could be useful at the IR level, e.g. for unused arguments (please take a look at https://reviews.llvm.org/D85012); I believe there are a lot of cases where an IR pass drops a variable’s debug value and an entry value could serve as a backup location. There are multiple possible implementations, but in general we need to extend the metadata describing the debug value to also support/refer to an entry (backup) value, so that when the primary location is lost, the value with DW_OP_entry_value becomes the primary one. One way could be to extend llvm.dbg.value with an additional operand as follows:

llvm.dbg.value(…, DIEntryValExpression(DW_OP_uconst, 5)) // DIEntryValExpression implicitly contains DW_OP_entry_value operation

The bottom line is that the production of the call-site side of the feature stays the same, but LLVM will have more freedom to generate more DW_OP_entry_value operations on the callee side.
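For example (a hypothetical sketch using the syntax above; metadata nodes elided), for an argument whose normal location has been dropped, the entry-value dbg.value would be all that is left to describe the variable:

    ; primary location was lost (e.g. the argument became unused), so it is undef:
    call void @llvm.dbg.value(metadata i32 undef, metadata !13, metadata !DIExpression()), !dbg !15
    ; backup location: the value the parameter had on entry to the function
    call void @llvm.dbg.value(metadata i32 %param, metadata !13, metadata !DIEntryValExpression()), !dbg !15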

Any thoughts on this?

Best regards,

Djordje

(+ a few other folks from Google interested in improved variable location info for optimized code)

I don’t have much context for the variable location part of LLVM’s DWARF handling - I’ve mostly been leaving that to other folks, so take anything I say here with a grain of salt.

My thinking would be that dbg.values for variable locations, and dbg.values for “backup” entry_value locations, would be basically separate - as though there were two variables. How this would be reflected in the IR, I’m not sure - maybe it’s similar to what you’re suggesting (perhaps you could show a more fleshed-out example? even for a simple function “void f1(int i) { f2(); f3(i); f2(); }” or something). I guess I would’ve imagined a way for the dbg.value to include an extra bit saying “I’m an entry value expression” - oh, but I see, there’s no IR support today for having an entry value in the expression? Fair enough - either using just DW_OP_entry_value (with its counted sub-expression being the function parameter), or some DW_OP_LLVM_* operation with more suitable semantics, sounds OK to me. But having a top-level bit on the dbg.value saying “this is a backup/entry_value based location” is probably useful too - mostly ignored by optimizations, which would apply all the same transformations to it to create new locations from old ones, etc.

I guess it would mean a frontend or early pass would create two locations for every parameter - a direct one and a backup/entry_value based one? (Though the latter would be tricky to do up-front, since frontends don’t do the dataflow analysis - they just create an alloca and let it be read/written as needed - so maybe entry_value based locations would have to be created on the fly more often, somehow.)

Hi!

I just want to add that I think it would be neat if the entry values could map into
multi-location dbg.values and DBG_VALUEs that are being proposed on this list.

For example, if we have:

  int local = param1 + param2 + 123;

I think it would be good if we were able to represent the four different
permutations of the values of the parameters being available in the function or
as entry values.
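Written out informally (just to illustrate what I mean, not a proposal for how
the IR or MIR would spell it), the four alternatives for 'local' would be
something like:

    local = [param1]            + [param2]            + 123
    local = [param1]            + entry_value(param2) + 123
    local = entry_value(param1) + [param2]            + 123
    local = entry_value(param1) + entry_value(param2) + 123

where [paramN] stands for wherever that parameter currently lives in the
function, and entry_value(paramN) for its DW_OP_entry_value based location.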

I have not yet delved into the discussion about the multi-location debug values,
so I don't have any proposals for how that could look.

Best regards,
David

Hi David,

Thanks for your feedback.

My thinking would be that dbg.values for variable locations, and dbg.values for “backup” entry_value locations, would be basically separate - as though there were two variables. How this would be reflected in the IR, I’m not sure - maybe it’s similar to what you’re suggesting (perhaps you could show a more fleshed-out example? even for a simple function “void f1(int i) { f2(); f3(i); f2(); }” or something). I guess I would’ve imagined a way for the dbg.value to include an extra bit saying “I’m an entry value expression” - oh, but I see, there’s no IR support today for having an entry value in the expression? Fair enough - either using just DW_OP_entry_value (with its counted sub-expression being the function parameter), or some DW_OP_LLVM_* operation with more suitable semantics, sounds OK to me. But having a top-level bit on the dbg.value saying “this is a backup/entry_value based location” is probably useful too - mostly ignored by optimizations, which would apply all the same transformations to it to create new locations from old ones, etc.

We have an LLVM-internal operation (DW_OP_LLVM_entry_value), but I think we might need something different/more complex (e.g. a flag that indicates it is an entry-value/backup location, since it needs to coexist with the real value). An alternative could be a separate intrinsic, llvm.dbg.entry_val(), but I think we all want to avoid extra intrinsics if possible.

I guess it would mean a frontend or early pass would create two locations for every parameter - a direct one and a backup/entry_value based one? (Though the latter would be tricky to do up-front, since frontends don’t do the dataflow analysis - they just create an alloca and let it be read/written as needed - so maybe entry_value based locations would have to be created on the fly more often, somehow.)

I guess we’d need something like that, but the “on-the-fly” model would be more acceptable. Or a separate IR pass for that purpose, but that would introduce some extra overhead…
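For your f1 example, the two locations per parameter might look roughly like this (a sketch that reuses the internal DW_OP_LLVM_entry_value operation at the IR level; metadata elided and the syntax is not final):

    define void @f1(i32 %i) !dbg !7 {
    entry:
      ; direct location for "i"
      call void @llvm.dbg.value(metadata i32 %i, metadata !13, metadata !DIExpression()), !dbg !16
      ; backup location for "i", expressed in terms of its entry value
      call void @llvm.dbg.value(metadata i32 %i, metadata !13, metadata !DIExpression(DW_OP_LLVM_entry_value, 1)), !dbg !16
      call void @f2(), !dbg !16
      call void @f3(i32 %i), !dbg !16
      call void @f2(), !dbg !16
      ret void, !dbg !16
    }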

Best regards,
Djordje

Hi David,

Thanks for your comments!

I just want to add that I think it would be neat if the entry values could map into
multi-location dbg.values and DBG_VALUEs that are being proposed on this list.

For example, if we have:

int local = param1 + param2 + 123;

I think it would be good if we were able to represent the four different
permutations of the values of the parameters being available in the function or
as entry values.

I have not yet delved into the discussion about the multi-location debug values,
so I don’t have any proposals for how that could look.

I guess it can (somehow) be mapped onto that. It is clear to me that using DBG_VALUE_LIST will be appropriate for “Salvage Debug Info”, but the idea of using entry values at the IR level is more general (not very localized), and that is the source of the potential complexity, since we need to carry that info throughout the IR and use it as a backup.

Best regards,
Djordje

Hi,

If we consider a simple test case:

void f1(int);
void f2(int i) {
  f1(1);
  i = i + 5;
  f1(3);
}

we can see the benefits we get after enabling DW_OP_entry_value at the IR level. Please take a look at https://reviews.llvm.org/D87233 for more details.
At a very high level, we introduce a utility within Transforms/Utils/Local.cpp that inspects the IR and tries to find an entry value for a DIVariable. By introducing such a utility, we don’t need to carry/introduce any additional info in llvm.dbg.value or DIVariable.
The utility can be used at various places (similar to salvageDebugInfo()) during the middle-end phase.
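To illustrate, here is a rough sketch of the kind of IR involved (metadata elided; not necessarily the exact output of D87233):

    define void @f2(i32 %i) !dbg !7 {
    entry:
      call void @llvm.dbg.value(metadata i32 %i, metadata !13, metadata !DIExpression()), !dbg !16
      call void @f1(i32 1), !dbg !16
      %add = add nsw i32 %i, 5, !dbg !16
      call void @llvm.dbg.value(metadata i32 %add, metadata !13, metadata !DIExpression()), !dbg !16
      call void @f1(i32 3), !dbg !16
      ret void, !dbg !16
    }

If %add (or the location of %i) is later lost, the utility can walk back to the parameter’s entry value and describe “i” with a backup expression along the lines of !DIExpression(DW_OP_LLVM_entry_value, 1, DW_OP_plus_uconst, 5, DW_OP_stack_value).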

Besides this, https://reviews.llvm.org/D85012 is another usage of DW_OP_entry_value outside of LiveDebugValues.

Best,
Djordje

Hi Djordje,

[Late reply as I was away, alas],

For the example in https://reviews.llvm.org/D85012 , I'm not sure that
just using an entry value is correct. The reason why the dbg.values
for arguments are set to undef is not because the value can't be
described, it's because deadargelim changes all the call sites to pass
in 'undef', which I believe makes the value unrecoverable even with
entry values. Can call sites like that be described by DWARF? If so, I
guess a combination of entry-value variable locations and salvaging
the call-site arguments would work.

The isel example in https://reviews.llvm.org/D87233 is good --
although dropping the variable location there is largely because of
SelectionDAG's own limitations rather than the generated code. We can't
describe arguments that aren't used because they're not copied to a
virtual register.

In general, I reckon entry values will be a key part of recovering
variable locations, however I'm not sure that there are many places
where it's truly necessary in LLVM-IR. deadargelim is definitely one
of them; but even then, the Argument remains in the function
signature. We're still able to refer to the Argument in dbg.values, in
program positions where the argument will be available, and where it
will be unavailable. In my opinion, entry values are probably most
useful when things go out of liveness; however we don't know what's in
or out of liveness until the register allocator runs.

As you say, we could speculatively produce entry value expressions as
a "backup". I reckon this will work fine, although there'll be some
unnecessary work involved due to it being speculative. I kind of have
a different vision for how this could be done though, and not just for
entry values. It hinges on the instruction referencing variable
location stuff I've been working on: I believe we can use that to
connect variable values at the end of compilation back to LLVM-IR
Values. If that's achievable, we'll have:
* Information about the set of LLVM-IR Values that are live, and what
physical registers they're in,
* The IR itself, which contains the target-independent relationships
between Values,
After which we will have enough information at the end of compilation
to directly answer the question "How can we describe this variable
value using only Values that are live?", effectively doing very late
salvaging. We can consider entry values to be always available, and
should be able to recover any value that _can_ be described in terms
of entry values.

That's assuming that the instruction referencing stuff works out. I
can elaborate and give some examples if wanted. My overarching feeling
is that it'd be great to avoid doing extra work during compilation,
instead leaving things until the end when there's more information
available.

Hi Jeremy,

Thanks a lot for your feedback.

For the example in https://reviews.llvm.org/D85012 , I’m not sure that
just using an entry value is correct. The reason why the dbg.values
for arguments are set to undef is not because the value can’t be
described, it’s because deadargelim changes all the call sites to pass
in ‘undef’, which I believe makes the value unrecoverable even with
entry values. Can call sites like that be described by DWARF? If so, I
guess a combination of entry-value variable locations and salvaging
the call-site arguments would work.

Using entry values (the ‘callee’ side of the feature) is never enough on its own. It is always connected to the call-site-parameter debug info (the function arguments at the call site, which we call call-site-params; the ‘caller’ side of the feature). I believe that the call-site-params for the cases we face within deadargelim could still be expressed in terms of DWARF. GCC does produce correct output for both the caller and callee sides for unused params.

As you say, we could speculatively produce entry value expressions as
a “backup”. I reckon this will work fine, although there’ll be some
unnecessary work involved due to it being speculative. I kind of have
a different vision for how this could be done though, and not just for
entry values. It hinges on the instruction referencing variable
location stuff I’ve been working on: I believe we can use that to
connect variable values at the end of compilation back to LLVM-IR
Values. If that’s achievable, we’ll have:

  • Information about the set of LLVM-IR Values that are live, and what
    physical registers they’re in,
  • The IR itself, which contains the target-independent relationships
    between Values.

After which we will have enough information at the end of compilation
to directly answer the question “How can we describe this variable
value using only Values that are live?”, effectively doing very late
salvaging. We can consider entry values to be always available, and
should be able to recover any value that can be described in terms
of entry values.

That’s assuming that the instruction referencing stuff works out. I
can elaborate and give some examples if wanted.

Please share that work when you are ready.

My overarching feeling is that it’d be great to avoid doing extra work
during compilation, instead leaving things until the end when there’s
more information available.

I agree with the statement (it’d be better if possible).

Best,
Djordje

Hi Djordje,

Using entry values (the 'callee' side of the feature) is never enough on its own. It is always connected to the call-site-parameter debug info (the function arguments at the call site, which we call call-site-params; the 'caller' side of the feature). I believe that the call-site-params for the cases we face within deadargelim could still be expressed in terms of DWARF. GCC does produce correct output for both the caller and callee sides for unused params.

Ah, that covers my concerns. This is definitely a worthy cause then --
especially as parameters are usually considered more important to
preserve than other variables.

Djordje

Please share that work when you are ready.

Sure, explanation below: note that I'm bringing this up now because I
see producing entry-value "backup" locations as a technique to recover
from the register allocator clobbering things, and I feel the below is
a more general solution.

I'd like to use this (contrived) code as an illustrative example:

    void ext(long);
    void foo(long *ptr, long bar, long baz) {
      for (long i = 0; i < bar; ++i) {
        long index = baz + i;
        long *curptr = &ptr[index];
        ext(*curptr);
      }
    }

All it does is iterate over a loop, loading values from an offset into
a pointer. I've compiled this at -O2, and then given it an additional
run of -loop-reduce with opt [0]. During optimisation, LLVM rightly
identifies that the 'baz' offset is loop-invariant, and that it can
fold some of the offset calculation into the loop preheader. This then
leads to both 'ptr' and 'baz' being out of liveness, and being
clobbered in the body of the loop. In addition, the 'index' variable
is optimised out too, and that's the variable I'd like to focus on.

Today, we're not able to describe 'index' in the IR after
-loop-reduce, but I'm confident that the variadic variable locations
work will make that possible. I'm going to assume that we can describe
such locations for the rest of this email.

"index" could be described by using the entry value of 'baz' and
adding it to 'i', which remains in liveness throughout. To produce a
"backup" location though, we would have to guess that 'baz' would go
out of liveness in advance, and speculatively produce the expression.
I reckon that we can instead calculate the location at end of
compilation by using the SSA-like information from instruction
referencing. Here's the MIR for the reduced loop body, using
instruction-referencing [1] and lightly edited to remove noise, with
only variable locations for the 'i' variable. I've added some
explanatory comments:

    DBG_PHI $rbx, 2
    DBG_INSTR_REF 2, 0, !16, !DIExpression(), debug-location !23
    ; This is the load from *curptr:
    renamable $rdi = MOV64rm renamable $r15, 8, renamable $rbx
    ; Call to ext,
    CALL64pcrel32 @ext, csr_64, [implicit defs]
    ; Loop increment:
    renamable $rbx = nuw nsw ADD64ri8 killed renamable $rbx, 1, debug-instr-number 1
    DBG_INSTR_REF 1, 0, !16, !DIExpression(), debug-location !23
    CMP64rr renamable $r14, renamable $rbx, implicit-def $eflags
    JCC_1 %bb.2, 5, implicit $eflags

The label "debug-instr-number 1" on the ADD64ri8 identifies the ADD as
corresponding to the loop increment, and the DBG_PHI for $rbx as the
position where the loop PHI occurs. My key observation is that there
is a one-to-one relationship between LLVM-IR Values and these
end-of-compilation instruction numbers [2]. If we stored a mapping
during instruction selection of Value <=> instruction reference, at
the end of compilation we would be able to salvage variable locations
that had gone out of liveness.
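For the loop above, the recorded mapping might conceptually look like this (purely illustrative):

    LLVM-IR Value                    <=>  instruction reference
    %inc = add nuw nsw i64 %i, 1     <=>  {1, 0}   (the ADD64ri8, debug-instr-number 1)
    %i   = phi i64 ...               <=>  {2, 0}   (the DBG_PHI for $rbx)
    i64 %baz (function Argument)     <=>  {3, 0}   (used below)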

Imagine for a moment that we described the "index" variable as a
variadic variable location, possibly looking like this:

    DBG_INSTR_REF {3, 0}, {2, 0}, !17, !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus)

Where the {3, 0} instruction number referred to the 'baz' argument,
and {2, 0} the value of 'i' on entry to the loop body. The workflow
for salvaging would look something like this, after LiveDebugValues
has finished doing dataflow things:
  1) Examine instruction reference {3, 0},
  2) Observe that it's out of liveness in the current location (the loop body),
  3) Look up the LLVM-IR Value that {3, 0} corresponds to, finding the Argument in LLVM-IR,
  4) Because it's an Argument, replace DW_OP_LLVM_arg, 0 with the corresponding entry value expression,
  5) Emit variable location.
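For the 'index' example, step 5 might then emit a location along these lines
(assuming 'baz' arrived in $rdx and the value of 'i' is still in $rbx; the
register choices are purely illustrative):

    DW_OP_entry_value(DW_OP_reg1 RDX)   ; baz's value on entry to foo
    DW_OP_breg3 RBX+0                   ; the current value of 'i'
    DW_OP_plus, DW_OP_stack_value       ; index = baz + i, as an implicit value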

This is harder than just speculating how we might salvage the location
earlier in compilation, but is more direct, and involves no
unnecessary work. Additionally, it's not limited to entry values: for
any value that goes out of liveness that was computed by a side-effect
free instruction, we could:
  4) For each operand of the corresponding LLVM-IR Instruction,
  4.1) Identify the instruction number of this operand,
  4.2) Confirm that that number is still in liveness (if not: abort),
  5) Compute an expression that recomputes the Value using the
locations of the operands,
  6) Emit variable location.

We could even go the other way and recover a value from other
computations that used the value (if an inverse operation exists).
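As a small hypothetical example of that inverse direction: if the IR had

    %b = add i64 %a, 8

and %a has gone out of liveness while %b is still in a register, a variable
referring to %a could be recovered with an expression such as
DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_stack_value) evaluated over
%b's location.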

Hi Jeremy,

Thanks for proposing that.
First of all, I think that all the dbg-instr-ref work can give us a lot of benefits when implemented/committed (since handling of the llvm.dbg.value() intrinsic is indeed easier), so thanks for that.
I really like the idea you have described, and I have a couple of questions/comments:

  1. The entry-values-as-“backups” approach could be used this way, but we first need to have DBG_INSTR_REF in use. I don’t see an overlap with the way I suggested for the current “non-ref-dbg-values” at the MIR level. (?)
  2. This will be an improvement to DBG_INSTR_REF, since it needs the variadic form of the instruction, and we don’t have that at the moment? As you have pointed out, by having the variadic form of the instruction we could salvage “non-entry” values as well, I guess.

Best regards,
Djordje

Hi Djordje,

The entry-values-as-"backups" approach could be used this way, but we first need to have DBG_INSTR_REF in use. I don't see an overlap with the way I suggested for the current "non-ref-dbg-values" at the MIR level. (?)

Indeed, there's no overlap between these two ideas -- yours is
producing expressions early and consuming late, while mine is both
producing and consuming late. Hence I wanted to get the idea out for
discussion before either are really pursued.

This will be an improvement to DBG_INSTR_REF, since it needs the variadic form of the instruction, and we don't have that at the moment? As you have pointed out, by having the variadic form of the instruction we could salvage "non-entry" values as well, I guess.

Indeed, although I think it'll be a bit easier than Stephen's
DBG_VALUE_LIST implementation, as the locations won't need maintenance
through the rest of CodeGen -- we would only need to generate
DBG_INSTR_REFs with multiple operands, then consider them at the end
of compilation.

A note on timescales, I don't see any of the instruction-referencing
work as likely to be "on by default" any time soon. It'll need some
comprehensive testing on large binaries, plus GlobalISel and aarch64
support, which I haven't thought about so far. It's worth pointing out
that producing backup entry values is something that could be
implemented and work almost immediately, and deliver benefits in the
next release, whereas the late-salvaging way definitely has a long
horizon.