Proposal for multi location debug info support in LLVM IR

Hi folks,

I’ve discussed my desire to have multi-location debug info support in LLVM IR with some of you before (and others - apologies if I forgot to CC you). I finally found some time to write down my thoughts about how this should work at the IR level and worked out the proposal below. Please let me know if a) it also does what you want out of it, b) it seems like a sane way to encode things in the IR, and c) you see any obstacles in implementation.
This is a very rough draft and I expect there will be several iterations, so I’ll try to keep https://gist.github.com/Keno/480b8057df1b7c63c321 updated with the current iteration. Have at it:

Thanks for proposing this! I have a couple of inline questions to help me better understand the proposal:

===============================================================

# What is it? / Why do we want this?

At any given time, the value of a source variable may be available in more than
one place. The most common case is that the variable is available in memory as
well as loaded in a register, but especially in higher-level languages, where the
notion of a variable is more disconnected from the physical realities, there can
also be situations where you can find the same value in multiple places in
memory, or perhaps more commonly, there being multiple ways to get to the same
value in memory (e.g. through the GC frame and the argument register).

One can represent this in DWARF (i.e. ranges can overlap),

The DWARF 5 standard says that
"Address range entries in a range list may not overlap.”

The reasoning behind this is presumably that if a variable is in more than one
location at a point all the values need to be identical, or the information is useless.

but perhaps more
importantly, one can avoid having to make a choice about which value to track in
mid-level optimizations.

I like this motivation better.

The answer will generally depend on which value will
live longer, but at that stage we do not know that information yet. As a
concrete example, InstCombine will currently replace llvm.dbg.declare by
llvm.dbg.values on every load and store, expecting the alloca to get removed,
but if that assumption is wrong, we get worse debug info than we would have
without replacing the declare. With multiple location support, both locations
can be described and either both emitted to DWARF, or we can choose the one
that is live longer and emit that.

As such, there are two separate but related goals:

    1. Allow frontends to describe complex situations where variables may be
       available in more than one location.
    2. Provide a coherent framework for describing the locations of source
       variables in the optimization pipeline to improve debug info quality.

This proposal concerns the IR format for encoding this information. Separately,
getting this info into DWARF will require additional work, some of which has
already been done in D11933 and D11986. The backend work is outside the scope
of this proposal.

# Goals of this design

I tried to come up with a scheme that is a minimal modification of the existing
mechanism to ease upgrading for both frontends and optimizers, but still
separates the three concerns I think are required for multiple location support:

    - Indicating that a value changed at source level (e.g. because an
      assignment occurred)
    - Indicating that the same value is now available in a new location
    - Indicating that a value is no longer available in some location

The last one is required in order to be able to describe e.g. stack slot coloring,
where a memory location may cease to describe a variable even though the value
remained the same at source level.

Sounds all good.

# The Proposal

I propose changing the llvm.dbg.value intrinsic from (note I'm ignoring the i64
offset argument which is already essentially dead and I expect it to be removed
soon)

Indeed, soon™.

    void @llvm.dbg.value(metadata, metadata, metadata)

to

    token @llvm.dbg.value(token, metadata, metadata, metadata)

with the semantics being the following:

    - A change of value of a variable is indicated by (pseudo-IR)

        %first = call token @llvm.dbg.value(token undef, metadata %val,
                                            metadata !var, metadata !expr)

      for the purpose of this proposal, I'll denote such a call (with undef
      first argument) as a `key call`.

    - To add a location with the same value for the same variable, you pass the
      token of the FIRST llvm.dbg.value as this llvm.dbg.value's first argument.
      E.g. to add another location for the variable above:

        %second = call token @llvm.dbg.value(token %first, metadata %val2,
                                            metadata !var, metadata !expr2)

Does this invalidate the first location, or does this add an additional location
to the set of locations for var at this point? If I want to add a third location,
which token do I pass in? Can you explain a bit more what information the token
allows us to express that is currently not possible?

    - To indicate that a location will no longer hold a value, you do the
      following:

        call token @llvm.dbg.value(token %second, metadata token undef,
                                  metadata !var, metadata !())

    - The current set of locations for a variable at a given instruction is all
      those llvm.dbg.value instructions that dominate this location
      (equivalently, all those llvm.dbg.value calls whose token you could use at
      that location without upsetting the Verifier), except that if more than
      one key call is dominating, only the most recent one and all calls
      associated with it via their first argument count (see the sketch after
      this list).
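
To make the dominance rule concrete, here is a small straight-line sketch
(hypothetical value and variable names, following the proposed semantics):

    %t1 = call token @llvm.dbg.value(token undef, metadata %a,
                                     metadata !var, metadata !())
    ; location set for !var here: { %a }
    %t2 = call token @llvm.dbg.value(token %t1, metadata %b,
                                     metadata !var, metadata !())
    ; location set for !var here: { %a, %b }
    %t3 = call token @llvm.dbg.value(token undef, metadata %c,
                                     metadata !var, metadata !())
    ; %t3 is a new key call, so only calls tied to it count from here on;
    ; location set for !var here: { %c }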

I think that should encapsulate the semantics, but here are some consequences
of and comments on the above that I think would be useful to discuss:

    - The upgrade path for existing IR is very simple and just consists of
      adding token undef as the first argument to any call in the IR.

    - In general, if a value gets removed by an optimization, the corresponding
      llvm.dbg.value call can be removed, unless that call is a key call, in
      which case the value should be undefed out. This is necessary both to be
      able to keep it around as the first argument to the other calls, and more
      importantly to mark the end point of a previous set of locations.

So if %val is optimized out in the following example:

  %first = call token @llvm.dbg.value(token undef, metadata %val,
                                      metadata !var, metadata !expr)
  ...
  %second = call token @llvm.dbg.value(token %first, metadata %val2,
                                       metadata !var, metadata !expr2)

Does this turn into:

  call token @llvm.dbg.value(token undef, metadata %undef,
                             metadata !var, metadata !expr)
  %second = call token @llvm.dbg.value(token %undef, metadata %val2,
                                       metadata !var, metadata !expr2)

Or do we still have a %first token, or does the key call get removed entirely, because
the second one is now a key call?

    - I'm not sure I like the location removal incantation, since it doesn't
      seem super intuitive; however, I did not want to introduce an extra
      intrinsic just for this purpose. The second argument being a token
      guarantees that just undefing out an instruction will not turn a location
      add into a location remove of the key call.

    - It should be noted that for optimized (pseudo-C) source like:

        if (foo) {
            x = a;
        } else {
            x = b;
        }

      the IR would have to look like:

        if.true:
            %xtrue = ... (a)
            call token llvm.dbg.value(token undef, %xtrue, !var, !())
            br cont
        if.false:
            %xfalse = ... (b)
            call token llvm.dbg.value(token undef, %xfalse, !var, !())
            br cont
        cont:
            %x = phi [%xtrue, %if.true], [%xfalse, %if.false]
            call token llvm.dbg.value(token undef, %x, !var, !())

      as the live range of the debug value would end at the end of the
      respective basic block.

    - A related concern is what happens with the following:

        call token llvm.dbg.value(token undef, %xold, !var, !())
        if.true:
            %xtrue = ... (a)
            call token llvm.dbg.value(token undef, %xtrue, !var, !())
            br cont
        if.false:
            %xfalse = ... (b)
            call token llvm.dbg.value(token undef, %xfalse, !var, !())
            br cont
        cont:
            %x = phi [%xtrue, %if.true], [%xfalse, %if.false]

      (i.e. the above but with a forgotten llvm.dbg.value in the cont block).
      By the semantics I have written above, `cont` would again have %xold as
      the value for the variable, even though there was an intermediate assignment. I am
      not sure if this represents a problem, but it might at the very least be
      unexpected.

    - Do we run into problems in whatever MSVC's equivalent for debug info is?

    - I think llvm.dbg.declare can be deprecated and its uses replaced by
      llvm.dbg.value with a DW_OP_deref. That would also clarify the semantics
      of the operation, which have caused some confusion in the past.

I think we could already remove it today without any loss of generality (by
lifting any dbg.value whose first argument is an alloca into the MMI table).
What I see this proposal adding is a way to mark the end of a range, which
is important when a value is on the stack only for part of the function (as
in the stack coloring example).

    - We may want to add an extra pass that does debug info inference (some of
      which is done in InstCombine right now)

What kind of inference does InstCombine do currently?

Here are some of the invariants the verifier would enforce (included in the
hope that they can clarify anything in the above):

    1. If the first argument is not token undef, then
        a. If the second argument is not token undef,
            I. the first argument must be a call to llvm.dbg.value whose first
               argument is token undef
        b. If the second argument is token undef
            II. the first argument must be a call to llvm.dbg.value whose second
                 argument is not token undef
            III. the expression argument must be empty
        c. In either case, the variable described must be the same as the one
           described by the call that is the first argument.
        d. There may not be another call to llvm.dbg.value with token undef
           that dominates this instruction, is not the one passed as the first
           argument, and is dominated by the one passed as the first argument
           (an example follows after this list).
    2. All other invariants regarding calls to llvm.dbg.value carry over
       unchanged
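
For instance, invariant 1.d would reject IR like the following sketch
(hypothetical names), where a newer key call for the same variable already
dominates the attempted location add:

    %t1 = call token @llvm.dbg.value(token undef, metadata %a,
                                     metadata !var, metadata !())
    ; a newer key call for !var that dominates the call below:
    %t2 = call token @llvm.dbg.value(token undef, metadata %b,
                                     metadata !var, metadata !())
    ; invalid: %t1's location set has already been superseded by %t2
    call token @llvm.dbg.value(token %t1, metadata %c,
                               metadata !var, metadata !())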

-- adrian

Thanks for your comments. Replies inline.

The DWARF 5 standard says that
"Address range entries in a range list may not overlap."

The reasoning behind this is presumably that if a variable is in more than one
location at a point all the values need to be identical, or the information is
useless.

Oh huh, for some reason I was under the impression that they could. No
matter, all we would have to do then is choose one in the backend. I think
it makes sense to maintain the notion of separate multiple locations until
then.

>
>     - To add a location with the same value for the same variable, you pass
>       the token of the FIRST llvm.dbg.value as this llvm.dbg.value's first
>       argument. E.g. to add another location for the variable above:
>
>         %second = call token @llvm.dbg.value(token %first, metadata %val2,
>                                              metadata !var, metadata !expr2)

Does this invalidate the first location, or does this add an additional
location to the set of locations for var at this point? If I want to add a
third location, which token do I pass in? Can you explain a bit more what
information the token allows us to express that is currently not possible?

It adds a second location. If you want to add a third location you pass in
the first token again.
Thus the first call (key call) indicates a change of values, and all
locations that have the same value should use the key call's token.

>
>     - To indicate that a location will no longer hold a value, you do the
>       following:
>
>         call token @llvm.dbg.value(token %second, metadata token undef,
>                                   metadata !var, metadata !())
>
>     - The current set of locations for a variable at a given instruction is all
>       those llvm.dbg.value instructions that dominate this location
>       (equivalently, all those llvm.dbg.value calls whose token you could use at
>       that location without upsetting the Verifier), except that if more than
>       one key call is dominating, only the most recent one and all calls
>       associated with it via their first argument count.
>
> I think that should encapsulate the semantics, but here are some consequences
> of and comments on the above that I think would be useful to discuss:
>
>     - The upgrade path for existing IR is very simple and just consists of
>       adding token undef as the first argument to any call in the IR.
>
>     - In general, if a value gets removed by an optimization, the corresponding
>       llvm.dbg.value call can be removed, unless that call is a key call, in
>       which case the value should be undefed out. This is necessary both to be
>       able to keep it around as the first argument to the other calls, and more
>       importantly to mark the end point of a previous set of locations.

So if %val is optimized out in the following example:

  %first = call token @llvm.dbg.value(token undef, metadata %val,
                                      metadata !var, metadata !expr)
  ...
  %second = call token @llvm.dbg.value(token %first, metadata %val2,
                                       metadata !var, metadata !expr2)

Does this turn into:

  call token @llvm.dbg.value(token undef, metadata %undef,
                             metadata !var, metadata !expr)
  %second = call token @llvm.dbg.value(token %undef, metadata %val2,
                                       metadata !var, metadata !expr2)

Or do we still have a %first token, or does the key call get removed
entirely, because
the second one is now a key call?

I think the situation is the following:
If %second is the only use of %first, we can do that optimization. If not
and %second dominates all uses of first, we could also do this optimization
and replace all uses of %first with %second. However, we cannot remove the
actual first key call, because it denotes the end location for the previous
value of the same variable. Two exceptions I could think of are if %first
is the first call for that variable in the function (as then there cannot
be a previous range to terminate) or if there are no other calls or memory
operations in between %first and %second, in which case we could hoist
%second up and merge the two calls. Does that make sense?
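
A hypothetical sketch of that second exception (nothing interesting between the
two calls and no other users of %first, so they can be hoisted and merged):

  ; before (%val was optimized out of the key call):
  %first  = call token @llvm.dbg.value(token undef, metadata %undef,
                                       metadata !var, metadata !expr)
  %second = call token @llvm.dbg.value(token %first, metadata %val2,
                                       metadata !var, metadata !expr2)

  ; after hoisting %second and merging the two calls:
  %second = call token @llvm.dbg.value(token undef, metadata %val2,
                                       metadata !var, metadata !expr2)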

>
>     - I think llvm.dbg.declare can be deprecated and its uses replaced by
>       llvm.dbg.value with a DW_OP_deref. That would also clarify the semantics
>       of the operation, which have caused some confusion in the past.

I think we could already remove it today without any loss of generality (by
lifting any dbg.value whose first argument is an alloca into the MMI
table).
What I see this proposal adding is a way to mark the end of a range, which
is important when a value is on the stack only for part of the function (as
in the stack coloring example).

Agreed!

>
>     - We may want to add an extra pass that does debug info inference (some of
>       which is done in InstCombine right now)

What kind of inference does InstCombine do currently?

I was thinking of replacing llvm.dbg.declare by appropriate llvm.dbg.value
at each load/store.
In the new design that would essentially be an inference pass which would add
those as locations, with the original one only removed if the alloca actually
gets lifted into registers.
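
A hypothetical sketch of what such an inference pass could emit (names made
up): the alloca stays the key-call location, and the loaded value is merely
added as an extra location:

  ; key call: !var lives in the alloca
  %t = call token @llvm.dbg.value(token undef, metadata i32* %x.addr,
                                  metadata !var, metadata !DIExpression(DW_OP_deref))
  ...
  %v = load i32, i32* %x.addr
  ; inferred additional location: the loaded value also holds !var
  call token @llvm.dbg.value(token %t, metadata i32 %v,
                             metadata !var, metadata !())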

Address ranges in a location list may overlap (section 2.6.2). The entries in a location list are not a range list (which is defined by section 2.17).

–paulr

Thanks for the clarification. That was my recollection as well, but I didn’t know whether that was changed or not.

Thanks for the clarification, Paul!
Keno, just a few more questions for my understanding:

    - Indicating that a value changed at source level (e.g. because an
      assignment occurred)

This is done by a key call.

    - Indicating that the same value is now available in a new location

Additional, alternative locations with identical contents are added by passing in the token from a key call.

    - Indicating that a value is no longer available in some location

This is done by another key call (possibly with an %undef location).

> >
> >     - To add a location with the same value for the same variable, you pass
> >       the token of the FIRST llvm.dbg.value as this llvm.dbg.value's first
> >       argument. E.g. to add another location for the variable above:
> >
> >         %second = call token @llvm.dbg.value(token %first, metadata %val2,
> >                                              metadata !var, metadata !expr2)
>
> Does this invalidate the first location, or does this add an additional
> location to the set of locations for var at this point? If I want to add a
> third location, which token do I pass in? Can you explain a bit more what
> information the token allows us to express that is currently not possible?
>

It adds a second location. If you want to add a third location you pass in
the first token again.
Thus the first call (key call) indicates a change of values, and all
locations that have the same value should use the key call's token.

Ok. Looks like this is going to be somewhat verbose for partial updates of SROA’ed aggregates as in the following example:

// struct s { int i, j };
// void foo(struct s) { s.j = 0; ... }

define void @foo(i32 %i, i32 %j) {
  %token = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
           call llvm.dbg.value(token %token, %j, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
  ...
  
  ; have to repeat %i here:
  %tok2 = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
          call llvm.dbg.value(token %tok2, metadata i32 0, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))

On the upside, having all this information explicit could simplify the code in DwarfDebug::buildLocationList().

Is there any information in the tokens that could not be recovered by a static analysis of the debug intrinsics?
Note that having redundant information available explicitly is not necessarily a bad thing.

The one difference I noticed so far is that alternative locations allow earlier locations to outlive locations that are dominated by them:
  %loc = dbg.value(%undef, var, ...)
  ...
  %alt = dbg.value(%loc, var, ...)
  ...
  ; alt becomes unavailable
  ...
  ; %loc is still available here.

Any other advantages that I missed?

-- adrian

Thanks for the clarification, Paul!
Keno, just a few more questions for my understanding:

> - Indicating that a value changed at source level (e.g. because an
> assignment occurred)

This is done by a key call.

Correct

> - Indicating that the same value is now available in a new location

Additional, alternative locations with identical contents are added by
passing in the token from a key call.

Correct

> - Indicating that a value is no longer available in some location

This is done by another key call (possibly with an %undef location).

Not quite. Another key call could be used if all locations are now invalid.
However, to just remove a single location, I was proposing:

; This is the key call
%first = call token @llvm.dbg.value(token undef, %someloc,
                                  metadata !var, metadata !())

; This adds a location
%second = call token @llvm.dbg.value(token %first, %someotherloc,
                                  metadata !var, metadata !())

; This removes the (%second) location
%third = call token @llvm.dbg.value(token %second, metadata token undef,
                                  metadata !var, metadata !())

Thus, to remove a location you always pass in the token of the call that
added the location. This is also the reason why I'm requiring the second
argument to be `token undef` because no valid location can be of type
token, and I wanted to avoid the situation in which a location gets
replaced by undef everywhere, accidentally turning into a removal of the
location specified by the key call.

> > >
> > >     - To add a location with the same value for the same variable, you
> > >       pass the token of the FIRST llvm.dbg.value as this llvm.dbg.value's
> > >       first argument. E.g. to add another location for the variable above:
> > >
> > >         %second = call token @llvm.dbg.value(token %first, metadata %val2,
> > >                                              metadata !var, metadata !expr2)
> >
> > Does this invalidate the first location, or does this add an additional
> > location to the set of locations for var at this point? If I want to add a
> > third location, which token do I pass in? Can you explain a bit more what
> > information the token allows us to express that is currently not possible?
> >
>
> It adds a second location. If you want to add a third location you pass in
> the first token again.
> Thus the first call (key call) indicates a change of values, and all
> locations that have the same value should use the key call's token.
>

Ok. Looks like this is going to be somewhat verbose for partial updates of
SROA’ed aggregates as in the following example:

// struct s { int i, j };
// void foo(struct s) { s.j = 0; ... }

define void @foo(i32 %i, i32 %j) {
  %token = call llvm.dbg.value(token %undef, %i, !Struct,
                               !DIExpression(DW_OP_bit_piece(0, 32)))
           call llvm.dbg.value(token %token, %j, !Struct,
                               !DIExpression(DW_OP_bit_piece(32, 32)))
  ...

  ; have to repeat %i here:
  %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
                              !DIExpression(DW_OP_bit_piece(0, 32)))
          call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
                              !DIExpression(DW_OP_bit_piece(32, 32)))

On the upside, having all this information explicit could simplify the
code in DwarfDebug::buildLocationList().

Yeah, this is true. We could potentially extend the semantics by allowing
separate key calls for pieces, i.e.

%token = call llvm.dbg.value(token %undef, %i, !Struct,
                             !DIExpression(DW_OP_bit_piece(0, 32)))
         call llvm.dbg.value(token undef, %j, !Struct,
                             !DIExpression(DW_OP_bit_piece(32, 32)))

; This now only invalidates the .j part
%tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
                            !DIExpression(DW_OP_bit_piece(32, 32)))

In that case we would probably have to require that all DW_OP_bit_pieces in
non-key-call expressions are a subrange of those in the associated key call.
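
A small hypothetical illustration of that subrange requirement (%j.lo is a
made-up value holding the low 16 bits of %j):

  ; key call covering bits [32, 64) of !Struct
  %tj = call llvm.dbg.value(token undef, %j, !Struct,
                            !DIExpression(DW_OP_bit_piece(32, 32)))
  ; ok: an alternative location for a sub-range of that piece
        call llvm.dbg.value(token %tj, %j.lo, !Struct,
                            !DIExpression(DW_OP_bit_piece(32, 16)))
  ; not ok: this piece is not contained in the key call's piece
        call llvm.dbg.value(token %tj, %i, !Struct,
                            !DIExpression(DW_OP_bit_piece(0, 32)))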

Is there any information in the tokens that could not be recovered by a
static analysis of the debug intrinsics?
Note that having redundant information available explicitly is not
necessarily a bad thing.

I am not entirely sure what you are proposing. You somehow need to be able
to encode which dbg.values invalidate previous locations and which do not.
Since we're describing front-end variables this will generally depend on
front-end semantics, so I'm not sure what a generic analysis pass can do
here without requiring language-specific analysis.

Makes sense. If I understand your comment correctly, the following snippet:

%1 = …
%token = call llvm.dbg.value(token %undef, %1, !var, !())
%2 = …
call llvm.dbg.value(token %token, %undef, !var, !())
call llvm.dbg.value(token %undef, %2, !var, !())

is equivalent to

%1 = …
call llvm.dbg.value(token %undef, %1, !var, !())
%2 = …
call llvm.dbg.value(token %undef, %2, !var, !())

and both are legal.

This way all non-key-call additional locations are describing alternative locations for (a subset of) the bits described by the key-call location. Makes sense, and again would simplify the backend’s work.

Right. Determining whether two locations have equivalent contents is not generally decidable.

One thing I’m wondering about is whether we couldn’t design a friendlier (assembler) syntax for the three different use-cases:
%tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
%tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
%tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())

Could be written as e.g.:

%tok1 = call llvm.dbg.value.new(%1, !var, !())
%tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
%tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())

– adrian

Makes sense. If I understand your comment correctly, the following snippet:

%1 = ...
%token = call llvm.dbg.value(token %undef, %1, !var, !())
%2 = ...
call llvm.dbg.value(token %token, %undef, !var, !())
call llvm.dbg.value(token %undef, %2, !var, !())

is equivalent to

%1 = ...
call llvm.dbg.value(token %undef, %1, !var, !())
%2 = ...
call llvm.dbg.value(token %undef, %2, !var, !())

and both are legal.

Yes

> > >
> > >     - To add a location with the same value for the same variable, you
> > >       pass the token of the FIRST llvm.dbg.value as this llvm.dbg.value's
> > >       first argument. E.g. to add another location for the variable above:
> > >
> > >         %second = call token @llvm.dbg.value(token %first, metadata %val2,
> > >                                              metadata !var, metadata !expr2)
> >
> > Does this invalidate the first location, or does this add an additional
> > location to the set of locations for var at this point? If I want to add a
> > third location, which token do I pass in? Can you explain a bit more what
> > information the token allows us to express that is currently not possible?
> >
>
> It adds a second location. If you want to add a third location you pass in
> the first token again.
> Thus the first call (key call) indicates a change of values, and all
> locations that have the same value should use the key call's token.
>

Ok. Looks like this is going to be somewhat verbose for partial updates
of SROA’ed aggregates as in the following example:

// struct s { int i, j };
// void foo(struct s) { s.j = 0; ... }

define void @foo(i32 %i, i32 %j) {
  %token = call llvm.dbg.value(token %undef, %i, !Struct,
                               !DIExpression(DW_OP_bit_piece(0, 32)))
           call llvm.dbg.value(token %token, %j, !Struct,
                               !DIExpression(DW_OP_bit_piece(32, 32)))
  ...

  ; have to repeat %i here:
  %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
                              !DIExpression(DW_OP_bit_piece(0, 32)))
          call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
                              !DIExpression(DW_OP_bit_piece(32, 32)))

On the upside, having all this information explicit could simplify the
code in DwarfDebug::buildLocationList().

Yeah, this is true. We could potentially extend the semantics by allowing
separate key calls for pieces, i.e.

%token = call llvm.dbg.value(token %undef, %i, !Struct,
                             !DIExpression(DW_OP_bit_piece(0, 32)))
         call llvm.dbg.value(token undef, %j, !Struct,
                             !DIExpression(DW_OP_bit_piece(32, 32)))

; This now only invalidates the .j part
%tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
                            !DIExpression(DW_OP_bit_piece(32, 32)))

In that case we would probably have to require that all DW_OP_bit_pieces
in non-key-call expressions are a subrange of those in the associated key
call.

This way all non-key-call additional locations are describing alternative
locations for (a subset of) the bits described by the key-call location. Makes
sense, and again would simplify the backend’s work.

Yes, I think that's a reasonable change to the semantics, so let's make it
so.

One thing I’m wondering about is whether we couldn’t design a friendlier
(assembler) syntax for the three different use-cases:
  %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
  %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
  %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())

Could be written as e.g.:

  %tok1 = call llvm.dbg.value.new(%1, !var, !())
  %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
  %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())

Yeah, I would be ok with that (and think it's a good idea).

I have updated the gist to reflect the following changes:

  • Split llvm.dbg.value into three intrinsics (a sketch of the resulting signatures is below). Note that in the process, I deleted the !var argument from .add and .delete and the expr argument from .delete since they are now necessarily redundant.
  • Key calls describing pieces of variables are independent, and extra locations may only be added to a subset of the piece described in the
    original key call.
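
A sketch of what the split intrinsic declarations might look like after these
changes (argument order assumed from the examples above, not final):

  declare token @llvm.dbg.value.new(metadata, metadata, metadata)  ; value, !var, !expr
  declare token @llvm.dbg.value.add(token, metadata, metadata)     ; key-call token, value, !expr
  declare void  @llvm.dbg.value.delete(token)                      ; token of the call that added the location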