[RFC] DebugInfo: A different way of specifying variable locations post-isel

Hi debuginfo cabal,

tl;dr: I'd like to know what people think about an alternative to
DBG_VALUE instructions describing variable locations in registers,
virtual or real. Before instruction selection in LLVM-IR we identify
the _values_ of variables [0] by the instruction that computes the
value; I believe we should be able to do the same post-isel, and it
would avoid having to analyse register locations across regalloc and
numerous optimisations. Or written another way: why don't we track the
value of variables through backend codegen, and then determine a
register location very late?

This is just an idea with no solid proposal of work. IMO this would
reduce the amount of code and complexity involved in preserving
variable locations. It would also help eliminate debug instructions in
a far flung future.

Background:

In optimised LLVM-IR, we specify a variable location like so:

  %2 = someinst %1, %0
  call @llvm.dbg.value(metadata i32 %2, ...)

A dbg.value intrinsic call specifies two things about a variable:
* The SSA-register / otherwise that is the value of the variable, and,
* The position in the instruction stream where that SSA-register
becomes the variable location.

I'm using the term "machine location" and "program location"
throughout this email to mean the two items above, respectively. This
representation is good for LLVM-IR: the SSA-register machine location
entirely and uniquely identifies a computation, the value of which
should appear as the value of the variable in a debugger.

Post-isel, the same sequence is represented by:

  %2 = some-machine-inst %1, %0
  DBG_VALUE %2, ...

Which to a large extent means the same thing. However, there are some
subtle differences that manifest as the function proceeds through the
codegen pipeline:
* The specified virtual register (%2) doesn't always contain the
value produced by "some-machine-inst". Once we leave SSA-form, there
can be multiple defs of the vreg after PHI-elimination / register
coalescing.
* The vreg does not uniquely identify the value produced by
"some-machine-inst": COPY instructions introduced during SelectionDAG
/ PHI-elimination / other passes place the value into multiple vregs,
that can have different liveness ranges.

The problem:

Those two differences between dbg.value intrinsics and DBG_VALUE
instructions introduce some annoying artifacts that make handling
DBG_VALUEs harder than dbg.values:
* Identical DBG_VALUEs at different program locations can result in
different variable values being presented (because their vreg operand
might refer to a different def),
* There can be multiple ways to represent a dbg.value in DBG_VALUEs
(as you have a choice of vregs from COPY instructions), some with
different lifetimes.
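
To make the first artifact concrete, here is a minimal Python sketch. It is a toy model rather than LLVM code: the tuple encoding of instructions and the `value_seen_by` helper are invented purely for illustration. It shows two textually identical DBG_VALUEs reporting different values once the vreg has more than one def.

```python
# Toy model: each entry is (kind, vreg, value_name).
# "def" writes a named value into a vreg; "dbg" is a DBG_VALUE reading a vreg.
def value_seen_by(block, dbg_index):
    """Return the value a DBG_VALUE at block[dbg_index] would report."""
    kind, vreg, _ = block[dbg_index]
    assert kind == "dbg"
    seen = None
    # Walk forwards; the last def of the vreg before the DBG_VALUE wins.
    for k, dest, value in block[:dbg_index]:
        if k == "def" and dest == vreg:
            seen = value
    return seen

# After leaving SSA, %2 has two defs (hypothetical values "a" and "b").
block = [
    ("def", "%2", "a"),
    ("dbg", "%2", None),  # reads the first def
    ("def", "%2", "b"),
    ("dbg", "%2", None),  # identical operand, but reads the second def
]

first = value_seen_by(block, 1)
second = value_seen_by(block, 3)
```

In this model, moving either DBG_VALUE across a def of %2 silently changes the value it describes, which is the context-dependence described above.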

Both of which make the movement and preservation of DBG_VALUEs much
more context-dependent than the LLVM-IR equivalent. It's a lot easier
to cause an incorrect value to appear in a debugger at this stage of
compilation, or limit the range over which we preserve a variable
location.

There are currently three instruction scheduling passes in LLVM
(machine-scheduler, postra scheduler, SelectionDAG does some too)
which don't have any principled approach to preserving the correctness
of variable locations, and are vulnerable to the artifacts above. The
first two just glue DBG_VALUEs to the preceding machine instruction
and move them around together (vulnerable to assignment reordering and
referring to the wrong {v,}reg def); the last can re-order
assignments but also finds it hard to select the longest-living vreg,
which I wrote up in [1]. Correctly scheduling DBG_VALUEs to always:
* refer to the correct vreg def,
* with the longest lifetime,
* without re-ordering assignments,
is sufficiently hard that no-one has attempted it to my knowledge, and
I believe it would be really difficult to get right. Additionally, if
we were to generate DBG_VALUE $noreg instructions when rescheduling
(to terminate earlier variable locations), and then a subsequent
scheduling pass undoes that rescheduling (or some part of it), we will
lose or shorten variable locations for no reason.

Finally, being forced to always specify both the machine location and
the program location at the same time (in a single DBG_VALUE)
introduces un-necessary burdens. In MachineSink, when we sink between
blocks an instruction that defines a vreg, we chose to sink DBG_VALUE
instructions referring to that vreg too to avoid losing the variable
location. This un-necessarily risks re-ordering assignments, and in
some circumstances [2] you would have to examine all the instructions
in the function to work out whether sinking a DBG_VALUE would be
legal. In SimpleRegisterCoalescing, when we merge two vregs,
DBG_VALUEs can only refer to the surviving vreg -- and at the
DBG_VALUE's location that vreg might not contain the right def. There
may be other machine locations where the correct value is available
(it may even be rematerialized later), but searching for it is hard;
right now we just drop variable location information in these cases.

A solution:

[To be clear, I haven't tried to implement this idea yet, as I wanted feedback first.]

I'd like to suggest that we can represent variable locations in the
codegen backend / MIR with three things:
* The instruction that defines the value of the variable,
* The operand of that instruction into which the value is written,
* The position in the instruction stream where the assignment of this
value to the variable occurs

That's effectively modifying a machine location from being a {v,}reg,
into being a "defining instruction" and operand. This is closer to the
LLVM-IR form of a machine location, where the SSA Value and its
computation are synonymous. Exactly how this is represented in-memory
and in-printed-MIR I haven't thought a lot about; probably by
attaching metadata to instructions and having DBG_VALUE use a metadata
operand rather than referring to a vreg. Specifying machine locations
like this would have the following benefits:
* Both DBG_VALUEs and defining instructions are independent and can
be moved within the function without loss of information, and without
needing to consider so much context,
* Likewise, vregs can be rewritten / merged / deleted without the
need to update any debug metadata. Only instruction deletion /
morphing would need some sort of change,
* We would never need to refer to COPYs, avoiding artificial liveness
limitations,
* Debug use before defs would become tolerable (see below), and
possibly even be a good way of describing locations after
optimisations.
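
As a rough illustration of the representation suggested above, here is a small Python sketch. Everything in it is an assumption made for the example: the `Inst` and `InstrRef` classes, the idea of a stable instruction number, and the convention that operand 0 is the def. None of it is an existing MIR facility. It shows a debug reference that names a defining instruction and operand index, and that survives a vreg being rewritten.

```python
from dataclasses import dataclass

@dataclass
class Inst:
    num: int        # hypothetical stable instruction number
    opcode: str
    operands: list  # operand 0 is the def in this toy model

@dataclass(frozen=True)
class InstrRef:
    variable: str
    inst_num: int   # the defining instruction
    op_idx: int     # which operand of that instruction holds the value

def rename_vreg(insts, old, new):
    """Rewrite every occurrence of a vreg, as coalescing might."""
    for inst in insts:
        inst.operands = [new if op == old else op for op in inst.operands]

def resolve(insts, ref):
    """Return the register currently named by the referenced operand."""
    for inst in insts:
        if inst.num == ref.inst_num:
            return inst.operands[ref.op_idx]
    return None  # the defining instruction was deleted

insts = [Inst(1, "some-machine-inst", ["%2", "%1", "%0"])]
ref = InstrRef("x", inst_num=1, op_idx=0)

before = resolve(insts, ref)
rename_vreg(insts, "%2", "%7")   # the debug reference is not touched
after = resolve(insts, ref)
```

The reference itself never changes when registers are renamed; only deleting or morphing instruction 1 would require any debug-info maintenance.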

This would not eliminate the risk of re-ordering variable assignments.

The three instruction scheduling passes would become significantly
easier to deal with: they would only have to replace DBG_VALUE
instructions in the correct order, not worry about their operands.
Various debug facilities in SimpleRegisterCoalescing, MachineSink, and
large amounts of LiveDebugVariables would become redundant, as we
wouldn't need to maintain a register location through optimisations.

Finally, this design could be extended to not having any instructions
in the instruction stream. Once machine locations aren't described
within a MachineOperand, the most important thing a DBG_VALUE
signifies is a position in the instruction stream, which could be
performed in some other way (i.e., more metadata) in the future.

How then do we translate this new kind of machine location into
DWARF/CodeView variable locations, which need to know which register
to look in? The answer is: LiveDebugValues [3]. We already perform a
dataflow analysis in LiveDebugValues of where values "go" after
they're defined: we can use that to take "defining instructions" and
determine register / stack locations. We would need to track values
from the defining instruction up to the DBG_VALUE where that value
becomes a variable location, after which it's no different from the
LiveDebugValues analysis that we perform today. LiveDebugValues'
ability to track values through stack spills and restores would become
a critical feature (it isn't today), as we would no longer generate
stack locations during register allocation.
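
A minimal sketch of the kind of tracking this implies, in Python and for straight-line code only. The instruction encoding and the `track_value` helper are invented for this example and are not how LiveDebugValues is implemented. It follows the value produced by one defining instruction through copies, spills and restores, and forgets any location that gets clobbered.

```python
def track_value(insts, def_label):
    """Return the set of locations holding the value defined by def_label
    after executing every instruction in insts (no control flow)."""
    locs = set()
    for op, label, dst, src in insts:
        if op in ("copy", "spill", "restore"):
            # The destination now holds whatever the source held.
            holds = src in locs
            locs.discard(dst)
            if holds:
                locs.add(dst)
        else:
            # An ordinary def clobbers its destination.
            locs.discard(dst)
            if label == def_label:
                locs.add(dst)
    return locs

insts = [
    ("def", "add", "$rax", None),          # the defining instruction
    ("spill", None, "stack0", "$rax"),     # value copied to a stack slot
    ("def", "other", "$rax", None),        # $rax is clobbered
    ("restore", None, "$rcx", "stack0"),   # value reloaded into $rcx
]

final_locs = track_value(insts, "add")
```

Here the value ends up in the stack slot and in $rcx, even though nothing during register allocation recorded a stack location for it.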

I reckon debug-use-before-def's can be tolerated in this
representation, and even be well defined and useful, reducing the work
needed to be done earlier in the compiler. Under the model described
above, we can specify a program location before the machine location
containing the variable value becomes available. Consider this code:

  DBG_VALUE output-of-this-inst ---
  someinst1                       |
  someinst2                       |
  $rax = ADD32ri $rax, 0     <-----

Where the line from DBG_VALUE to ADD32ri represents some
as-yet-undetermined way of identifying the ADD32ri instruction from
the DBG_VALUE. We can interpret such a code sequence as the variable
having no location across someinst1 and someinst2, which are not
dominated by the defining instruction, then a location of $rax after
the ADD32ri. Essentially:
* For an instruction dominated by a DBG_VALUE but not by the defining
instruction, the variable location is empty / undef / $noreg,
* For an instruction dominated by both, the variable location is
defined as it is today.
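
The two rules can be sketched for a single block as follows. This is a toy Python model: the tuple encoding and the `variable_locations` helper are assumptions for illustration, and it deliberately ignores control flow and later clobbers of the register.

```python
def variable_locations(block):
    """Report the variable's location on entry to each non-debug instruction.

    block is a list of tuples:
      ("dbg", target_label)      -- DBG_VALUE naming a defining instruction
      ("inst", label, dest_reg)  -- ordinary instruction writing dest_reg
    """
    target = None   # label the most recent DBG_VALUE refers to
    defined = {}    # label -> register, for instructions already executed
    result = []
    for entry in block:
        if entry[0] == "dbg":
            target = entry[1]
            continue
        _, label, dest = entry
        # None means empty / undef / $noreg: the def has not executed yet.
        result.append((label, defined.get(target)))
        defined[label] = dest
    return result

block = [
    ("dbg", "add"),                # debug use before def
    ("inst", "someinst1", "$rcx"),
    ("inst", "someinst2", "$rdx"),
    ("inst", "add", "$rax"),       # the defining instruction
    ("inst", "someinst3", "$rbx"),
]

locs = variable_locations(block)
```

The variable has no location at someinst1, someinst2, or on entry to the defining instruction itself, and is in $rax from the next instruction onwards, without any explicit DBG_VALUE $noreg.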

This should work across control flow, and doesn't necessitate the
creation of DBG_VALUE $noreg's to explicitly describe unavailable
locations when instructions move. In theory, if we were to accept
debug use-before-defs in LLVM-IR, this would reduce analysis and mean
fewer dbg.value(undef,...)'s would need to be created earlier in the
compiler.

Limitations

The largest problem with this idea is that not all variable values are
defined by instructions: PHIs are values that are defined by control
flow. To deal with this pre-regalloc, we could move LiveDebugVariables
to run before phi-elimination. My understanding is that the register
allocation phase of LLVM starts there and ends after virtregrewriter,
and it'd be legitimate to say "we do special things for these passes".
After regalloc however, there would need to be some way of specifying
a block and a register, where entry to the block defines a variable
value in that register. This isn't pretty; but IMO is the
representation closest to the truth. Passes like tail duplication and
branchfolder might need to perform debuginfo maintenance when they
alter blocks -- however I believe these circumstances are rare, as
few control flow changes happen after regalloc. It (IMO) would be
worth it given the other benefits.

I also haven't considered the impact of this on -O0: one would imagine
it would be easier to deal with than optimised builds though.

Discussion

I feel like this would be a better way of representing variable
locations in the codegen backend; my fear is that this is a lot of
work, and I don't know what appetite there is for change amongst other
interested parties. Thus I'd be interested in any kind of feedback as
to whether a) this is a good idea, b) whether this category of change
is what people want, and c) whether this is seen as being achievable.

Being able to introduce this change incrementally presents some
challenges: while the way of representing variable locations described
above is more expressive than the current way, converting between one
and the other requires running the LiveDebugValues analysis, which
makes moving transparently between the two hard to do. Moving
backwards through the backend, from emission towards the start might
be doable though.

This introduces some additional complexity into a pass
(LiveDebugValues) that's been difficult to understand and reason about
in the past. In my opinion, given that we have to perform this
dataflow analysis at the end of compilation to propagate variable
locations anyway, it would be worthwhile to harness it to remove the
need for complexity elsewhere. Some of the problems I've described
above need their own dataflow analyses to be both sound and complete:
IMO it would be better to record the bare minimum of facts and then
interpret them at the end of compilation.

Happily there are "only" 130 tests that input or output MIR in
llvm/test/DebugInfo, so this doesn't involve rewriting *every* single
test that there is.

[0] You could consider an SSA register a "location" too, my point is
that it's both a value and a location.
[1] 41583 – [DebugInfo@O2] SelectionDAGs debug inst scheduling is imperfect and often broken
[2] 44117 – [DebugInfo@O2] MachineSink can unsoundly extend variable location ranges
[3] You knew it was coming!

Finally, being forced to always specify both the machine location and
the program location at the same time (in a single DBG_VALUE)
introduces un-necessary burdens. In MachineSink, when we sink between
blocks an instruction that defines a vreg, we chose to sink DBG_VALUE
instructions referring to that vreg too to avoid losing the variable
location. This un-necessarily risks re-ordering assignments,

So under the proposed scheme, would the dbg value not be sunk? Ah, so then you’d get the dbg use before def scenario, which you argue has some nice properties below.

and in
some circumstances [2] you would have to examine all the instructions
in the function to work out whether sinking a DBG_VALUE would be
legal. In SimpleRegisterCoalescing, when we merge two vregs,
DBG_VALUEs can only refer to the surviving vreg -- and at the
DBG_VALUE's location that vreg might not contain the right def. There
may be other machine locations where the correct value is available
(it may even be rematerialized later), but searching for it is hard;
right now we just drop variable location information in these cases.

A solution:

[To be clear, I haven't tried to implement this idea yet, as I wanted feedback first.]

I'd like to suggest that we can represent variable locations in the
codegen backend / MIR with three things:
* The instruction that defines the value of the variable,
* The operand of that instruction into which the value is written,
* The position in the instruction stream where the assignment of this
value to the variable occurs

That's effectively modifying a machine location from being a {v,}reg,
into being a "defining instruction" and operand. This is closer to the
LLVM-IR form of a machine location, where the SSA Value and its
computation are synonymous. Exactly how this is represented in-memory
and in-printed-MIR I haven't thought a lot about; probably by
attaching metadata to instructions and having DBG_VALUE use a metadata
operand rather than referring to a vreg. Specifying machine locations
like this would have the following benefits:
* Both DBG_VALUEs and defining instructions are independent and can
be moved within the function without loss of information, and without
needing to consider so much context,
* Likewise, vregs can be rewritten / merged / deleted without the
need to update any debug metadata. Only instruction deletion /
morphing would need some sort of change,

For deletion, that makes sense, we might consider introducing an MI level salvageDI method to help with that.

For replacement/morphing, wouldn’t MachineInstr::RAUW be able to update the new-style debug uses?

* We would never need to refer to COPYs, avoiding artificial liveness
limitations,
* Debug use before defs would become tolerable (see below), and
possibly even be a good way of describing locations after
optimisations.

This would not eliminate the risk of re-ordering variable assignments.

The three instruction scheduling passes would become significantly
easier to deal with: they would only have to replace DBG_VALUE
instructions in the correct order, not worry about their operands.
Various debug facilities in SimpleRegisterCoalescing, MachineSink, and
large amounts of LiveDebugVariables would become redundant, as we
wouldn't need to maintain a register location through optimisations.

I view this as a very large potential upside. LiveDebugVariables is incredibly complex and is a huge compile-time bear. Defining away large chunks of its job would be really nice.

Would this mean that an equivalence class in LiveDebugVariables would consist exclusively of UserValues referring to the same variable (as opposed to including UserValues that share a vreg as well)?

Finally, this design could be extended to not having any instructions
in the instruction stream. Once machine locations aren't described
within a MachineOperand, the most important thing a DBG_VALUE
signifies is a position in the instruction stream, which could be
performed in some other way (i.e., more metadata) in the future.

How then do we translate this new kind of machine location into
DWARF/CodeView variable locations, which need to know which register
to look in? The answer is: LiveDebugValues [3]. We already perform a
dataflow analysis in LiveDebugValues of where values "go" after
they're defined: we can use that to take "defining instructions" and
determine register / stack locations. We would need to track values
from the defining instruction up to the DBG_VALUE where that value
becomes a variable location, after which it's no different from the
LiveDebugValues analysis that we perform today. LiveDebugValues'
ability to track values through stack spills and restores would become
a critical feature (it isn't today), as we would no longer generate
stack locations during register allocation.

Do you expect that handling for the current and new-style DBG_VALUEs could coexist in LiveDebugValues? Could that be done by e.g. introducing a new debug instr MI (DBG_INSTR_REF)?

I reckon debug-use-before-def's can be tolerated in this
representation, and even be well defined and useful, reducing the work
needed to be done earlier in the compiler. Under the model described
above, we can specify a program location before the machine location
containing the variable value becomes available. Consider this code:

  DBG_VALUE output-of-this-inst ---
  someinst1                       |
  someinst2                       |
  $rax = ADD32ri $rax, 0     <-----

Where the line from DBG_VALUE to ADD32ri represents some
as-yet-undetermined way of identifying the ADD32ri instruction from
the DBG_VALUE. We can interpret such a code sequence as the variable
having no location across someinst1 and someinst2, which are not
dominated by the defining instruction, then a location of $rax after
the ADD32ri. Essentially:
* For an instruction dominated by a DBG_VALUE but not by the defining
instruction, the variable location is empty / undef / $noreg,
* For an instruction dominated by both, the variable location is
defined as it is today.

This should work across control flow, and doesn't necessitate the
creation of DBG_VALUE $noreg's to explicitly describe unavailable
locations when instructions move. In theory, if we were to accept
debug use-before-defs in LLVM-IR, this would reduce analysis and mean
fewer dbg.value(undef,...)'s would need to be created earlier in the
compiler.

Limitations

The largest problem with this idea is that not all variable values are
defined by instructions: PHIs are values that are defined by control
flow. To deal with this pre-regalloc, we could move LiveDebugVariables
to run before phi-elimination.

I don’t really follow this. Are you suggesting stripping out debug values before phi elim, and replacing them after virtregrewrite? How would the re-inserted debug instrs refer to values produced by phis?

My understanding is that the register
allocation phase of LLVM starts there and ends after virtregrewriter,
and it'd be legitimate to say "we do special things for these passes".
After regalloc however, there would need to be some way of specifying
a block and a register, where entry to the block defines a variable
value in that register. This isn't pretty; but IMO is the
representation closest to the truth.

Oh, you answer this here. So LiveDebugVariables would need to figure out, for each debug-use-of-phi, 1) which block to put it in and 2) which register+variable gets defined. This is different enough from the new-style debug instr to potentially warrant its own instruction (DBG_PHI?).

Passes like tail duplication and
branchfolder might need to perform debuginfo maintenance when they
alter blocks -- however I believe these circumstances are rare, as
few control flow changes happen after regalloc. It (IMO) would be
worth it given the other benefits.

I also haven't considered the impact of this on -O0: one would imagine
it would be easier to deal with than optimised builds though.

Discussion

I feel like this would be a better way of representing variable
locations in the codegen backend; my fear is that this is a lot of
work, and I don't know what appetite there is for change amongst other
interested parties. Thus I'd be interested in any kind of feedback as
to whether a) this is a good idea, b) whether this category of change
is what people want, and c) whether this is seen as being achievable.

Imho this is a good idea and I’d like to see something like this happen. For it to “really happen” we (Apple) would probably need a way to transition to the new representation incrementally (e.g. to toggle a flag to get back the old representation). I’m not yet sure about what all that would really entail.

Being able to introduce this change incrementally presents some
challenges: while the way of representing variable locations described
above is more expressive than the current way, converting between one
and the other requires running the LiveDebugValues analysis,

This sounds like you’re planning to change the existing DBG_VALUE instruction. Have you considered keeping it — unmodified — and introducing new debug instructions for the new semantics? The migration story for that seems a lot simpler.

which
makes moving transparently between the two hard to do. Moving
backwards through the backend, from emission towards the start might
be doable though.

This introduces some additional complexity into a pass
(LiveDebugValues) that's been difficult to understand and reason about
in the past. In my opinion, given that we have to perform this
dataflow analysis at the end of compilation to propagate variable
locations anyway, it would be worthwhile to harness it to remove the
need for complexity elsewhere.

If we can delete large parts of live debug variables this will be worth it imho. But perhaps part of the plan here should include splitting up live debug values into smaller files to simplify it?

Finally, being forced to always specify both the machine location and
the program location at the same time (in a single DBG_VALUE)
introduces un-necessary burdens. In MachineSink, when we sink between
blocks an instruction that defines a vreg, we chose to sink DBG_VALUE
instructions referring to that vreg too to avoid losing the variable
location. This un-necessarily risks re-ordering assignments, and in
some circumstances [2] you would have to examine all the instructions
in the function to work out whether sinking a DBG_VALUE would be
legal. In SimpleRegisterCoalescing, when we merge two vregs,
DBG_VALUEs can only refer to the surviving vreg -- and at the
DBG_VALUE's location that vreg might not contain the right def. There
may be other machine locations where the correct value is available
(it may even be rematerialized later), but searching for it is hard;
right now we just drop variable location information in these cases.

Makes sense so far.

A solution:

[To be clear, I haven't tried to implement this idea yet, as I wanted feedback first.]

I'd like to suggest that we can represent variable locations in the
codegen backend / MIR with three things:
* The instruction that defines the value of the variable,
* The operand of that instruction into which the value is written,
* The position in the instruction stream where the assignment of this
value to the variable occurs
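
To make the three components above concrete, here is a minimal sketch in
Python. The names InstrRef and VarAssign, and the idea of identifying an
instruction by a stable integer id, are assumptions made purely for
illustration; nothing here reflects an actual in-memory or printed-MIR
representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstrRef:
    """Hypothetical machine location: a defining instruction plus an operand."""
    instr_id: int   # identity of the defining instruction (assumed stable)
    operand: int    # index of the operand that the value is written into


@dataclass(frozen=True)
class VarAssign:
    """Hypothetical debug record: a variable takes a value at a position."""
    variable: str
    value: InstrRef
    position: int   # index in the instruction stream where the assignment occurs


# The value is named by what computed it, not by the register it sits in,
# so renaming or merging registers leaves this record untouched.
add_result = InstrRef(instr_id=7, operand=0)
assign = VarAssign(variable="x", value=add_result, position=12)
```

Because the record names no register, the defining instruction and the
assignment position can each move without the other needing an update.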

What about constants and memory locations?

That's effectively modifying a machine location from being a {v,}reg,
into being a "defining instruction" and operand. This is closer to the
LLVM-IR form of a machine location, where the SSA Value and its
computation are synonymous. Exactly how this is represented in-memory
and in-printed-MIR I haven't thought a lot about; probably by
attaching metadata to instructions and having DBG_VALUE use a metadata
operand rather than referring to a vreg. Specifying machine locations
like this would have the following benefits:
* Both DBG_VALUEs and defining instructions are independent and can
be moved within the function without loss of information, and without
needing to consider so much context,

What is the difference between attaching the DBG_VALUE to the instruction and moving the DBG_VALUE together with the preceding non-debug instruction?

What do you do with code like this:

int a = x;
int b = 23;
...
b = a;

mov rax, %x
DBG_VALUE rax, "a"
DBG_VALUE 23, "b"
...
DBG_VALUE rax, "b"

where the "defining instruction" is far away from the DBG_VALUE?

-- adrian

Basically, I think this is a great idea.

Debug values effectively represent the original program location of an assignment (or PHI, but let’s ignore that for now) and the assigned value. Transforms should focus on preserving information about the assigned value, not the original program location of the assignment. Transforms should already know how to calculate the assigned value in terms of other values, but they shouldn’t have to know anything about variable assignments, control dependence, etc.

Adrian’s point about memory locations is interesting. We have always had the problem that transforms like DSE can result in “wrong” variable values in the memory location. But, I think this is an older problem that can be handled separately.

Hi Vedant, thanks for the detailed response,

> Finally, being forced to always specify both the machine location and
> the program location at the same time (in a single DBG_VALUE)
> introduces un-necessary burdens. In MachineSink, when we sink between
> blocks an instruction that defines a vreg, we chose to sink DBG_VALUE
> instructions referring to that vreg too to avoid losing the variable
> location. This un-necessarily risks re-ordering assignments,

So under the proposed scheme, would the dbg value not be sunk? Ah, so then you’d get the dbg use before def scenario, which you argue has some nice properties below.

Indeed -- I'm aware of previous efforts to suppress dbg-use-before-def,
but I'm not aware of why it's not supported (it does /feel/ wrong at
the very least). Perhaps the reason was that it couldn't be described
post-regalloc.

(As an aside, this would move us in the direction of making it easier
to change LLVM-IR instructions, but harder to determine what a
variable location is, through the middle stages of the compiler).

> I'd like to suggest that we can represent variable locations in the
> codegen backend / MIR with three things:
> * The instruction that defines the value of the variable,
> * The operand of that instruction into which the value is written,
> * The position in the instruction stream where the assignment of this
> value to the variable occurs
>
> That's effectively modifying a machine location from being a {v,}reg,
> into being a "defining instruction" and operand. This is closer to the
> LLVM-IR form of a machine location, where the SSA Value and its
> computation are synonymous. Exactly how this is represented in-memory
> and in-printed-MIR I haven't thought a lot about; probably by
> attaching metadata to instructions and having DBG_VALUE use a metadata
> operand rather than referring to a vreg. Specifying machine locations
> like this would have the following benefits:
> * Both DBG_VALUEs and defining instructions are independent and can
> be moved within the function without loss of information, and without
> needing to consider so much context,
> * Likewise, vregs can be rewritten / merged / deleted without the
> need to update any debug metadata. Only instruction deletion /
> morphing would need some sort of change,

For deletion, that makes sense, we might consider introducing an MI level salvageDI method to help with that.

For replacement/morphing, wouldn’t MachineInstr::RAUW be able to update the new-style debug uses?

I guess it would, on the assumption that a machineinstr having its
operands changed doesn't actually change the value(s) that it
computes.

> The three instruction scheduling passes would become significantly
> easier to deal with: they would only have to replace DBG_VALUE
> instructions in the correct order, not worry about their operands.
> Various debug facilities in SimpleRegisterCoalescing, MachineSink, and
> large amounts of LiveDebugVariables would become redundant, as we
> wouldn't need to maintain a register location through optimisations.

I view this as a very large potential upside. LiveDebugVariables is incredibly complex and is a huge compile-time bear. Defining away large chunks of its job would be really nice.

Would this mean that an equivalence class in LiveDebugVariables would consist exclusively of UserValues referring to the same variable (as opposed to including UserValues that share a vreg as well)?

I believe so -- although I can't say I have a complete understanding
of LiveDebugVariables' equivalence classes. That aspect of an
equivalence class could be replaced with the identity of the defining
instruction instead, but, I think the interval-splitting that
LiveDebugVariables does wouldn't be necessary any more and so
computing equivalence wouldn't be necessary either.

> How then do we translate this new kind of machine location into
> DWARF/CodeView variable locations, which need to know which register
> to look in? The answer is: LiveDebugValues [3]. We already perform a
> dataflow analysis in LiveDebugValues of where values "go" after
> they're defined: we can use that to take "defining instructions" and
> determine register / stack locations. We would need to track values
> from the defining instruction up to the DBG_VALUE where that value
> becomes a variable location, after which it's no different from the
> LiveDebugValues analysis that we perform today. LiveDebugValues'
> ability to track values through stack spills and restores would become
> a critical feature (it isn't today), as we would no longer generate
> stack locations during register allocation.

Do you expect that handling for the current and new-style DBG_VALUEs could coexist in LiveDebugValues? Could that be done by e.g. introducing a new debug instr MI (DBG_INSTR_REF)?

I believe it should be relatively straightforward: we would have an
additional kind of location-record (i.e. VarLoc) that, instead of
identifying a _variable_, identified the instruction/operand that
computed a value. Handling a DBG_INSTR_REF as you suggest would mean
looking up such a VarLoc by instruction/operand to find its register,
then creating a register location VarLoc for the variable as we do
today. Propagating the location of an instruction/operand VarLoc would
happen in exactly the same way as variable locations today, it's just
following a value as it's moved around registers.

> Limitations
>
> The largest problem with this idea is that not all variable values are
> defined by instructions: PHIs are values that are defined by control
> flow. To deal with this pre-regalloc, we could move LiveDebugVariables
> to run before phi-elimination.

I don’t really follow this. Are you suggesting stripping out debug values before phi elim, and replacing them after virtregrewrite? How would the re-inserted debug instrs refer to values produced by phis?

> My understanding is that the register
> allocation phase of LLVM starts there and ends after virtregrewriter,
> and it'd be legitimate to say "we do special things for these passes".
> After regalloc however, there would need to be some way of specifying
> a block and a register, where entry to the block defines a variable
> value in that register. This isn't pretty; but IMO is the
> representation closest to the truth.

Oh, you answer this here. So LiveDebugVariables would need to figure out, for each debug-use-of-phi, 1) which block to put it in and 2) which register+variable gets defined. This is different enough from the new-style debug instr to potentially warrant its own instruction (DBG_PHI?).

Yes and no; I think 1) should be un-necessary, or at least not be
difficult in any way. The DBG_PHIs would refer to a block and
register, identifying the PHI's value at a single program location: the
start of the block where it is defined. This is (98% certain) easy to
identify in virtregrewriter with a location query, and avoids
considering live ranges and interval splitting. We wouldn't need to
try and know where the PHI value is later in the block or in other
blocks: the same LiveDebugValues mechanism as DBG_INSTR_REF would be
able to determine the PHI value's register location at the position of
a later DBG_PHI instruction.

(Again, I've put little thought into how that would be represented in
memory or MIR).

> I feel like this would be a better way of representing variable
> locations in the codegen backend; my fear is that this is a lot of
> work, and I don't know what appetite there is for change amongst other
> interested parties. Thus I'd be interested in any kind of feedback as
> to whether a) this is a good idea, b) whether this category of change
> is what people want, and c) whether this is seen as being achievable.

Imho this is a good idea and I’d like to see something like this happen. For it to “really happen” we (Apple) would probably need a way to transition to the new representation incrementally (e.g. to toggle a flag to get back the old representation). I’m not yet sure about what all that would really entail.

That's great to hear! When you say toggling a flag, is that a flag for
a whole compilation, or returning to the old representation at
specific passes? I don't think it'd be too difficult to have two
implementations in the codebase and pick one to use; switching between
the two mid-flight is a lot harder, of course.

> Being able to introduce this change incrementally presents some
> challenges: while the way of representing variable locations described
> above is more expressive than the current way, converting between one
> and the other requires running the LiveDebugValues analysis,

This sounds like you’re planning to change the existing DBG_VALUE instruction. Have you considered keeping it — unmodified — and introducing new debug instructions for the new semantics? The migration story for that seems a lot simpler.

I hadn't thought about keeping it; you're right, it'd be a lot easier
(and a lot more testable) to do it that way. That also lines up with
having a toggle flag to allow incremental transition.

If we can delete large parts of live debug variables this will be worth it imho. But perhaps part of the plan here should include splitting up live debug values into smaller files to simplify it?

Splitting up LiveDebugValues sounds like a good plan; entry value
development has added a certain amount of complexity to it too.

Thanks for the feedback!

Hi Adrian,

Makes sense so far.

Great to hear,

> A solution:
>
> [To be clear, I haven't tried to implement this idea yet as I wanted feedback,]
>
> I'd like to suggest that we can represent variable locations in the
> codegen backend / MIR with three things:
> * The instruction that defines the value of the variable,
> * The operand of that instruction into which the value is written,
> * The position in the instruction stream where the assignment of this
> value to the variable occurs

What about constants and memory locations?

Good question, and something I've not done too much thinking about. As
far as I'm aware, memory references today are all register based, with
the DIExpression expressing any memory operations. That should
translate to the proposed model naturally: memory locations would be
instruction references with a suitable DIExpression qualifying the
value.

Constants are trickier; it's probably easiest to keep DBG_VALUE
instructions to describe constants. This shouldn't be limiting at all,
as constant-valued locations aren't tied to specific program locations
in the same way register locations are.

> That's effectively modifying a machine location from being a {v,}reg,
> into being a "defining instruction" and operand. This is closer to the
> LLVM-IR form of a machine location, where the SSA Value and its
> computation are synonymous. Exactly how this is represented in-memory
> and in-printed-MIR I haven't thought a lot about; probably by
> attaching metadata to instructions and having DBG_VALUE use a metadata
> operand rather than referring to a vreg. Specifying machine locations
> like this would have the following benefits:
> * Both DBG_VALUEs and defining instructions are independent and can
> be moved within the function without loss of information, and without
> needing to consider so much context,

What is the difference between attaching the DBG_VALUE to the instruction and moving the DBG_VALUE together with the preceding non-debug instruction?

The latter will change the program location at which the variable
assignment takes place, while the former does not. Whenever DBG_VALUEs
are moved, we have to consider how the move changes the lifetime of
the variable location, but under the proposed model we wouldn't have
to do this at all: DBG_INSTR_REFs would never be forced to move.

I think the reproducer in PR44117 illustrates the generalised problem.
The computation of "floogie" is sunk from the entry block to the end
block, and we currently choose to sink the DBG_VALUE for "badgers" with
it -- but incorrectly. The end block was not dominated by either
assignment to "badgers" and the variable should be reported "optimised
out"; however, sinking a DBG_VALUE into that block alters variable
lifetimes by making (part of) it dominated by one assignment.
Identifying this problem when the movement happens means running a
dominance query of what instructions are dominated by which
DBG_VALUEs, which is expensive; it might even require dataflow
knowledge if loops are present (I'm unsure if this is true).

Referring to the instruction with a DBG_INSTR_REF is effectively
deferring this analysis until LiveDebugValues, and avoiding creating
additional artefacts (DBG_VALUE $noregs) along the way.

(I know "assignment" isn't agreed nomenclature, but for all intents
and purposes I believe that is how DBG_VALUEs are interpreted: a
variable location at an instruction is defined by the most recent
DBG_VALUE that dominates it).

What do you do with code like this:

int a = x;
int b = 23;
...
b = a;

mov rax, %x
DBG_VALUE rax, "a"
DBG_VALUE 23, "b"
...
DBG_VALUE rax, "b"

where the "defining instruction" is far away from the DBG_VALUE?

Essentially, leave the third DBG_VALUE unresolved and referring to the
'mov' until we reach LiveDebugValues; then follow the value the mov
writes to rax through any moves and spills (as LiveDebugValues does
today). If we can guarantee a location for the value produced by the
mov at the third DBG_VALUE, then we have a register location for that
DBG_VALUE. If we can't, it's interpreted as a DBG_VALUE $noreg,
because the value it's referring to has been optimised out at the
program location where the DBG_VALUE lies.
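
As a rough illustration of that answer, the sketch below walks a
straight-line instruction stream, follows a value through copies and
clobbers, and resolves each reference to a register or to "$noreg". It
is a toy model under stated assumptions, not LiveDebugValues: the tuple
encoding, the function names resolve and kill, and the rule of picking
the alphabetically first register are all invented for the example, and
there is no control flow or spilling.

```python
def kill(locs, reg):
    """Forget that any tracked value lives in `reg` (it is being overwritten)."""
    for regs in locs.values():
        regs.discard(reg)


def resolve(stream):
    locs = {}       # value id -> set of registers currently holding that value
    resolved = []   # (variable, register or "$noreg") for each reference seen
    for ins in stream:
        kind = ins[0]
        if kind == "def":                     # ("def", value_id, reg)
            _, vid, reg = ins
            kill(locs, reg)
            locs.setdefault(vid, set()).add(reg)
        elif kind == "copy":                  # ("copy", dst, src)
            _, dst, src = ins
            holders = [v for v, regs in locs.items() if src in regs]
            kill(locs, dst)
            for v in holders:
                locs[v].add(dst)
        elif kind == "clobber":               # ("clobber", reg)
            kill(locs, ins[1])
        elif kind == "ref":                   # ("ref", variable, value_id)
            _, var, vid = ins
            regs = locs.get(vid, set())
            resolved.append((var, min(regs) if regs else "$noreg"))
    return resolved


stream = [
    ("def", "mov", "rax"),      # mov rax, %x
    ("ref", "a", "mov"),        # a = x
    ("copy", "rbx", "rax"),     # the value is copied elsewhere
    ("clobber", "rax"),         # rax is reused for something unrelated
    ("ref", "b", "mov"),        # b = a, far from the defining mov
]
locations = resolve(stream)
# Same prefix, but the copy is clobbered too before "b" is assigned.
lost = resolve(stream[:4] + [("clobber", "rbx"), ("ref", "b", "mov")])
```

In the first stream "b" resolves to rbx because the value survives in the
copy; in the second no register holds the value any more, so the
reference degrades to "$noreg".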

Thanks for the feedback!

Hi Vedant, thanks for the detailed response,

Finally, being forced to always specify both the machine location and
the program location at the same time (in a single DBG_VALUE)
introduces un-necessary burdens. In MachineSink, when we sink between
blocks an instruction that defines a vreg, we chose to sink DBG_VALUE
instructions referring to that vreg too to avoid losing the variable
location. This un-necessarily risks re-ordering assignments,

So under the proposed scheme, would the dbg value not be sunk? Ah, so then you’d get the dbg use before def scenario, which you argue has some nice properties below.

Indeed -- I'm aware of previous efforts to suppress dbg-use-before-def,
but I'm not aware of why it's not supported (it does /feel/ wrong at
the very least). Perhaps the reason was that it couldn't be described
post-regalloc.

(As an aside, this would move us in the direction of making it easier
to change LLVM-IR instructions, but harder to determine what a
variable location is, through the middle stages of the compiler).

I'd like to suggest that we can represent variable locations in the
codegen backend / MIR with three things:
* The instruction that defines the value of the variable,
* The operand of that instruction into which the value is written,
* The position in the instruction stream where the assignment of this
value to the variable occurs

That's effectively modifying a machine location from being a {v,}reg,
into being a "defining instruction" and operand. This is closer to the
LLVM-IR form of a machine location, where the SSA Value and its
computation are synonymous. Exactly how this is represented in-memory
and in-printed-MIR I haven't thought a lot about; probably by
attaching metadata to instructions and having DBG_VALUE use a metadata
operand rather than referring to a vreg. Specifying machine locations
like this would have the following benefits:
* Both DBG_VALUEs and defining instructions are independent and can
be moved within the function without loss of information, and without
needing to consider so much context,
* Likewise, vregs can be rewritten / merged / deleted without the
need to update any debug metadata. Only instruction deletion /
morphing would need some sort of change,

For deletion, that makes sense, we might consider introducing an MI level salvageDI method to help with that.

For replacement/morphing, wouldn’t MachineInstr::RAUW be able to update the new-style debug uses?

I guess it would, on the assumption that a machineinstr having its
operands changed doesn't actually change the value(s) that it
computes.

The three instruction scheduling passes would become significantly
easier to deal with: they would only have to replace DBG_VALUE
instructions in the correct order, not worry about their operands.
Various debug facilities in SimpleRegisterCoalescing, MachineSink, and
large amounts of LiveDebugVariables would become redundant, as we
wouldn't need to maintain a register location through optimisations.

I view this as a very large potential upside. LiveDebugVariables is incredibly complex and is a huge compile-time bear. Defining away large chunks of its job would be really nice.

Would this mean that an equivalence class in LiveDebugVariables would consist exclusively of UserValues referring to the same variable (as opposed to including UserValues that share a vreg as well)?

I believe so -- although I can't say I have a complete understanding
of LiveDebugVariables' equivalence classes. That aspect of an
equivalence class could be replaced with the identity of the defining
instruction instead, but, I think the interval-splitting that
LiveDebugVariables does wouldn't be necessary any more and so
computing equivalence wouldn't be necessary either.

This sounds really great.

How then do we translate this new kind of machine location into
DWARF/CodeView variable locations, which need to know which register
to look in? The answer is: LiveDebugValues [3]. We already perform a
dataflow analysis in LiveDebugValues of where values "go" after
they're defined: we can use that to take "defining instructions" and
determine register / stack locations. We would need to track values
from the defining instruction up to the DBG_VALUE where that value
becomes a variable location, after which it's no different from the
LiveDebugValues analysis that we perform today. LiveDebugValues'
ability to track values through stack spills and restores would become
a critical feature (it isn't today), as we would no longer generate
stack locations during register allocation.

Do you expect that handling for the current and new-style DBG_VALUEs could coexist in LiveDebugValues? Could that be done by e.g. introducing a new debug instr MI (DBG_INSTR_REF)?

I believe it should be relatively straightforward: we would have an
additional kind of location-record (i.e. VarLoc) that, instead of
identifying a _variable_, identified the instruction/operand that
computed a value. Handling a DBG_INSTR_REF as you suggest would mean
looking up such a VarLoc by instruction/operand to find its register,
then creating a register location VarLoc for the variable as we do
today. Propagating the location of an instruction/operand VarLoc would
happen in exactly the same way as variable locations today, it's just
following a value as it's moved around registers.

I see, so the required addition to LiveDebugValues is basically about reconstituting VarLocs from new-style debug values? And otherwise, the basic range extension algorithm is unchanged?

Limitations

The largest problem with this idea is that not all variable values are
defined by instructions: PHIs are values that are defined by control
flow. To deal with this pre-regalloc, we could move LiveDebugVariables
to run before phi-elimination.

I don’t really follow this. Are you suggesting stripping out debug values before phi elim, and replacing them after virtregrewrite? How would the re-inserted debug instrs refer to values produced by phis?

My understanding is that the register
allocation phase of LLVM starts there and ends after virtregrewriter,
and it'd be legitimate to say "we do special things for these passes".
After regalloc however, there would need to be some way of specifying
a block and a register, where entry to the block defines a variable
value in that register. This isn't pretty; but IMO is the
representation closest to the truth.

Oh, you answer this here. So LiveDebugVariables would need to figure out, for each debug-use-of-phi, 1) which block to put it in and 2) which register+variable gets defined. This is different enough from the new-style debug instr to potentially warrant its own instruction (DBG_PHI?).

Yes and no; I think 1) should be un-necessary, or at least not be
difficult in any way. The DBG_PHIs would refer to a block and
register, identifying the PHI's value at a single program location: the
start of the block where it is defined. This is (98% certain) easy to
identify in virtregrewriter with a location query, and avoids
considering live ranges and interval splitting. We wouldn't need to
try and know where the PHI value is later in the block or in other
blocks: the same LiveDebugValues mechanism as DBG_INSTR_REF would be
able to determine the PHI value's register location at the position of
a later DBG_PHI instruction.

(Again, I've put little thought into how that would be represented in
memory or MIR).

I feel like this would be a better way of representing variable
locations in the codegen backend; my fear is that this is a lot of
work, and I don't know what appetite there is for change amongst other
interested parties. Thus I'd be interested in any kind of feedback as
to whether a) this is a good idea, b) whether this category of change
is what people want, and c) whether this is seen as being achievable.

Imho this is a good idea and I’d like to see something like this happen. For it to “really happen” we (Apple) would probably need a way to transition to the new representation incrementally (e.g. to toggle a flag to get back the old representation). I’m not yet sure about what all that would really entail.

That's great to hear! When you say toggling a flag, is that a flag for
a whole compilation, or returning to the old representation at
specific passes? I don't think it'd be too difficult to have two
implementations in the codebase and pick one to use; switching between
the two mid-flight is a lot harder, of course.

I'm thinking of a flag for the whole compilation, like -Xclang <use the new debug instructions>. I don't think we'd need to switch between representations mid-flight.

Being able to introduce this change incrementally presents some
challenges: while the way of representing variable locations described
above is more expressive than the current way, converting between one
and the other requires running the LiveDebugValues analysis,

This sounds like you’re planning to change the existing DBG_VALUE instruction. Have you considered keeping it — unmodified — and introducing new debug instructions for the new semantics? The migration story for that seems a lot simpler.

I hadn't thought about keeping it; you're right, it'd be a lot easier
(and a lot more testable) to do it that way. That also lines up with
having a toggle flag to allow incremental transition.

If we can delete large parts of live debug variables this will be worth it imho. But perhaps part of the plan here should include splitting up live debug values into smaller files to simplify it?

Splitting up LiveDebugValues sounds like a good plan; entry value
development has added a certain amount of complexity to it too.

Thanks for the feedback!

--
Thanks,
Jeremy

<stitching together a response to a sub-thread with Adrian>

What about constants and memory locations?

Good question, and something I've not done too much thinking about. As
far as I'm aware, memory references today are all register based, with
the DIExpression expressing any memory operations. That should
translate to the proposed model naturally: memory locations would be
instruction references with a suitable DIExpression qualifying the
value.

For stack slots, I guess we'd mostly still rely on the MachineFunction side-table.

We do sometimes refer to stack locations using a register-based description, like:

  DBG_VALUE $rsp, 0, ![[SVAR]], !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_plus_uconst, 1)

Under the proposed new model, we could change this to a DBG_INSTR_REF pointing to the instruction which creates the memory location. I think this is how it's done in IR. An alternative might be to have the DBG_INSTR_REF point to the instruction that completes the store of the variable into the memory location. That doesn't sound as good to me. It doesn't seem like it'd be simple to work out the location from the instruction (e.g. if we're storing into an offset).

Constants are trickier; it's probably easiest to keep DBG_VALUE
instructions to describe constants. This shouldn't be limiting at all,
as constant-valued locations aren't tied to specific program locations
in the same way register locations are.

Seems reasonable to me.

vedant

Hi Vedant,

>> Do you expect that handling for the current and new-style DBG_VALUEs could coexist in LiveDebugValues? Could that be done by e.g. introducing a new debug instr MI (DBG_INSTR_REF)?
>
> I believe it should be relatively straightforward: we would have an
> additional kind of location-record (i.e. VarLoc) that, instead of
> identifying a _variable_, identified the instruction/operand that
> computed a value. Handling a DBG_INSTR_REF as you suggest would mean
> looking up such a VarLoc by instruction/operand to find its register,
> then creating a register location VarLoc for the variable as we do
> today. Propagating the location of an instruction/operand VarLoc would
> happen in exactly the same way as variable locations today, it's just
> following a value as it's moved around registers.

I see, so the required addition to LiveDebugValues is basically about reconstituting VarLocs from new-style debug values? And otherwise, the basic range extension algorithm is unchanged?

Correct -- the dataflow analysis shouldn't require any kind of
algorithmic change, instead we're altering the position at which
values get tracked to be specified by a defining instruction, rather
than a DBG_VALUE. We're also adding two different flavours of
location:
* Values that are not yet variable locations
* Variable locations that don't yet have their value defined (use-before-def)
However, I don't believe these two new flavours alter the algorithm,
they're just making different use of the information it produces. (96%
certain).
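
A small sketch of how those two flavours might be bookkept, written in
Python for brevity. Everything here is hypothetical: the function name
locate, the tuple encoding of the stream, and the choice to report
"$noreg" until the defining instruction is seen are assumptions for
illustration. Register clobbering and control flow are deliberately
ignored.

```python
def locate(stream):
    value_reg = {}   # value id -> register; a value that is not yet a variable location
    pending = {}     # value id -> variables waiting on it (use-before-def)
    events = []      # (position, variable, register or "$noreg")
    for pos, ins in enumerate(stream):
        if ins[0] == "def":                   # ("def", value_id, reg)
            _, vid, reg = ins
            value_reg[vid] = reg
            # Any variable that referred to this value early gets its location now.
            for var in pending.pop(vid, []):
                events.append((pos, var, reg))
        elif ins[0] == "ref":                 # ("ref", variable, value_id)
            _, var, vid = ins
            for waiting in pending.values():  # a newer assignment cancels an older wait
                if var in waiting:
                    waiting.remove(var)
            if vid in value_reg:
                events.append((pos, var, value_reg[vid]))
            else:
                pending.setdefault(vid, []).append(var)
                events.append((pos, var, "$noreg"))
    return events


events = locate([
    ("def", "gep", "rdi"),     # value exists, but no variable refers to it yet
    ("ref", "this", "add"),    # variable assigned a value that is not yet computed
    ("def", "add", "rsi"),     # the defining instruction finally executes
    ("ref", "p", "gep"),       # ordinary case: the value is already tracked
])
```

The tracking loop is the same in both cases; only the moment at which a
tracked value is turned into a variable location differs.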

<stitching together a response to a sub-thread with Adrian>

>> What about constants and memory locations?
>
> Good question, and something I've not done too much thinking about. As
> far as I'm aware, memory references today are all register based, with
> the DIExpression expressing any memory operations. That should
> translate to the proposed model naturally: memory locations would be
> instruction references with a suitable DIExpression qualifying the
> value.

For stack slots, I guess we'd mostly still rely on the MachineFunction side-table.

We do sometimes refer to stack locations using a register-based description, like:

  DBG_VALUE $rsp, 0, ![[SVAR]], !DIExpression(DW_OP_constu, 8, DW_OP_minus, DW_OP_plus_uconst, 1)

Under the proposed new model, we could change this to a DBG_INSTR_REF pointing to the instruction which creates the memory location. I think this is how it's done in IR. An alternative might be to have the DBG_INSTR_REF point to the instruction that completes the store of the variable into the memory location. That doesn't sound as good to me. It doesn't seem like it'd be simple to work out the location from the instruction (e.g. if we're storing into an offset).

Hmmmm, this raises some interesting extra questions -- for example, I
don't believe there are instructions creating memory locations until
the PrologEpilog pass runs. An easy solution would be just to retain
DBG_VALUEs referring to frame indexes, but that replicates some of the
flaws in referring to registers (multiple defs), although not to the
same extent. Offsets from the stack pointer aren't determined until
PrologEpilog too. Slightly more design work is probably needed here.

Hi llvm-dev@,

Time for an update on this work. The prototype I threw together seems
to be working well: I'm observing the sort of changes in variable
locations that I was expecting (more below). There's also new
location-maintenance required for several post-regalloc passes, which
adds new complexity to calculating variable locations, but I'm
confident that it'll be worth it.

Firstly, to illustrate the benefits, I've attached two IR files from
functions in clang-3.4. I haven't reduced them as the point is that
they're "real world" code. For sample1.ll, an output-ptr-argument in
%agg.result is GEP'd, the result of which is the operand of a
dbg.value and a store:

  entry:
    [...]
    %2 = getelementptr %agg.result, i64 0, i32 2 [...]
    [...]
    %.cast.i.i = bitcast %union.anon* %2 to i8*
    call void @llvm.dbg.value(i8* %.cast.i.i, ...)
    store i8 0, i8* %.cast.i.i, align 8
    [...]

Instruction scheduling during SelectionDAG chooses to place the
'store' immediately after the GEP. The vreg for the store address is
then dead at the point where the DBG_VALUE uses it, and the location
is dropped by LiveDebugVariables, missing about five instructions of
coverage where the value is available in a register. This is hard to
fix in today's representation, because register allocation doesn't
guarantee anything about positions where a register is dead (AFAIUI).
The instruction referencing work successfully tracks which instruction
computes the GEP's Value, and picks a register location when
LiveDebugValues runs.

sample2.ll has this "cleanup" block, where %Current.146 is GEPed twice:

  cleanup:
    %Current.146 = phi %"class.llvm::Use"* [blah]
    [...]
    %incdec.ptr10 = getelementptr %Current.146, i64 1
    call void @llvm.dbg.value(%incdec.ptr10, [...]
    [...]
    %Val = getelementptr %Current.146, i64 1, i32 2
    %4 = load i64, i64* %Val
    %5 = trunc i64 %4 to i32, !dbg !1558
    [...]

This doesn't suffer from liveness problems, but jumping to immediately
after PHI elimination, we get (on amd64):

  %9:gr64 = COPY killed %34:gr64
  %10:gr64 = nuw ADD64ri8 %9:gr64(tied-def 0), 24
  DBG_VALUE %10:gr64, $noreg, !"this",
  %26:gr32 = MOV32rm killed %9:gr64, 1, $noreg, 40

The ADD64ri8 is the first GEP in the IR above, the MOV32rm is the
second GEP and load folded together. While rewriting the tied-def
instruction, the two-address-instruction pass sinks the ADD64ri8 to
reduce register pressure, producing this:

  %9:gr64 = COPY killed %34:gr64
  DBG_VALUE %10:gr64, $noreg, !"this",
  %26:gr32 = MOV32rm %9:gr64, 1, $noreg, 40
  %10:gr64 = nuw ADD64ri8 killed %9:gr64(tied-def 0), 24

Where the DBG_VALUE refers to %10 before it's defined. During register
coalescing, %9 and %10 are merged, after which the DBG_VALUE refers to
the result of the COPY, which has the wrong value. The instruction
referencing work drops this wrong location. It could instead be
recovered as a debug use-before-def, however I haven't completely
implemented that yet.
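
As an illustration of the hazard, here is a small Python sketch that flags debug uses appearing before the def of their operand in a straight-line block. It is a toy under stated assumptions: instructions are modelled as hypothetical `(opcode, defs, uses)` tuples, `find_debug_use_before_def` is an invented helper, and nothing here corresponds to an actual LLVM pass or API.

```python
def find_debug_use_before_def(block, live_in=()):
    """Return (index, vreg) for each DBG_VALUE operand not yet defined in the block."""
    defined = set(live_in)
    problems = []
    for idx, (opcode, defs, uses) in enumerate(block):
        if opcode == "DBG_VALUE":
            problems.extend((idx, reg) for reg in uses if reg not in defined)
        defined.update(defs)
    return problems


# The sequence immediately after PHI elimination: the ADD precedes the DBG_VALUE.
before_sinking = [
    ("COPY", ["%9"], ["%34"]),
    ("ADD64ri8", ["%10"], ["%9"]),
    ("DBG_VALUE", [], ["%10"]),
    ("MOV32rm", ["%26"], ["%9"]),
]

# The sequence after the two-address pass sinks the ADD below the load.
after_sinking = [
    ("COPY", ["%9"], ["%34"]),
    ("DBG_VALUE", [], ["%10"]),
    ("MOV32rm", ["%26"], ["%9"]),
    ("ADD64ri8", ["%10"], ["%9"]),
]

print(find_debug_use_before_def(before_sinking, live_in={"%34"}))
print(find_debug_use_before_def(after_sinking, live_in={"%34"}))
```

Once %9 and %10 are coalesced, a check keyed on register names like this one can no longer see the problem, which is the motivation for keying the debug reference on the defining instruction instead.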

That's the good news, that this functionally appears to work. Some
random remarks:
* Variable coverage (as per llvm-locstats) is currently slightly
down, likely just due to bugs I've written in,
* LiveDebugVariables does appear to work without equivalence classes,
in this use-case it only needs to track a register at one SlotIndex,
* I haven't looked at compile times (yet),
* I don't currently have a feeling as to whether variable coverage
will end up better or worse: bad locations being correctly dropped
might out-number newly preserved locations.

The bad news is the increased analysis required. It turns out
basic-block placement often uses tail duplication; and, if a
duplicated block used to contain a PHI location, this destroys the
SSA-like form I was relying on as there's no single-definition point.
Happily the SSAUpdater utility can easily patch this up, but it could
be expensive, and it's uncomfortable to use so late in compilation.
More on this some other time.

The modifications to LiveDebugValues I mentioned work too; but it has
become a reaching-definition analysis rather than a simpler dataflow
one. This is because it needed to be able to identify PHI locations,
but also because this whole idea hinges on being able to track values
at the end of codegen, and coverage was not sufficient without
tracking _all_ the locations a value may be in. Independently of the
instruction referencing work, the stronger LiveDebugValues'
performance matches current performance when stage2-building clang.
I'll write about this in a different email, later.
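
A rough Python sketch of why tracking every location of a value matters at control-flow joins. This is my own simplified model, not the LiveDebugValues implementation: `join_locations` is a hypothetical helper, values are plain integers, and the join rule (a location survives only if every predecessor agrees on it) is an assumption made for illustration.

```python
def join_locations(pred_outs):
    """Join per-predecessor maps of value -> set of locations holding it.

    A value keeps a location at block entry only if that location holds the
    value at the end of every predecessor.
    """
    if not pred_outs:
        return {}
    common_values = set(pred_outs[0])
    for out in pred_outs[1:]:
        common_values &= set(out)
    joined = {}
    for value in common_values:
        locs = set.intersection(*(set(out[value]) for out in pred_outs))
        if locs:
            joined[value] = locs
    return joined


# Value 1 is in both $rax and $rbx on one path, but only in $rbx on the other.
all_locations = [{1: {"$rax", "$rbx"}}, {1: {"$rbx"}, 2: {"$rcx"}}]
print(join_locations(all_locations))

# If each path had remembered only one location, the paths could disagree.
single_location = [{1: {"$rax"}}, {1: {"$rbx"}}]
print(join_locations(single_location))
```

In the first call the value survives the join in `$rbx`; in the second, where each path tracked a single location, the join finds nothing in common and the value is lost.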

In conclusion: it's looking like this will work. The first change that
could land would be the LiveDebugValues modifications, which have some
independent benefits. I'll write about that in a separate email.

sample1.ll.gz (23.7 KB)

sample2.ll.gz (22.2 KB)