DEBUG INFO: improve handling of DBG_VALUEs and DebugLocs in CodeGen

Hi, (Resent with proper subject line)

I recently found myself in trouble because the crash I had disappeared
with -g, so I could not debug the program. This happened because the
optimizer did not remember to consider DBG_VALUEs instruction so it
changed its behavior, and the bug went hiding.

I then started discussing this onhttps://reviews.llvm.org/D45878, and
since this is something that should be handled by all different machine
passes, the immediate question is now what utility functions should be
made available for common use.

1. A pass such as MachineSink.cpp must first of all remember to consider
DBG_VALUEs when iterating over instructions, so that it does not e.g.
"break if next instruction is not X" and therefore change its results
with -g.

2. At a second priority,

  a) it should handle DBG_VALUEs when moving / erasing MachineInstrs.

b) it should " // Merge or erase debug location to ensure
consistent stepping in profilers and debuggers." (MachineSink.cpp).

Well, this is in fact the first point here: I think this should be
documented somewhere by a comment so that it is clear exactly how these
things should be handled. My personal understanding right now is that 2
is not vital but should be done on a best-effort basis. 1 however is
really bad if it happens (which it currently does).

We are discussing where to place these utility functions, and if
splice() should optionally be made to do this also. It seems that
collectDebugValues() in MachineSink.cpp is a good starting point, but we
probably want to do an even better search than just looking at the first
def operand.

It would be nice to get some general feedback from the community at this
point so we know which direction to take...

/Jonas

Hi, (Resent with proper subject line)

I recently found myself in trouble because the crash I had disappeared
with -g, so I could not debug the program. This happened because the
optimizer did not remember to consider DBG_VALUEs instruction so it
changed its behavior, and the bug went hiding.

[Here are some (very high-level) thoughts.]

As you suspect, that's clearly a bug. LLVM's ideal behavior is that the presence of debug informations shall not affect the contents of the .text section. As you also noticed, we're not quite there yet.

I then started discussing this on ⚙ D45878 [DEBUG INFO] Fixing cases where debug info (-g) causes changes in the program., and
since this is something that should be handled by all different machine
passes, the immediate question is now what utility functions should be
made available for common use.

1. A pass such as MachineSink.cpp must first of all remember to consider
DBG_VALUEs when iterating over instructions, so that it does not e.g.
"break if next instruction is not X" and therefore change its results
with -g.

We started a similar discussion at the LLVM IR level very recently. As a result of this we added new iterators to that make it easy to skip of debug intrinsics when doing analysis passes...

2. At a second priority,

a) it should handle DBG_VALUEs when moving / erasing MachineInstrs.

... but, skipping them during analysis is not enough, they also must move when the value they are describing moves.

b) it should " // Merge or erase debug location to ensure
consistent stepping in profilers and debuggers." (MachineSink.cpp).

Well, this is in fact the first point here: I think this should be
documented somewhere by a comment so that it is clear exactly how these
things should be handled. My personal understanding right now is that 2
is not vital but should be done on a best-effort basis. 1 however is
really bad if it happens (which it currently does).

Loosing DBG_VALUEs is bad because local variables will become unavailable in the debugger. This is completely unacceptable when compiling unoptimized code. In optimized code the expectations are somewhat lower, but there is a lot of low-hanging fruit left to pick where LLVM clearly could do a better job of preserving DBG_VALUEs.

Incorrect debug locations are also unacceptable in unoptimized code, since they will cause the debugger's single-stepping to behave unpredictably, thus destroying the debugging experience. In optimized code, the expectations are a bit lower, but *incorrect* debug locations are terrible since they will cause misleading crash traces and otherwise undebuggable code. Due to the nature of what an optimizing compiler does, debug information is not always salvageable, but the correct thing to do in this case is to remove the debug information (e.g., by inserting line: 0 locations) instead of providing incorrect or partially correct information.

We are discussing where to place these utility functions, and if
splice() should optionally be made to do this also. It seems that
collectDebugValues() in MachineSink.cpp is a good starting point, but we
probably want to do an even better search than just looking at the first
def operand.

It would be nice to get some general feedback from the community at this
point so we know which direction to take...

Thank you for starting this discussion! We should do what we can to make easy for pass authors to do the right thing. We should also do what we can to force pass authors to have to thing about debug information and how to update it correctly, even if that is annoying :slight_smile:

Would creating a MIR version of the debugify IR pass be helpful in finding more of the existing bugs?

-- adrian

Hi, (Resent with proper subject line)

I recently found myself in trouble because the crash I had disappeared
with -g, so I could not debug the program. This happened because the
optimizer did not remember to consider DBG_VALUEs instruction so it
changed its behavior, and the bug went hiding.

[Here are some (very high-level) thoughts.]

As you suspect, that’s clearly a bug. LLVM’s ideal behavior is that the presence of debug informations shall not affect the contents of the .text section. As you also noticed, we’re not quite there yet.

I then started discussing this on https://reviews.llvm.org/D45878, and
since this is something that should be handled by all different machine
passes, the immediate question is now what utility functions should be
made available for common use.

  1. A pass such as MachineSink.cpp must first of all remember to consider
    DBG_VALUEs when iterating over instructions, so that it does not e.g.
    “break if next instruction is not X” and therefore change its results
    with -g.

We started a similar discussion at the LLVM IR level very recently. As a result of this we added new iterators to that make it easy to skip of debug intrinsics when doing analysis passes…

  1. At a second priority,

a) it should handle DBG_VALUEs when moving / erasing MachineInstrs.

… but, skipping them during analysis is not enough, they also must move when the value they are describing moves.

A starting point for diagnosing issues which arise when MI’s aren’t moved when needed might be a use-before-def verifier at the MIR level (related: https://reviews.llvm.org/D46100).

b) it should " // Merge or erase debug location to ensure
consistent stepping in profilers and debuggers." (MachineSink.cpp).

Well, this is in fact the first point here: I think this should be
documented somewhere by a comment so that it is clear exactly how these
things should be handled. My personal understanding right now is that 2
is not vital but should be done on a best-effort basis. 1 however is
really bad if it happens (which it currently does).

Loosing DBG_VALUEs is bad because local variables will become unavailable in the debugger. This is completely unacceptable when compiling unoptimized code. In optimized code the expectations are somewhat lower, but there is a lot of low-hanging fruit left to pick where LLVM clearly could do a better job of preserving DBG_VALUEs.

Incorrect debug locations are also unacceptable in unoptimized code, since they will cause the debugger’s single-stepping to behave unpredictably, thus destroying the debugging experience. In optimized code, the expectations are a bit lower, but incorrect debug locations are terrible since they will cause misleading crash traces and otherwise undebuggable code. Due to the nature of what an optimizing compiler does, debug information is not always salvageable, but the correct thing to do in this case is to remove the debug information (e.g., by inserting line: 0 locations) instead of providing incorrect or partially correct information.

We are discussing where to place these utility functions, and if
splice() should optionally be made to do this also.

This sounds like a good idea to me.

It seems that
collectDebugValues() in MachineSink.cpp is a good starting point, but we
probably want to do an even better search than just looking at the first
def operand.

thanks,
vedant

Thanks for the replies!

What is the next step now? Refactor out some method(s) (perhaps from MachineSink) into utility functions so that they can be used elsewhere?

Who wants to write a comment somewhere to summarize these points?

/Jonas