RFC: Introduce DW_OP_LLVM_memory to describe variables in memory with dbg.value

Hi Reid,

Thanks for taking this on; I’m very pleased to see improvements related to debug info for optimized code. (You can cc me on code reviews, although I’m sure a lot of the patches will be in areas I am not very familiar with.)

While I have a really good handle on the DWARF standard, and have done a bunch of work with the type stuff, my understanding of IR mechanics is pretty naïve, so I’d appreciate any explanations that help me understand why the following might be really lame.

In optimized code, for things like the address-taken case, does the alloca survive? Assuming it does, can we attach the DIVariable metadata to the alloca instead of having a separate dbg.declare? (It has always seemed to me that this would make some things a lot simpler, as you don’t have to troll around looking for that other instruction, use-lists aren’t special cased for debug info instructions, and probably other things.)

If a memory-homed variable retains its alloca and the alloca retains its metadata, then it seems like it should be straightforward to produce that memory address as the default location for the variable.

And if we’re in the habit of looking at metadata on normal instructions for DIVariables instead of having dbg.value instructions, then maybe we don’t need dbg.value either.

Thanks,

–paulr

> While I have a really good handle on the DWARF standard, and have done a
> bunch of work with the type stuff, my understanding of IR mechanics is
> pretty naïve, so I'd appreciate any explanations that help me understand
> why the following might be really lame.
>
> In optimized code, for things like the address-taken case, does the alloca
> survive? Assuming it does, can we attach the DIVariable metadata to the
> alloca instead of having a separate dbg.declare? (It has always seemed to
> me that this would make some things a lot simpler, as you don't have to
> troll around looking for that other instruction, use-lists aren't special
> cased for debug info instructions, and probably other things.)
>
> If a memory-homed variable retains its alloca and the alloca retains its
> metadata, then it seems like it should be straightforward to produce that
> memory address as the default location for the variable.

I think if I were redesigning LLVM, I would go even further and merge
DILocalVariable and alloca. =) Functions should really just have
"variables" that live in memory and can be accessed from any basic block.
After SSA promotion, if a variable has no uses that require it to live in
memory, we simply wouldn't allocate space for it. I definitely don't plan
to do that, though.
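
To make the "has to live in memory" distinction concrete, here is a minimal C sketch. The function names (bump, promotable, memory_homed) are made up for illustration, and what a given optimizer actually does depends on the pass pipeline and on inlining, so read the comments as expectations rather than guarantees.

```c
/* Hypothetical helper: writes through a pointer, so the caller's variable
   needs an address (unless the call is inlined away). */
void bump(int *p) { *p += 1; }

/* 'sum' is never address-taken, so SSA promotion can keep it in a value
   and no stack slot needs to be allocated for it. */
int promotable(int a, int b) {
    int sum = a + b;
    return sum * 2;
}

/* The address of 'x' is passed to bump(), so 'x' is expected to stay
   memory-homed: its alloca survives optimization. */
int memory_homed(int a) {
    int x = a;
    bump(&x);
    return x;
}
```

In the first function the debugger has to be told which value holds sum at each point; in the second, the stack slot itself is a reasonable default location for x.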

In today's LLVM, dbg.declare does serve one useful purpose: it marks the
point of declaration of the variable. We can use it to power
DW_AT_start_scope, so that users won't see uninitialized variables that are
in scope in this example:
  int x = f();
  // break here, 'info locals' prints a garbage y because it is in scope
  int y = f();

If we don't think we'll ever do DW_AT_start_scope, then yes, we could
probably use variable attachments instead of dbg.declare. But I think we
want to go the other direction and standardize on dbg.value.

> And if we're in the habit of looking at metadata on normal instructions
> for DIVariables instead of having dbg.value instructions, then maybe we
> don't need dbg.value either.

We definitely need something like dbg.value. For variables that can be
fully promoted to SSA values, we need dbg.value to record in the debug info
that a source-level assignment occurred at this particular program point.
mem2reg completely erases the assigning instruction, so we need some kind
of placeholder. For variables that cannot be fully promoted, passes like
DSE should make an effort to record in the debug info that an assignment
occurred even if the store was deleted.
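
As a rough illustration of both cases, consider the C sketch below. The names (observe, fully_promoted, dead_store) are hypothetical, and the comments describe what the passes are allowed to do, not what any particular build is guaranteed to do.

```c
/* Hypothetical helper: reads through the pointer, which forces the
   caller's variable to have an address. */
int observe(const int *p) { return *p; }

/* Fully promotable: after mem2reg neither store to 'v' exists any more.
   A dbg.value at each assignment is the only remaining record that the
   source assigned to 'v' at that program point. */
int fully_promoted(int a) {
    int v = a + 1;   /* assignment 1: the store is erased */
    v = v * 2;       /* assignment 2: the store is erased */
    return v;
}

/* Not fully promotable: 'x' is address-taken, so it stays in memory.
   The first store is dead and DSE may delete it; ideally the debug info
   would still note that 'x = 0' happened at this point. */
int dead_store(int a) {
    int x;
    x = 0;           /* dead store: overwritten before any read */
    x = a;
    return observe(&x);
}
```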

It's that concept of a "program point" that I don't think we can replace
with instruction metadata attachments. Today's LLVM instructions move
around too much to represent that.
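
A small C sketch of the kind of motion meant here, assuming an optimizer that hoists loop-invariant code (the function name is made up, and whether hoisting actually happens depends on the pipeline):

```c
/* The multiply is loop-invariant, so it may be hoisted above the loop.
   The source-level assignment to 't' still happens on every iteration,
   inside the loop body. Metadata attached to the multiply would move
   with it; a marker left at the assignment's program point would not. */
int hoist_candidate(int a, int b, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        int t = a * b;   /* computed once if hoisted, assigned each iteration */
        sum += t + i;
    }
    return sum;
}
```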

Thanks for reading!

Hi Reid,

Good point about dbg.value being a marker for the source-level semantic of an assignment, even when the value is computed at some distinctly different point in the execution and the store is erased. My previous compiler was not SSA-based and we had something to hang the debug info on, even if no actual store occurred. In LLVM IR you do need this sort of “artificial use” marker.

Thanks!

–paulr