Concatenating DWARF location expressions

Hello,

Could someone explain to me what the rules are for concatenation of DWARF location expressions?

For example, in lib/CodeGen/PrologEpilogInserter.cpp there is a call to DIExpression::prepend to concatenate the stack slot address of a variable to an existing expression. The problem here is that the former is a ‘Memory location description’ while the latter could very well be an ‘Implicit location description’ (e.g. the sign extension from lib/Transforms/Utils/Local.cpp), in which case a DW_OP_deref of some sort would be needed in between.

To me it seems that there is no clear way of knowing what to do when concatenating because there is no clear way of knowing the kind of the existing expression (is it Memory, Register or Implicit).

Without having thought too much about it, it seems that additional meta ops in the expression, whose sole purpose is to identify what kind it is, would be helpful when concatenating.

I could find similar topics being raised on the list in the past but was not able to find any clear conclusion.

Thanks!

-Markus

Back when I added the prepend functionality, I did so under the assumption that, since the expressions are a stack-based language in postfix notation, we can always prepend new operands without having to pay attention to what comes later in the expression. Back then, LLVM didn't yet know about DW_OP_stack_value.

We don't distinguish DWARF location kinds in LLVM IR because it is not known where an LLVM SSA value will end up. Unfortunately, we don't stick to that rule: we do emit DW_OP_stack_value in various places, which definitely turns those expressions into implicit location descriptions. We still don't and can't distinguish between memory and register locations.

Since it sounds like the problem is only with implicit descriptions, would a rule such as "if the expression has a DW_OP_stack_value, add an extra DW_OP_deref" work for the PrologEpilogInserter, or do we need something more principled?
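To make the proposed rule concrete, here is a minimal standalone sketch. It does not use the LLVM API: `ExprOps` and `prependSlotAddress` are hypothetical names, the base register is assumed to be supplied separately (so only the offset is prepended), the offset is assumed to be non-negative, and the caller is assumed to already know whether the existing expression is implicit.

```cpp
#include <cstdint>
#include <vector>

// Opcode values as assigned by the DWARF standard.
constexpr uint64_t DW_OP_deref = 0x06;
constexpr uint64_t DW_OP_constu = 0x10;
constexpr uint64_t DW_OP_plus = 0x22;
constexpr uint64_t DW_OP_plus_uconst = 0x23;
constexpr uint64_t DW_OP_stack_value = 0x9f;

// Flat list of opcodes, each followed by its operands (an assumption that
// loosely mirrors how expression elements are stored).
using ExprOps = std::vector<uint64_t>;

// Hypothetical helper, not an LLVM function. Prepends the stack slot offset
// to an existing expression.
ExprOps prependSlotAddress(const ExprOps &Existing, uint64_t Offset,
                           bool ExistingIsImplicit) {
  // Compute the slot address relative to the (separately supplied) base.
  ExprOps Result = {DW_OP_plus_uconst, Offset};
  // An implicit description computes on the variable's value, so the value
  // has to be loaded from the slot before the old operations run.
  if (ExistingIsImplicit)
    Result.push_back(DW_OP_deref);
  Result.insert(Result.end(), Existing.begin(), Existing.end());
  return Result;
}
```

For an existing expression `DW_OP_constu 1, DW_OP_plus, DW_OP_stack_value` and offset 16, this yields `DW_OP_plus_uconst 16, DW_OP_deref, DW_OP_constu 1, DW_OP_plus, DW_OP_stack_value`. An expression without DW_OP_stack_value just gets the offset prepended.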

-- adrian

From: aprantl@apple.com <aprantl@apple.com>

> Since it sounds like the problem is only with implicit descriptions, would a rule such as "if the expression has a DW_OP_stack_value, add an extra DW_OP_deref" work for the PrologEpilogInserter, or do we need something more principled?

Right, that could very well be sufficient for this particular case, so I will experiment with that. In general, though, and for the future, it seems that it would be beneficial to have additional metadata in the expressions indicating whether they describe e.g. a memory address or a value. This would perhaps allow us to implement a simple debug expression verifier.

Thanks!
-Markus

From: aprantl@apple.com <aprantl@apple.com>

>> Since it sounds like the problem is only with implicit descriptions, would a rule such as "if the expression has a DW_OP_stack_value, add an extra DW_OP_deref" work for the PrologEpilogInserter, or do we need something more principled?
>
> Right, that could very well be sufficient for this particular case, so I will experiment with that. In general, though, and for the future, it seems that it would be beneficial to have additional metadata in the expressions indicating whether they describe e.g. a memory address or a value.

That is more a property of what the expression is bound to (via an llvm.dbg.* intrinsic or a DIGlobalVariableExpression) than of the DIExpression itself. The only thing we can say by inspecting the DIExpression alone is whether it would have to be lowered into an implicit location description (because of the DW_OP_stack_value). We could easily add a `bool isImplicit()` member to DIExpression that returns true if a DW_OP_stack_value is present.
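As a rough illustration of such a member, here is a standalone sketch on a simplified stand-in type. `SimpleExpr` and `numOperands` are hypothetical and only know a handful of opcodes; a real implementation would need the full operand table. The point it shows is that the scan has to step over operands, because an operand can happen to equal 0x9f.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values as assigned by the DWARF standard.
constexpr uint64_t DW_OP_constu = 0x10;
constexpr uint64_t DW_OP_plus_uconst = 0x23;
constexpr uint64_t DW_OP_deref_size = 0x94;
constexpr uint64_t DW_OP_stack_value = 0x9f;

// Simplified stand-in for DIExpression: a flat list of opcodes, each
// followed by its operands.
struct SimpleExpr {
  std::vector<uint64_t> Elements;

  // Operand count for the few opcodes this sketch knows about; every other
  // opcode is assumed to take no operands.
  static std::size_t numOperands(uint64_t Op) {
    switch (Op) {
    case DW_OP_constu:
    case DW_OP_plus_uconst:
    case DW_OP_deref_size:
      return 1;
    default:
      return 0;
    }
  }

  // True if DW_OP_stack_value occurs as an operator (not as an operand).
  bool isImplicit() const {
    for (std::size_t I = 0; I < Elements.size();
         I += 1 + numOperands(Elements[I]))
      if (Elements[I] == DW_OP_stack_value)
        return true;
    return false;
  }
};
```

With this, `DW_OP_constu 0x9f` alone is not reported as implicit, while `DW_OP_constu 1, DW_OP_stack_value` is.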

Do you think it makes sense to distinguish between memory and register addresses at the IR metadata level already?

> This would perhaps allow us to implement a simple debug expression verifier.

What kind of properties would you like to verify?

-- adrian

> Do you think it makes sense to distinguish between memory and register addresses at the IR metadata level already?

I think it would make sense to be a bit stricter with types in debug expressions in general. If, for example, an expression dereferences something, we should clearly keep track of the type of the address and the type of the value behind that address. I am sure that most of the time this is not really necessary and the type can be picked up by looking at the intrinsic the expression is bound to, but I can also imagine cases where that would not be sufficient.

> What kind of properties would you like to verify?

E.g. that one doesn't concatenate expressions with incompatible types, or that one does not insert a DW_OP_deref(_size) of the wrong size.
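A minimal sketch of what such a check could look like, purely as an illustration: `verifyExpr` and `Verdict` are hypothetical names, only a few opcodes are handled, and fragment or piece suffixes after DW_OP_stack_value are ignored. It rejects a DW_OP_deref_size whose operand is zero or larger than the target address size, an operator whose operand is missing, and a DW_OP_stack_value that is not the last operator. Type compatibility between concatenated expressions is not modelled here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values as assigned by the DWARF standard.
constexpr uint64_t DW_OP_constu = 0x10;
constexpr uint64_t DW_OP_plus_uconst = 0x23;
constexpr uint64_t DW_OP_deref_size = 0x94;
constexpr uint64_t DW_OP_stack_value = 0x9f;

enum class Verdict { Ok, TruncatedOperand, BadDerefSize, StackValueNotLast };

// Hypothetical checker over a flat opcode/operand list. AddressSize is the
// target's address size in bytes.
Verdict verifyExpr(const std::vector<uint64_t> &Ops, uint64_t AddressSize) {
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    switch (Ops[I]) {
    case DW_OP_constu:
    case DW_OP_plus_uconst:
      if (I + 1 >= Ops.size())
        return Verdict::TruncatedOperand;
      ++I; // Skip the operand.
      break;
    case DW_OP_deref_size:
      if (I + 1 >= Ops.size())
        return Verdict::TruncatedOperand;
      // The load must be non-empty and no wider than an address.
      if (Ops[I + 1] == 0 || Ops[I + 1] > AddressSize)
        return Verdict::BadDerefSize;
      ++I; // Skip the operand.
      break;
    case DW_OP_stack_value:
      if (I + 1 != Ops.size())
        return Verdict::StackValueNotLast;
      break;
    default:
      break; // Unknown opcodes are assumed to take no operands.
    }
  }
  return Verdict::Ok;
}
```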

My exposure to the debug framework is quite limited, so I don’t really know for sure; these are mostly opinions based on my gut feeling :blush:

-Markus