DWARFExpression and DW_OP_addrx

Sorry for the delay in coming back to this, I had to make a small detour and go study how the whole debug map implementation works :slight_smile:

I’ve been debating the two proposed solutions:

  • Preemptively patching the debug_addr section when we load the oso file inside SymbolFileDWARFDebugMap.
  • On-the-fly patching the addresses returned by DWARFUnit::ReadAddressFromDebugAddrSection using some callback installed inside DWARFUnit.

However, both of those create an inconsistent story for users of DWARFExpression::GetLocation_DW_OP_addr. This function returns an address, without telling callers whether that address came from an OP_addr or an OP_addrx. If we preemptively fix OP_addrx results without doing the same for OP_addr results, then callers can’t know whether the address they got was an executable address or an object file address. In other words, they can’t know whether they need to call LinkOSOFileAddress on those results or not.

There is an even bigger problem here. Inside SymbolFileDWARF::ParseVariableDIE, there is a third use case for the GetLocation_DW_OP_addr / UpdateLocation_DW_OP_addr pair of functions. In particular, for external variables, we have this check:

if (is_external && location_DW_OP_addr ==0)

The idea being that “address 0 is used for external variables, so we need to do some magic”.
Here, we query the executable’s symbol table to figure out the real address of the variable based on its name, and then we call UpdateLocation_DW_OP_addr. If such a location came from an OP_addrx, we would be doing the wrong thing (the Update function is a no-op in this case, and we definitely don’t want to call LinkOSOFileAddress for these). This is not a hypothetical, it is what we will see if we compile lldb/test/API/lang/c/global_variables/a.c with DWARF 5.

With all that in mind, I am taken back to that very first idea I had in this thread, where the update function rewrites OP_addrx into OP_addr (⚙ D153891 [lldb] Consider OP_addrx in DWARFExpression::Update_DW_OP_addr). This seems to give a consistent story to the GetLocation / UpdateLocation functions.

Can you think of any other alternatives?