DWARFExpression and DW_OP_addrx

I’ve been working on improving LLDB’s support for DWARF 5, and I’m hitting an
issue with the new debug_addr section. In particular, it seems like the
DWARFExpression class 1 can’t handle the extra level of indirection provided
by the new op DW_OP_addrx.

Let’s consider a variable with a single location in DWARF 4:

DW_AT_location [DW_FORM_exprloc]    (DW_OP_addr <some_address>)

When LLDB needs to work on this variable, it creates a DWARFExpression on top
of a blob of data containing DW_OP_addr <some_address>. It may need to
perform two types of operations on top of this:

  • Read the address (DWARFExpression::GetLocation_DW_OP_addr) 2
  • Update the address (DWARFExpression::Update_DW_OP_addr(new_address)) 3

To update the address, it just replaces some_address with new_address in
that blob of data. This is all fairly straightforward.

Now consider what happens with the newly introduced DW_OP_addrx, which is an
index into an address table. For example:

DW_AT_location [DW_FORM_exprloc]    (DW_OP_addrx <some_index>)
...

debug_addr:
  0: addr0
  ...
  <some_index>: <some_address>
  ...

The DWARFExpression class knows how to read these addresses correctly, but
it does not know how to update them (see 3). Intuitively, this makes sense: its
blob of data only contains an index into some other read-only table. As it
stands, LLDB can’t work with these variables if updating an address is required.

ELF files seem to dodge this issue because when LLDB reads the debug_address
section, the addresses there are already “correct” (relocated, etc); as such,
the “update address” method is never called. This is not the case for MachO object
files in general, we do call the update address method (they also dodge the issue
when using dSYM bundles, as dsymutil currently rewrites DW_OP_addrx into DW_OP_addr).

Assuming you are on platform using MachO, you can repro the problem with:

echo 'auto myvar = 42; int main(){}' | clang -gdwarf-5 -x c++ - -c -o obj.o
clang obj.o -o main.out
lldb --batch -o "b main" -o "run" -o "v myvar" main.out

I see two alternatives to fix the issue, both involving changes to
Update_DW_OP_addr:

  1. Make the Update_DW_OP_addr method rewrite its blob of data so that it
    also rewrites the DW_OP_addrx opcode into DW_OP_addr. This would
    effectively change the location of the variable to be different from what is in
    the debug_info section, but it doesn’t seem to be a problem as far as I can
    tell. I have a prototype here 4.
    This could be expensive, since it involves copying buffers. However, note that: 1) We already do this for DW_OP_addr, 2) Variables that have an DW_OP_addr{x} are unlikely to have multiple locations, so these buffers are hardly ever bigger than
    9 bytes.
  2. Make DWARFExpression objects carry a map of index_in_debug_addr →
    real_address, which gets updated/read as needed by Update_DW_OP_addr /
    GetLocation_DW_OP_addr. I have a prototype here 5. This works but seems a
    bit overkill.

I would appreciate any thoughts / suggestions on this!

@clayborg I believe you worked on this part of LLDB back in the day? I would really appreciate your input here!

So we are seeing DW_OP_addrx is in DWARF5 without using split DWARF? Any reason we would need to used indexed addresses without having split DWARF? Does this reduce the number of relocations and thus it is becoming part of DWARF5 output? Seems like a waste of space if the address indexing isn’t needed.

I had another fix that wasn’t need in previous edits of this reply! Looks like DWARFUnit already has what we need. This fix can be applied to DWARFExpression::Evaluate(…):

case DW_OP_addrx: {
  if (!dwarf_cu) {
    if (error_ptr)
      error_ptr->SetErrorString("DW_OP_addrx requires a valid DWARFUnit object.");
    return false;
  }
  uint64_t addrx = opcodes.GetULEB128(&offset);
  lldb::addr_t addr = dwarf_cu->ReadAddressFromDebugAddrSection(addrx);
  if (addr == LLDB_INVALID_ADDR) {
    if (error_ptr)
      error_ptr->SetErrorString("DW_OP_addrx was unable to resolve the address index.");
    return false;
  }
  stack.push_back(Scalar(addr));
  stack.back().SetValueType(Value::ValueType::LoadAddress);
  }
  break;

Yes, indexed addressing reduces relocations for debug info - but can also just reduce size in general because you can reuse addresses. (eg: a simple CU with two functions can have 2 addresses in the address pool (start of function 1, start of function 2 - both can be referenced from the CU debug_rnglist when it’s describing the code ranges for the CU) instead of 4 (one each for the function low_pcs as before, but then 2 more for the addresses in the rnglist for the CU))

3 Likes

Thanks for the input!

This fix can be applied to DWARFExpression::Evaluate(…):

If I understand correctly, the suggested fix proposes changing the existing code in DWARFExpression::Evaluate to unconditionally set the ValueType of the address to be LoadAddress? Is the intent here that a LoadAddress needs to go through the same mapping done SymbolFileDWARFDebugMap::LinkOSOFileAddress before being accessed?

I gave this idea a try, but this does not seem to have any effects on the Evaluate method for the example in the original post. I’ll look a bit closer to see what’s going on.

To elaborate on my previous point, SymbolFileDWARF::ParseVariableDIE calls SymbolFileDWARFDebugMap::LinkOSOFileAddress and attempts to update the address inside the expression:

   3534           const lldb::addr_t exe_file_addr =
   3535               debug_map_symfile->LinkOSOFileAddress(this, location_DW_OP_addr);
   3536           if (exe_file_addr != LLDB_INVALID_ADDRESS) {
   3537             // Update the file address for this variable
   3538             DWARFExpression *location =
   3539                 location_list.GetMutableExpressionAtAddress();
   3540             location->Update_DW_OP_addr(die.GetCU(), exe_file_addr);

(and the call to Update_DW_OP_addr does nothing today)

Sorry that should be set to “Value::ValueType::FileAddress”. The intent is that a FileAddress needs to go through the same mapping done SymbolFileDWARFDebugMap::LinkOSOFileAddress before being accessed as the address in the .debug_addr section will be a virtual address from the file, and the binary may be slid at runtime.

Update_DW_OP_addr is only called to modify a DWARF location expression that is incorrect in the byte encoding. Here the byte encoding of the addrx is ok, so it doesn’t need to be fixed in Update_DW_OP_addr.

If each compile unit is relying on its own .o file that contains DWARF with a .debug_addr section (no dSYM file), then we will need to have each OSO file run through all addresses in .debug_addr section and modify them in place. This is because each .debug_addr section in each .o file will have unlinked addresses. We will need to run through and call SymbolFileDWARFDebugMap::LinkOSOFileAddress for each address in the .debug_addr section of each OSO file. Then when we access the addresses by index from the .debug_addr section, they will be correct. And if SymbolFileDWARFDebugMap::LinkOSOFileAddress returns LLDB_INVALID_ADDR, we still fixup the address in .debug_addr as the function might have been dead stripped.

1 Like

Sorry that should be set to “Value::ValueType::FileAddress”. The intent is that a FileAddress

All of this makes sense to me, however note that today we are already setting this to FileAddress. Apologies if I’m missing something here, but this is the current code of DWARFExpression::Evaluate:

    case DW_OP_addrx:
    case DW_OP_GNU_addr_index: {
      if (!dwarf_cu) {
        if (error_ptr)
          error_ptr->SetErrorString("DW_OP_GNU_addr_index found without a "
                                    "compile unit being specified");
        return false;
      }
      uint64_t index = opcodes.GetULEB128(&offset);
      lldb::addr_t value = dwarf_cu->ReadAddressFromDebugAddrSection(index);
      stack.push_back(Scalar(value));
      if (target &&
          target->GetArchitecture().GetCore() == ArchSpec::eCore_wasm32) {
        stack.back().SetValueType(Value::ValueType::LoadAddress);
      } else {
        stack.back().SetValueType(Value::ValueType::FileAddress); // <<<< Here!
      }
    } break;

In fact, the call to DWARFExpression::Evaluate succeeds. I followed the code paths a bit more to figure out where we go wrong; ValueObjectVariable::UpdateValue calls DWARFExpression::Evaluate and then it does:

   220          if (value_type == Value::ValueType::FileAddress && process_is_alive)
-> 221            m_value.ConvertToLoadAddress(GetModule().get(), target);

Which boils down to a call to Address::ResolveAddressUsingFileSections:

   248  bool Address::ResolveAddressUsingFileSections(addr_t file_addr,
   249                                                const SectionList *section_list) {
   250    if (section_list) {
   251      SectionSP section_sp(
   252          section_list->FindSectionContainingFileAddress(file_addr));
-> 253      m_section_wp = section_sp;

(lldb) p section_sp->GetName()
(lldb_private::ConstString) $2 = (m_string = "__PAGEZERO")
(lldb) v file_addr
(lldb::addr_t) file_addr = 8

The file_addr = 8 is correct, this is the location of the variable according to the
debug_addr section (in the original example of this thread, some_address == 8).
However, the section returned by ResolveAddressUsingFileSections is obviously
wrong (“pagezero”).

Yes, it is grabbing an address from a .debug_addr that has not been linked. When a OSO file is loaded, we need to run a link phase on every address of the .debug_addr section and run it through SymbolFileDWARFDebugMap::LinkOSOFileAddress. So this should be done when we first load the OSO file in SymbolFileDWARFDebugMap. We load the .debug_addr section as read only currently, so we will need to make a copy of the .debug_addr data so we can modify it, then we will need to link all addresses in the .debug_addr copy, then we will need find a way for the DWARFUnit to use this newly fixed up data, maybe using an accessor like DWARFUnit::SetLinkedAddressData(…) and the DWARFUnit will cache the fixed up addresses and use this data if it has been specified when we call DWARFUnit::ReadAddressFromDebugAddrSection.

Another possibly more efficient way to do this would be to add an accessor to DWARFUnit that can specify a callback to use to link any addresses obtained from .debug_addr. Only the SymbolFileDWARFDebugMap would use this. If there is a re-link address callback installed in each DWARFUnit, then when we extract an address from DWARFUnit::ReadAddressFromDebugAddrSection(…) it will call to remap the address on the fly.

No one is using addrx with unlinked .debug_addr except for DWARF in .o files currently. As the .debug_addr sections are all copied into the main executable for fission, and then those addresses are relinked and each compile unit has an attribute that points to the offset within the .debug_addr section from the main executable. This won’t happen on DWARF in .o files on mac since we rely only on the .o files.

Thank you! You’ve given me pointers of where to look at next, so I’ll give those ideas a try!

1 Like

Sorry for the delay in coming back to this, I had to make a small detour and go study how the whole debug map implementation works :slight_smile:

I’ve been debating the two proposed solutions:

  • Preemptively patching the debug_addr section when we load the oso file inside SymbolFileDWARFDebugMap.
  • On-the-fly patching the addresses returned by DWARFUnit::ReadAddressFromDebugAddrSection using some callback installed inside DWARFUnit.

However, both of those create an inconsistent story for users of DWARFExpression::GetLocation_DW_OP_addr. This function returns an address, without telling callers whether that address came from an OP_addr or an OP_addrx. If we preemptively fix OP_addrx results without doing the same for OP_addr results, then callers can’t know whether the address they got was an executable address or an object file address. In other words, they can’t know whether they need to call LinkOSOFileAddress on those results or not.

There is an even bigger problem here. Inside SymbolFileDWARF::ParseVariableDIE, there is a third use case for the GetLocation_DW_OP_addr / UpdateLocation_DW_OP_addr pair of functions. In particular, for external variables, we have this check:

if (is_external && location_DW_OP_addr ==0)

The idea being that “address 0 is used for external variables, so we need to do some magic”.
Here, we query the executable’s symbol table to figure out the real address of the variable based on its name, and then we call UpdateLocation_DW_OP_addr. If such a location came from an OP_addrx, we would be doing the wrong thing (the Update function is a no-op in this case, and we definitely don’t want to call LinkOSOFileAddress for these). This is not a hypothetical, it is what we will see if we compile lldb/test/API/lang/c/global_variables/a.c with DWARF 5.

With all that in mind, I am taken back to that very first idea I had in this thread, where the update function rewrites OP_addrx into OP_addr (⚙ D153891 [lldb] Consider OP_addrx in DWARFExpression::Update_DW_OP_addr). This seems to give a consistent story to the GetLocation / UpdateLocation functions.

Can you think of any other alternatives?

Hi @clayborg, just wanted to ping you here in case you had any other thoughts on this!

DW_OP_addrx shouldn’t ever return 0 for global variable addresses, so this shouldn’t be a problem. You can repro this zero address case with the following code in one file:

int global = 0;

And this in another file:

extern int global;

This will cause the zero address for a global in the second file with the “extern” so you can test.

It would be fine to rewrite the location description to not contain DW_OP_addrx and only contain DW_OP_addr with the value of the address from the .debug_addr section, then we always need to link the OSO addresses consistently. So that does sound like a good plan after your study of the current implementation.

How does that create a zero address? That creates a variable whose value is 0, not its address.

I will polish that patch and put it up for review!

This will cause the zero address for a global in the second file with the “extern” so you can test.

How does that create a zero address? That creates a variable whose value is 0, not its address.

Given the solution proposed, this is no longer an issue. But it’s an interesting tangent to discuss!
I also agree with @jrtc27; in my testing, I could not find a way to create any such variables with address zero. extern variables are simply not emitted in the debug info section.

FWIW the patch that introduced that code in LLDB is 2fc93eabf7e132abd51d0ea0ad599beb3fa44334. This is really old stuff, but looking into the original bug report the only way to get multiple variables with address zero is to:

  1. Use clang version 9 or before
  2. Use uninitialized global variables, like int x;

I got builds of (upstream) clang 9 and clang 11. With 9, we use address zero. With 11, we actually give non-zero addresses to those variables. So something changed between those two versions in how code is generated for those variables.

There is a good chance that that piece (*) of LLDB is no longer used and could be deleted (assuming we don’t care too much about code generated by very old versions of clang). But a lot more investigation would be needed.

(*) By that piece, I mean if (is_external && location_DW_OP_addr ==0). I actually refactored it into a helper function to simplify the code a bit: ⚙ D155100 [lldb][NFC] Factor out code linking OSO addr of uninitialized GVs

Small test NFC refactor to simplify the main patch: ⚙ D155197 [lldb][NFC] Refactor test to enable subsequent reuse

The main patch: ⚙ D155198 [lldb] Consider OP_addrx in DWARFExpression::Update_DW_OP_addr

Probably you can do it with a linker script or -Wl,–defsym=foo=0.

We can also force this with -fcommon. This seems disabled by default.

Common symbols are the way to make this happen, yes. If you look at the debug map for the N_GSYM it will be zero in the file that uses “extern int global;”, and dsymutil and the debugger know to find the non STAB symbol table entry with the same name to find the real address.