Clarify HostAddress/LoadAddress logic

I’m trying to debug a small C++ program that uses STL containers; let’s take a list as an example:


#include <list>
#include <iostream>

using namespace std;

template<typename T>
void print(T& container) {
    for (const auto& v: container) {
        cout << v << ' ';
    }
    cout << endl;
}

void simple_list() {
    list<int> l{2048};
    print(l);
}

int main() {
    simple_list();
    return 0;
}

I’m trying to change the list value (from 2048 to any other number) through VSCode (lldb-mi) and through Python scripting in lldb. In both cases the change is reverted after each "next line" command (please take a look at the attached gif). Reproduction steps:

  1. Set breakpoint
  2. Stop at this breakpoint, change the value of the list element
  3. VSCode shows updated value
  4. Execute next line command in lldb
  5. VSCode shows original value of the list element

I started digging into what happens and found that there are different ValueTypes for synthetic values in lldb ( https://github.com/llvm/llvm-project/blob/llvmorg-16.0.6/lldb/include/lldb/Core/Value.h#L41 ). In my case the problem was HostAddress vs. LoadAddress: a comment there states that HostAddress is for values created in the process that uses liblldb (lldb / lldb-mi) and LoadAddress is for values in the process being debugged (as I understand it; please correct me if I’m wrong).
When the program starts, the list’s values are marked as LoadAddress, but when I change them in VSCode their type becomes HostAddress.
This happens in ListFrontEnd::GetChildAtIndex ( https://github.com/llvm/llvm-project/blob/llvmorg-16.0.6/lldb/source/Plugins/Language/CPlusPlus/LibCxxList.cpp#L392 ): when the value is changed, liblldb creates a new object in its own memory with type HostAddress.
After each step, lldb-mi sends a delete-all-variables request to the debug server ( https://github.com/lldb-tools/lldb-mi/blob/main/src/MICmdCmdVar.cpp#L669 ) and then requests the variables again from the debugged process’s frame ( https://github.com/lldb-tools/lldb-mi/blob/main/src/MICmdCmdStack.cpp#L850 ).
Therefore all HostAddress values are removed and all changes are lost. I haven’t dug into lldb itself yet, but I suppose something similar happens there.
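
To make the symptom concrete, here is a minimal toy model of the behavior described above. It does not use lldb at all: TargetMemory, HostCopyChild, LoadAddressChild and value_after_step are hypothetical names invented for illustration, and the "step" is simply modelled as re-reading the value from the simulated debuggee memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy stand-in for the debuggee's memory (hypothetical, not an lldb type).
struct TargetMemory {
    std::vector<std::uint8_t> bytes;

    int read_int(std::size_t addr) const {
        assert(addr + sizeof(int) <= bytes.size());
        int v;
        std::memcpy(&v, bytes.data() + addr, sizeof(int));
        return v;
    }

    void write_int(std::size_t addr, int v) {
        assert(addr + sizeof(int) <= bytes.size());
        std::memcpy(bytes.data() + addr, &v, sizeof(int));
    }
};

// Child whose storage lives in the debugger's own memory ("host" copy).
struct HostCopyChild {
    int cached;
    void set(int v) { cached = v; }  // only the host-side copy changes
};

// Child that refers to a location in the debuggee ("load address").
struct LoadAddressChild {
    TargetMemory* target;
    std::size_t addr;
    void set(int v) { target->write_int(addr, v); }  // writes through
};

// Models "stop, edit the element, step, re-read from the frame".
// Returns the value the front end would display after the step.
int value_after_step(bool write_through, int new_value) {
    TargetMemory mem{std::vector<std::uint8_t>(sizeof(int))};
    mem.write_int(0, 2048);  // the list element in the debuggee

    if (write_through) {
        LoadAddressChild child{&mem, 0};
        child.set(new_value);
    } else {
        HostCopyChild child{mem.read_int(0)};
        child.set(new_value);
    }

    // After the step the old variable objects are discarded and the
    // value is fetched from the debuggee again.
    return mem.read_int(0);
}
```

In this model value_after_step(false, 10) yields the original 2048, which matches what I see in VSCode, while the write-through variant keeps the new value.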

So the question is: are liblldb frontends (lldb, lldb-mi, etc.) expected to manage HostAddress values themselves and preserve them between steps? Or should this be handled somewhere inside lldb? At this point it is unclear how such values should be handled. If someone can provide helpful information on this, I would really appreciate it (maybe there are some good resources about how lldb works internally).

Note: I took a list only as an example; the same issue occurs for all STL containers except map/set (they share the same synthetic frontend), which were fixed recently: https://reviews.llvm.org/D140624. A similar fix is applicable to most other containers, but I want to clarify how it was originally supposed to work; maybe it should properly be done in some other way.


I would expect the updated value to be written to the target, and not stored on the host. After the step, I’d expect the new value to be displayed.

@jingham this feels a lot like the vector register update issue we talked about a little while ago, where updating the SBValue containing a vector register (and children containing the individual bytes) didn’t update the live value.

First note that if I augment your example:

void simple_list() {
  list<int> l{2048};
  int raw_arr[10] = {2048};
  print(l);
  print(raw_arr);
}

and then do the same steps to raw_arr[0], changing its value to 10, you will see that change IS preserved in the target over the step.

The problem here is that the ValueObjects representing the list [0], [1], etc. elements aren’t ValueObjects that are directly backed by variables or variable structure elements in the program. They are “synthetic children” which we cook up to give a useful presentation of the list. The synthetic child contract doesn’t require a direct relationship between synthetic children and the underlying program entities they represent. Their job is just to produce a human-readable presentation of the underlying type. So there isn’t a generic way lldb can know how a change to a synthetic child should affect the data structure it’s actually representing; there needs to be some cooperation from the synthetic child provider.

This is a long-standing issue which has been waiting for someone to clean it up. We need to add to the synthetic child interface a “change value” which would go back to the synthetic child provider and ask it if it knows how to change the underlying value that the synthetic child represents. There’s no requirement that the provider has to know how to do that; maybe the type it is representing is calculated in a way that makes modification non-obvious. However, if the synthetic child provider doesn’t know how to do the update, then the ValueObject should mark itself as “non-editable”, and attempts to change the value should produce a suitable error. In many cases, particularly all these STL collection classes, the underlying data is in memory in a straightforward way, just somewhere that’s annoying to drill down to. In cases like that the provider could figure out how to translate a modification of the synthetic child into a modification of the underlying data structure, and then the change will be made real.
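
As a rough sketch of that contract (not lldb’s actual interface; SyntheticProvider, set_child_value, ListProvider, SumProvider and Node are all hypothetical names), a provider could expose an optional "set" hook that defaults to "not editable" and is overridden only when the provider knows where the underlying data lives:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical provider interface; illustrative only, not lldb's API.
class SyntheticProvider {
public:
    virtual ~SyntheticProvider() = default;
    virtual std::size_t num_children() const = 0;
    virtual int child_value(std::size_t idx) const = 0;

    // Returns an error message on failure, std::nullopt on success.
    // Default: the provider cannot map an edit back to the real data.
    virtual std::optional<std::string> set_child_value(std::size_t, int) {
        return std::string("synthetic child is not editable");
    }
};

// Simulated debuggee memory: a singly linked list stored in a vector.
struct Node {
    int value;
    int next;  // index of the next node, -1 marks the end
};

// The data is in memory in a straightforward way, so edits can be
// translated into writes to the underlying node.
class ListProvider : public SyntheticProvider {
public:
    ListProvider(std::vector<Node>& memory, int head)
        : memory_(memory), head_(head) {}

    std::size_t num_children() const override {
        std::size_t n = 0;
        for (int i = head_; i != -1; i = memory_[i].next) ++n;
        return n;
    }

    int child_value(std::size_t idx) const override {
        const int i = node_at(idx);
        assert(i != -1);
        return memory_[i].value;
    }

    std::optional<std::string> set_child_value(std::size_t idx, int v) override {
        const int i = node_at(idx);
        if (i == -1) return std::string("index out of range");
        memory_[i].value = v;  // write through to the "debuggee"
        return std::nullopt;
    }

private:
    // Walks the list; returns -1 if idx is past the end.
    int node_at(std::size_t idx) const {
        int i = head_;
        for (std::size_t k = 0; k < idx && i != -1; ++k) i = memory_[i].next;
        return i;
    }

    std::vector<Node>& memory_;
    int head_;
};

// A computed child (sum of two fields) has no obvious way to be edited,
// so this provider keeps the default "not editable" behavior.
class SumProvider : public SyntheticProvider {
public:
    SumProvider(int a, int b) : a_(a), b_(b) {}
    std::size_t num_children() const override { return 1; }
    int child_value(std::size_t) const override { return a_ + b_; }

private:
    int a_;
    int b_;
};
```

With this shape, an attempt to edit a child of SumProvider returns an error instead of silently changing a host-side copy, while ListProvider makes the change real in the underlying nodes.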

Jim

On Jul 20, 2023, at 6:57 AM, Pavel Kosov (kpdev42) via LLVM Discussion Forums wrote:

[quoted original post, identical to the text at the top of this thread]

@jingham Thank you for the detailed explanation. I wonder if you could also clarify the following two questions:

We need to add to the synthetic child interface a “change value”

The thing is that there is currently a debugger interface, SetValueFromCString (https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/API/SBValue.h#L112),
which calls the underlying ValueObject’s SetValueFromCString (https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/Core/ValueObject.h#L444).
In my understanding, the idea of this interface is to update the value in the debuggee process. Is that correct?

STL collection classes … provider could figure out …

Yes, the current implementations of the STL containers are able to change the underlying values using SetValueFromCString, but for some reason they do it in liblldb’s memory, not in the debuggee process’s memory.
In the patch I mentioned earlier (for map/set: D140624, “[LLDB] Fixes summary formatter for libc++ map allowing modification of contained value”) we update the value directly in the debuggee process, so the updated value persists.
Is this patch done properly with respect to the overall lldb architecture? :slight_smile:
Could we update all STL containers in the same manner as this patch does, or was the initial idea to somehow handle values in liblldb’s memory?
(This question is not about a general implementation for all possible synthetic providers; it is about the STL container synthetic providers shipped with lldb.)

I’m not suggesting adding another high-level SBValue API for setting values. What I’m suggesting is that SetValueFromCString for a ValueObject with a synthetic child provider should ask the synthetic child provider directly how to change the value. So in addition to the APIs described in:

https://lldb.llvm.org/use/variable.html#id10

we should have something like a set_value(value_str, error) API that SetValueFromCString would route through.

I have to think through the patch you pointed to in order to see if that’s correct. The SBValues always have to keep a copy of their data in host memory, because we want to be able to support SBValue::GetValueDidChange, and we can only do that by comparing the host-side copy with what you find in the target when you stop again. If your patch doesn’t disrupt that behavior, then it should be fine.
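
A minimal sketch of why that host-side copy is needed, with no claim that lldb implements it this way: CachedValue and did_change_since_last_check are hypothetical names, and a plain int stands in for the value in the target.

```cpp
#include <cassert>

// Hypothetical model of a value object that remembers what the target
// held at the previous stop so it can report changes at the next one.
class CachedValue {
public:
    explicit CachedValue(const int* target_location)
        : location_(target_location), cached_(0) {
        assert(target_location != nullptr);
        cached_ = *target_location;  // snapshot taken at the first stop
    }

    // Compares the host-side copy with what is in the target now,
    // then refreshes the copy for the next comparison.
    bool did_change_since_last_check() {
        const int current = *location_;
        const bool changed = (current != cached_);
        cached_ = current;
        return changed;
    }

    int cached() const { return cached_; }

private:
    const int* location_;  // where the value lives in the "target"
    int cached_;           // host-side copy
};
```

Without the cached copy there would be nothing to compare against at the next stop, so a write-through fix still has to leave this snapshot mechanism working.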

There’s no requirement that the synthetic child provider be sufficiently complex that it needs a non-trivial translation between the representation in lldb and the underlying data. But to be fully general (and maybe just to be easier to read) you might need an explicit way to do this, which is what I was suggesting.

Ok, thank you very much :slightly_smiling_face: