SB C++ API: Determine state of host-side module added after attaching to remote process

deku_scrubbed · April 10, 2024, 6:44pm

I have a question regarding adding host-side modules when attaching to already running applications. Some details:

The module on the target is already loaded. But, this would apply to the main module as well, so, the main module is assumed from here on.
LLDB 18 series.
Debug Client is all implemented through the C++ SB API. Built unchanged for Windows.
SBPlatform created as remote-linux
- Remote server is custom, but, generally emulates Linux enough for this platform.
The steps are as follows:
- SBDebugger::Create
- → Platform Select + Connect
- → SBDebugger::CreateTarget(nullptr)
- → SBPlatform::Attach(attach_info)
- → SBTarget::AddModule("path/to/elf", triple, nullptr, "path/to/elf")

At this point everything largely works, and I want to scan through the sections and segments of the ELF to see the load address of the remote module itself (again, this is the main ELF in this case). If I wait before looking up the code section (currently just sleeping for 800-1200 milliseconds), I get the expected load address. However, if I dump the module sections immediately after adding the module, I get UINT64_MAX (LLDB_INVALID_ADDRESS) for the load address of the code section.

My main question is, what is the prescribed method for determining whether or not the module is loaded on the host in this scenario? SBTarget::AddModule doesn’t block, and I don’t see any Target Events coming through, which I suppose only happen when the remote target modules are loaded/unloaded? I could see possibly trying to busy wait on SBModule::GetNumCompileUnits() > 0 or something along those lines? Though, this isn’t necessarily indicative that the Host<->Remote communication to determine the Load Address has occurred.
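
As a stopgap for the fixed sleep described above, the wait could be expressed as a bounded poll. This is only a sketch under assumptions: `PollLoadAddress` and `kInvalidAddress` are hypothetical names, the reader callback stands in for something like `section.GetLoadAddress(target)`, and it needs C++17 for `std::optional`. It does not answer the question of which event signals readiness.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <optional>
#include <thread>

// Same value as LLDB_INVALID_ADDRESS (UINT64_MAX); redefined here so the
// sketch builds without the LLDB headers.
constexpr std::uint64_t kInvalidAddress = UINT64_MAX;

// Calls read_addr() every `interval` until it returns something other than
// kInvalidAddress, or until `timeout` has elapsed. Returns std::nullopt on
// timeout so the caller can report the failure instead of spinning forever.
std::optional<std::uint64_t> PollLoadAddress(
    const std::function<std::uint64_t()>& read_addr,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds interval) {
  assert(interval.count() > 0 && "interval must be positive");
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  for (;;) {
    const std::uint64_t addr = read_addr();
    if (addr != kInvalidAddress) return addr;
    if (std::chrono::steady_clock::now() >= deadline) return std::nullopt;
    std::this_thread::sleep_for(interval);
  }
}
```

With the SB API, the callback would typically be a lambda capturing the `SBSection` and `SBTarget`. The deadline keeps a module whose addresses never resolve from hanging the client.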

Thanks so much for any help!

Note: I posted this to Discord as well, apologies if asking in both places is wrong.

labath · April 12, 2024, 8:01am

Are you running the debugger in synchronous or asynchronous mode (async is the default)?

What you’re describing sounds like the behavior in async mode, where the Attach function returns immediately, but the debugger actually completes the attach at some later time. To use the async mode, you need to set up a listener and then wait for the state change event (to eStateStopped) before you can start fiddling with the process. You should be able to find some examples of using lldb listeners online or in the source tree. Let us know if you run into problems using that.

If this is just a simple script/experiment, then it might be easier to switch the debugger to synchronous mode (debugger.SetAsync(false)), but note that the async mode is recommended for more complicated use cases.

Also, it shouldn’t be necessary to add the module to the target yourself. LLDB should be able to figure out which modules are loaded on its own. If it can’t do that, then you may need to change the target.exec-search-paths setting (before attaching).

deku_scrubbed · April 12, 2024, 5:30pm

Thanks @labath,

You are correct, I am running in async mode. I am currently using the SBDebugger instance’s default listener to handle specific events (I’m also “dumping” all events I receive on it). Originally I was passing my own listener in with the SBAttachInfo, but switched to the SBDebugger’s to see if it was getting something mine wasn’t.

I’m currently waiting on eStateStopped in the Process StateChanged event to determine I’ve attached, but, you could still be right, and I’ll confirm whether or not the process state is actually what I expect before. Also, apologies, I should have noted the event handling in my original post.

As a quick test, I did try sleeping the main thread after the call to attach, and I can see the process events coming through on my event thread. After that, adding the module and enumerating the sections results in the same behaviour. I did test spinning on the load address until it returned something besides LLDB_INVALID_ADDRESS, and that did work, though there is still no outward indication that the addresses were resolved by the SBModule instance after adding.

If this is just a simple script/experiment, then it might be easier to switch the debugger to synchronous mode (debugger.SetAsync(false)), but note that the async mode is recommended for more complicated use cases.

This was my understanding as well. I’m working in two codebases at the moment. One is a simple testbed where I experiment with the SB API and look for new APIs I might want to add – for instance, I’d added the SBPlatform::Attach APIs at one point, but, removed them when I updated to LLVM 18.1 where a similar set had been added. The main “project” is trying to see if I can build a debugger without needing the SBCommandInterpreter, otherwise filling the functionality with new SB APIs.

Also, it shouldn’t be necessary to add the module to the target yourself. LLDB should be able to figure out which modules are loaded on its own. If it can’t do that, then you may need to change the target.exec-search-paths setting (before attaching).

Yeah, this was my original expectation. I thought perhaps the main module was an exception, though, and the simple sample inferior I’m attaching to doesn’t load many modules, so, I decided to figure the main module out first, and then put together a more dynamic sample to play with. I will look at the search paths again as well, and make sure everything looks correct.

labath · April 15, 2024, 11:25am

That’s fine, don’t worry about it. Make sure you also check that the Stopped event does not have the “restarted” flag set (like so). This is a very easy mistake to make, and we probably shouldn’t have designed an API with such a big footgun.
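
A minimal sketch of that check, written against stand-in types so it builds without LLDB. `State`, `StateEvent`, `IsActionableStop` and `FirstActionableStop` are hypothetical names; in a real client the two fields would come from `SBProcess::GetStateFromEvent` and `SBProcess::GetRestartedFromEvent` on each event returned by the listener.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Stand-in for the subset of lldb::StateType this sketch cares about.
enum class State { Running, Stopped, Exited };

// Stand-in for the data carried by a process state-changed event.
struct StateEvent {
  State state;
  bool restarted;  // true if the process resumed on its own right after stopping
};

// A stop is only safe to act on if the process did not automatically restart.
bool IsActionableStop(const StateEvent& ev) {
  return ev.state == State::Stopped && !ev.restarted;
}

// Scans events in arrival order and returns the index of the first stop the
// client may act on, or std::nullopt if there is none yet.
std::optional<std::size_t> FirstActionableStop(
    const std::vector<StateEvent>& events) {
  for (std::size_t i = 0; i < events.size(); ++i) {
    if (IsActionableStop(events[i])) return i;
  }
  return std::nullopt;
}
```

A Stopped event with the restarted flag set is skipped, because by the time the client sees it the process is already running again.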

deku_scrubbed · April 15, 2024, 6:07pm

Thanks again @labath,

I will test that flag as well, though, the process I’m testing with is running persistently, and launched externally from the debugger (currently), so, I don’t believe I should ever see an event due to a restart.

I still need to clean up to make sure I’m not enumerating code sections prior to the attach break (though, I’m fairly sure now it should be).

I did a little more investigation, and it looks like when I call SBTarget::AddModule, it does go through the process of adding a new module, but when it goes to send a target event, it can’t find any listeners to broadcast to. I’ve included the call stack. Broadcaster::BroadcasterImpl::PrivateBroadcastEvent in Utility/Broadcaster.cpp is where it tries to find listeners, but none are registered, not even the internal primary and hijacking listeners.

The two ways I know to use a listener are by using one with the ProcessAttachInfo when attaching with SBPlatform::Attach, or using the one on the SBDebugger instance (I don’t do both, just one or the other). Both seem to behave the same. Is there something else I might be missing in order to get a target event listener registered properly, maybe?

Callstack:

lldb_private::Broadcaster::BroadcasterImpl::PrivateBroadcastEvent(std::shared_ptr<lldb_private::Event> & event_sp, bool unique) Line 238  C++
lldb_private::Broadcaster::BroadcasterImpl::BroadcastEvent(unsigned int event_type, const std::shared_ptr<lldb_private::EventData> & event_data_sp) Line 312  C++
lldb_private::Broadcaster::BroadcastEvent(unsigned int event_type, const std::shared_ptr<lldb_private::EventData> & event_data_sp) Line 178 C++
lldb_private::Target::ModulesDidLoad(lldb_private::ModuleList & module_list) Line 1709  C++
lldb_private::Target::NotifyModuleAdded(const lldb_private::ModuleList & module_list, const std::shared_ptr<lldb_private::Module> & module_sp) Line 1666  C++
lldb_private::ModuleList::AppendImpl(const std::shared_ptr<lldb_private::Module> & module_sp, bool use_notifier) Line 239 C++
lldb_private::ModuleList::Append(const std::shared_ptr<lldb_private::Module> & module_sp, bool notify) Line 244 C++
lldb_private::Target::GetOrCreateModule(const lldb_private::ModuleSpec & module_spec, bool notify, lldb_private::Status * error_ptr) Line 2379  C++
lldb::SBTarget::AddModule(const lldb::SBModuleSpec & module_spec) Line 1524 C++
lldb::SBTarget::AddModule(const char * path, const char * triple, const char * uuid_cstr, const char * symfile) Line 1515 C++

deku_scrubbed · April 18, 2024, 11:45pm

Hi @labath,

I have some new information, not sure if it will be useful, but it’s new.

I’ve set up another test case, without a dedicated event thread, and explicitly waiting for events in the sample between debugger actions. The flow is roughly:

SBDebugger::Create
Platform Select + Connect
SBDebugger::CreateTarget(nullptr)
SBPlatform::Attach(attach_info)
Wait for attach halt (as described in the test)
SBTarget::AddModule("path/to/elf", triple, nullptr, "path/to/elf")
Find Code Section in module (target.GetModuleAtIndex, and so on)
Get LoadAddress and print

In this case, I still see the LLDB_INVALID_ADDRESS value, but if I try to spin and re-read the address until it becomes valid, I lock up. Confoundingly, my original test app never sees the valid load address now, either (in all cases module.IsValid() returns true).

I think for now I will try to get this more synchronous version to work instead, but it remains that if I add the module to the target (which my use case requires), I don’t have a way to determine that its section information is ready for use. Some other simpler info is available, for instance, module.GetNumCompileUnits() returns what seems to be a valid value.

Anyway, if there is any other info I could try to get to help narrow down the scope, or if you have any other ideas, it would be very much appreciated.

jasonmolenda · April 19, 2024, 12:27am

Sorry for jumping in on this discussion, I might have missed some context but a brief comment on

When you added your module to the target with AddModule, nothing has set the load address for that binary; it has merely been added to the Target in lldb. LLDB doesn’t know where the library was loaded in virtual memory.
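
To illustrate the distinction, here is a toy model of a section load list, not LLDB’s implementation. `SectionLoadTable` and `kInvalidAddress` are hypothetical names. A lookup yields the invalid-address sentinel until something records where the section actually lives in the process. On the SB API side, `SBTarget::SetModuleLoadAddress` and `SBTarget::SetSectionLoadAddress` are the calls a client can use to record that itself when it knows the base; using them here is an assumption, not something suggested in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Mirrors LLDB_INVALID_ADDRESS (UINT64_MAX).
constexpr std::uint64_t kInvalidAddress = UINT64_MAX;

// Hypothetical stand-in for a target's section load list.
class SectionLoadTable {
 public:
  // Records where a section lives in the process: file address plus slide.
  void SetLoadAddress(const std::string& section, std::uint64_t file_addr,
                      std::uint64_t slide) {
    load_[section] = file_addr + slide;
  }

  // Returns kInvalidAddress until SetLoadAddress has been called for the
  // section. Merely knowing the file address is not enough.
  std::uint64_t GetLoadAddress(const std::string& section) const {
    const auto it = load_.find(section);
    return it == load_.end() ? kInvalidAddress : it->second;
  }

 private:
  std::map<std::string, std::uint64_t> load_;
};
```

In this model, adding a module corresponds to knowing only the file addresses; the table stays empty until a dynamic loader plugin, a packet reply, or the client itself supplies the slide.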

You mention that some time later load addresses are available. Maybe ld.so actually loaded the binary in the process after you attached, and notified lldb where the binary is loaded. You can turn on ‘log enable lldb dyld’ to see dynamic loader (ld.so) logging as binaries are loaded.

deku_scrubbed · April 19, 2024, 1:03am

Thanks Jason!

Happy to get any thoughts or ideas.

What you’re describing is definitely my expectation, though, I will do some targeted testing with logs as you’ve suggested.

In this particular case the module I’m adding is the ELF for the main application which is already running for some time on the target device. Besides the main ELF, I would also expect the other dynamic libraries to be long since loaded as well, but, for now I’m looking into the main application module specifically, as the behavior is most surprising to me there.

I do have a custom debug server, so it’s possible I have an error there, but I’m not seeing any specific client requests when adding the module on the host.

My main questions based on your feedback:

In the scenario where the module isn’t loaded on the target, what is the event or signal I should expect on the host, after adding the module, indicating the load addresses have been resolved to the correct address in the debuggee’s virtual address space? This is certainly the most common scenario I expect to encounter (barring the main application module itself).
Since this is the main application binary I’m adding, the address would still need resolving with the target. Would this need special consideration versus the more general case in the dynamic library scenario?

One other thing to note is that I have to attach with the SBPlatform API, not the SBTarget API. I have not figured out why the SBTarget API didn’t work for me, but, stepping through the paths that lldb.exe uses, it seemed similar to what SBPlatform used.

Thank you both for all your help.

deku_scrubbed · April 22, 2024, 10:25pm

Hi @jasonmolenda,

I’ve messed with this some more over the weekend. The above seems to hold true for the most part, and I didn’t find an answer to the questions I had, but, I did observe that if I add the module to the target before I call SBPlatform::Attach, the load addresses seem to populate, and I see Target::SetSectionLoadAddress in lldb/source/Target/Target.cpp being called to do this, with the call stack including attach code paths. Looks like it sends qOffsets packet sometime after the vAttach packet to get the text segment from the target. This does make sense, but, I would still expect to be able to add the module after attaching, and then query the target for load addresses, which doesn’t appear to happen at all.
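
For anyone checking a custom server against this, here is a rough sketch of parsing a qOffsets reply. The reply shape (`Text=xxx;Data=yyy[;Bss=zzz]` or `TextSeg=xxx[;DataSeg=yyy]`, values in hex) is taken from the GDB Remote Serial Protocol documentation rather than from this thread, and `ParseQOffsets` is a hypothetical helper, not LLDB code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <map>
#include <sstream>
#include <string>

// Parses "Key=hex;Key=hex" into a key -> offset map.
// Returns an empty map if any field is malformed.
std::map<std::string, std::uint64_t> ParseQOffsets(const std::string& reply) {
  std::map<std::string, std::uint64_t> out;
  std::istringstream in(reply);
  std::string field;
  while (std::getline(in, field, ';')) {
    const std::size_t eq = field.find('=');
    // Reject fields with no '=', an empty key, or an empty value.
    if (eq == std::string::npos || eq == 0 || eq + 1 == field.size()) return {};
    const std::string digits = field.substr(eq + 1);
    std::size_t used = 0;
    std::uint64_t value = 0;
    try {
      value = std::stoull(digits, &used, 16);  // offsets are hexadecimal
    } catch (const std::exception&) {
      return {};
    }
    if (used != digits.size()) return {};  // trailing junk after the number
    out[field.substr(0, eq)] = value;
  }
  return out;
}
```

Comparing the parsed `Text` or `TextSeg` value with the load address reported for the code section is a quick way to confirm that the reply is what populated it.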

Again, this is for the main module, but in the use case I have, I may not be able to guarantee that the ELF path will be available on the host at the time I attach. And I certainly may not be able to guarantee that any other dynamic libraries are already loaded on the target. Is it a deficiency in the remote server, possibly? Maybe it’s missing functionality to extract any module path information in the ELF to send back to the host?

Maybe my mental model is off, though, too.

Any ideas would be awesome.

EDIT: Quick note that I will look into how the server implements qXfer:libraries*:read packets as well. I do see it is sent for the qSupported packet, so the client should attempt to request it. Though, it looks like in the client the packet may only be sent when a “rendezvous” breakpoint is hit, which kicks off the DynamicLoader plugin’s module refresh…

tedwoodward · April 23, 2024, 12:35pm

When you attach, does your lldb walk the link map to get a list of libraries that have already been loaded? I had to make lldb do that for my RTOS so it would see loaded modules automatically.

deku_scrubbed · April 23, 2024, 4:24pm

Hi Ted,

Do you mean in the server? It does generate a response for a qXfer:libraries*:read query from the lldb client for all dynamic modules loaded by the inferior target-side. It builds a single element list for qOffsets with just the main executable module.

It seems like, fundamentally, the main issue may be that the only dyld-related query I see the client send is qOffsets, and then only if the main module ELF has been added on the host (with SBTarget::AddModule) before attempting to attach. Adding it after attach results in the client never sending the packet. And I never see any qXfer:libraries*:read requests being sent, which would be the only way currently implemented to get the dyld modules from the target (besides the main module).

My hope would be to:

Get module meta info on attach, including some ELF section info (from the target) to get debug info file paths (presumably by way of qXfer:libraries*:read).
Use that info to add modules in the client SBTarget.
Have those modules resolve load addresses either from the module info in step 1, or even a new qXfer:libraries*:read query.
For modules that aren’t loaded on the target at all, when they do load, I would expect to see the module load events from the target, but I haven’t tested this scenario yet. Or unload events if they unload, of course.

movax-13h · April 24, 2024, 10:05pm

Can’t help with everything here, but, with respect to seeing no Target events, I ran into this as well. I had to explicitly add my listener to my SBTarget’s broadcaster. Something like (module load events shown, other broadcast bits are in SBTarget.h):

target.GetBroadcaster().AddListener(listener,
     lldb::SBTarget::eBroadcastBitModulesLoaded);

This likely won’t help with getting module information from the remote process when “manually” adding modules after attach, but, should at least get events that module changes are happening. I thought I remembered reading somewhere that explicitly adding listeners to broadcasters like this wasn’t necessary, but, maybe that was just for Process events.

Hopefully someone else has more ideas for the module host/remote resolution issue you’re having.

deku_scrubbed · April 25, 2024, 3:58am

Posted on the wrong thread: This suggestion did work to start generating target events, and breakpoint events which I’d just begun working with. I think I’ll still have some issues with the module resolution when adding them after attach, but, this is a huge help still.

tedwoodward · April 25, 2024, 2:00pm

I mean lldb, not the remote stub.

lldb needs to know what libraries have been loaded. These will show up in “target modules list”. This is so lldb can load the symbols and debug info for them. On systems that use a typical dynamic loader, lldb will read the link map to determine this.

The loader and the debugger have a contract - the loader has a special empty function for the debugger to set a breakpoint on, called the rendezvous function. When the loader loads or unloads a library, it will call this function before the load/unload and after the load/unload is done. When the breakpoint is hit, the dyld plugin will get the state (add/remove/consistent). On an add or remove, it will save off the link map and continue. On consistent, it will get the link map and compare it to the saved value. For an add, it will go find and add the libraries that are new. For a remove, it will go and remove the libraries that are gone in the new link map.
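
The comparison done on “consistent” can be sketched as a set difference. This is a simplified model, assuming link map entries are identified by path alone (a real link map entry also carries a base address); `LinkMapDiff` and `DiffLinkMaps` are hypothetical names, not the dyld plugin’s code.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

struct LinkMapDiff {
  std::set<std::string> added;    // in the new link map only
  std::set<std::string> removed;  // in the saved link map only
};

// Compares the link map saved at the add/remove notification with the one
// read when the loader reports the consistent state.
LinkMapDiff DiffLinkMaps(const std::set<std::string>& saved,
                         const std::set<std::string>& current) {
  LinkMapDiff diff;
  std::set_difference(current.begin(), current.end(),
                      saved.begin(), saved.end(),
                      std::inserter(diff.added, diff.added.end()));
  std::set_difference(saved.begin(), saved.end(),
                      current.begin(), current.end(),
                      std::inserter(diff.removed, diff.removed.end()));
  return diff;
}
```

Entries in `added` are the libraries the debugger would go find and add to the target; entries in `removed` are the ones it would drop.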

Different OSes have different names for the rendezvous function and the link map debug struct. You can see the breakpoint lldb sets with “breakpoint list -i”. The rendezvous function is typically _dl_debug_state or _rtld_debug_state. The data the handler reads is typically _dl_debug or _rtld_debug.

deku_scrubbed · April 27, 2024, 2:23am

Thanks very much, Ted.

I think this is likely what I’m missing, and tracks with the rendezvous breakpoint paths I’d seen in the posix and macOS dynamic loader plugins. I am emulating enough to hit most posix paths here, but, did not emulate rendezvous breaks.

Thank you for explaining, I think I understand the root of my problem now. Very grateful.

jingham · April 30, 2024, 5:27pm

You do have to sign up for events to receive them, but you don’t necessarily have to sign up one by one to instances of a given broadcaster. You can also sign up for events from a broadcaster class. All the broadcasters have a class name (Broadcaster::GetBroadcasterClass) which you can set in a BroadcastEventSpec along with the event bits you want. If you use that to sign up for events, then any time an instance of that broadcaster class is created you will automatically get signed up for its events. You can see this, for instance, in Debugger::DefaultEventHandler.
