RFC: packet to identify a standalone aka firmware binary UUID / location

Hi, I'm working with an Apple team that has a gdb RSP server for JTAG debugging, and we're working to add the ability for it to tell lldb about the UUID and possibly address of a no-dynamic-linker standalone binary, or firmware binary. Discovery of these today is ad-hoc and each different processor has a different way of locating the main binary (and possibly sliding it to the correct load address).

We have two main ways of asking the remote stub about binary images today: jGetLoadedDynamicLibrariesInfos on Darwin systems with debugserver, and qXfer:libraries-svr4: on Linux.

jGetLoadedDynamicLibrariesInfos has two modes: "tell me about all libraries" and "tell me about libraries at these load addresses" (we get notified about libraries being loaded/unloaded as a list of load addresses of the binary images; binaries are loaded in waves on a Darwin system). The returned JSON packet is heavily tailored to include everything lldb needs to know about the binary image so it can match a file it finds on the local disk to the description and not read any memory at debug time -- we get the mach-o header, the UUID, the deployment target OS version, the load address of all the segments. The packets lldb sends to debugserver look like
jGetLoadedDynamicLibrariesInfos:{"fetch_all_solibs":true}
or
jGetLoadedDynamicLibrariesInfos:{"solib_addresses":[4294967296,140733735313408,..]}
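
For anyone implementing the stub side, here is a rough Python sketch of how those two request forms could be framed on the wire. It assumes the usual gdb RSP rules: "$payload#checksum" framing with a modulo-256 checksum, and escaping of '#', '$', '}' and '*' as 0x7d followed by the byte XOR 0x20. The helper names (rsp_escape, rsp_frame, solib_request) are made up for illustration and are not lldb functions.

```python
import json


def rsp_escape(payload: bytes) -> bytes:
    """Escape '#', '$', '}' and '*' as 0x7d followed by (byte XOR 0x20)."""
    out = bytearray()
    for b in payload:
        if b in b"#$}*":
            out += bytes([0x7D, b ^ 0x20])
        else:
            out.append(b)
    return bytes(out)


def rsp_frame(payload: bytes) -> bytes:
    """Wrap a payload as $<escaped payload>#<two hex digit checksum>."""
    body = rsp_escape(payload)
    checksum = sum(body) % 256  # computed over the bytes as transmitted
    return b"$" + body + b"#" + f"{checksum:02x}".encode("ascii")


def solib_request(addresses=None) -> bytes:
    """Build the 'all libraries' request, or the 'these addresses' request."""
    if addresses is None:
        args = {"fetch_all_solibs": True}
    else:
        args = {"solib_addresses": list(addresses)}
    text = "jGetLoadedDynamicLibrariesInfos:" + json.dumps(args, separators=(",", ":"))
    return rsp_frame(text.encode("ascii"))
```

Note that the closing '}' of the JSON goes out as the two bytes "}]" once escaped, which is why the packet can look slightly odd in a packet log.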

qXfer:libraries-svr4: returns an XML description of all loaded binary images, tailored towards an ELF view of binaries (going by a brief skim of ProcessGDBRemote). I chose not to use this because we'd have an entirely different set of values returned in our XML reply for Mach-O binaries, because I wanted to eliminate extraneous read packets from lldb, and because we needed a way of asking for a subset of all binary images. A rich UI app these days can link to five hundred binary images, so fetching the full list when only a couple of binaries were just loaded would be unfortunate.

I'm trying to decide whether to (1) add a new qStandaloneBinaryInfo packet which returns the simple gdb RSP style "uuid:<UUID>;address:0xADDR;" response, (2) add a third mode to jGetLoadedDynamicLibrariesInfos (jGetLoadedDynamicLibrariesInfos:{"standalone_binary_image_info":true}), or (3) have the JTAG stub support a qXfer XML request (I wouldn't want to reuse the libraries-svr4 name and return completely different XML, but it could be qXfer:standalone-binary-image-info: or whatever).
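
To make option (1) concrete, this is roughly what parsing that reply could look like on the debugger side. It is only a sketch in Python: the packet does not exist yet, the type and function names are invented, and I am assuming both fields are optional and that the address is hex with a 0x prefix as in the example string.

```python
import uuid
from typing import NamedTuple, Optional


class StandaloneBinaryInfo(NamedTuple):
    image_uuid: Optional[uuid.UUID]   # None if the stub sent no uuid field
    load_address: Optional[int]       # None if the stub sent no address field


def parse_standalone_binary_info(reply: str) -> StandaloneBinaryInfo:
    """Parse a hypothetical 'uuid:<UUID>;address:0xADDR;' reply payload."""
    fields = {}
    for pair in reply.split(";"):
        if not pair:
            continue  # skip the empty piece after the trailing ';'
        key, sep, value = pair.partition(":")
        if sep:
            fields[key] = value
    image_uuid = uuid.UUID(fields["uuid"]) if "uuid" in fields else None
    load_address = int(fields["address"], 16) if "address" in fields else None
    return StandaloneBinaryInfo(image_uuid, load_address)
```

A stub that knows the UUID but not the load address would just omit the address field, and lldb would then have to find the slide some other way.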

I figured folks might have opinions on this so I wanted to see if anyone cares before I pick one and get everyone to implement it. For me, I'm inclined towards adding a qStandaloneBinaryInfo packet - the JTAG stub already knows how to construct these traditional gdb RSP style responses - but it would be trivially easy for the stub to also assemble a fake XML response as raw text with the two fields.

J

Hi Jason,

A bit of a tangent here, but would you guys consider making your JTAG RSP server a bit more generic and releasing it open source for use with OpenOCD? They've got a stub for gdb, but it needs some work to behave better with lldb.

Ted

Hi Ted, I think that group's code is really specific to our environment and I don't see it being open sourced.

Any reason to not just return any stand alone binary image information along with the dynamic libraries from the "jGetLoadedDynamicLibrariesInfos:{"fetch_all_solibs":true}" or "qXfer:libraries-svr4" packet? If all of the information is the same anyway, no need to treat them any differently. We already return the main executable's info in those packets and that isn't a shared library.

I would vote to stay with the jGetLoadedDynamicLibrariesInfos packet unless you are going to return enough info in the "qXfer:libraries-svr4" packet to allow another debugger to just work when connecting with it. So if you have to add custom mach-o stuff that another debugger wouldn't be able to use anyway to the XML from "qXfer:libraries-svr4", then I don't see the point in using it.

Greg

I prefer an entirely different packet (or different qXfer request) because it simplifies the ProcessGDBRemote decision of whether there is a user-process DynamicLoader in effect, and it simplifies the parsing of the returned values, because we can't expect the stub to provide everything that lldb-server/debugserver return in jGetLoadedDynamicLibrariesInfos and libraries-svr4; it's a lot of stuff. At the beginning of the debug session when we're sniffing out what type of connection this is, we can try a dedicated packet for getting the standalone binary information and that tells us what it is. Or we can send the "tell me about all the libraries" darwin/elf packet and get back a result which has two possible formats -- the ones from debugserver/lldb-server with all of the information they include, or the minimal response that this JTAG stub can supply.

It may just be laziness on my part, which is why I wanted to raise this here -- whether to create a new packet or to have jGetLoadedDynamicLibrariesInfos/libraries-svr4 return a new style of result and have the parsing code detect which style it is, and decide the dynamic linker based on that. I think the implementation of the former approach, adding a qStandaloneBinaryInfo packet (or whatever), would be easier than reusing one of the existing packets for a really different purpose.
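
A rough sketch of the dedicated-packet flow I have in mind, in Python rather than real ProcessGDBRemote code. It leans on the standard RSP convention that a stub answers an unsupported packet with an empty reply. The send callable, the function name and the returned labels are placeholders for illustration only.

```python
from typing import Callable


def classify_connection(send: Callable[[str], str]) -> str:
    """Decide what kind of remote we are talking to.

    `send` takes a packet payload and returns the reply payload; an empty
    reply is taken to mean the stub does not support that packet.
    """
    # Try the dedicated (hypothetical) packet first: a non-empty reply tells
    # us immediately that this is a standalone / firmware binary.
    if send("qStandaloneBinaryInfo"):
        return "standalone"
    # Otherwise fall back to the existing "all libraries" request, whose
    # reply implies a user-process dynamic loader is in effect.
    if send('jGetLoadedDynamicLibrariesInfos:{"fetch_all_solibs":true}'):
        return "dynamic-loader"
    return "unknown"
```

With the reuse-an-existing-packet approach, the second branch would instead have to inspect the reply and work out which of the two formats it got.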

How about adding the stand alone binary info as new key value pairs to the response to the qHostInfo or qProcessInfo packets? With JTAG you will want to know information about what you are connecting to, so it kind of makes sense here. The responses to a simple a.out mac binary already have cputype and subtype which are quite mach-o specific.

< 13> send packet: $qHostInfo#9b
< 166> read packet: $cputype:16777223;cpusubtype:8;ostype:macosx;watchpoint_exceptions_received:after;vendor:apple;os_version:10.15.7;maccatalyst_version:13.6;endian:little;ptrsize:8;#00

< 16> send packet: $qProcessInfo#dc
< 193> read packet: $pid:179d1;parent-pid:179d4;real-uid:24069482;real-gid:6fd32dba;effective-uid:24069482;effective-gid:6fd32dba;cputype:1000007;cpusubtype:8;ptrsize:8;ostype:macosx;vendor:apple;endian:little;#00
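
If new keys were added to these replies, a client would pick them up with the same key:value splitting it already needs. A small Python sketch using the two payloads quoted above follows; the function name is made up, and one thing to watch is that the two replies as quoted appear to encode cputype differently (decimal in qHostInfo, hex in qProcessInfo).

```python
def parse_key_values(payload: str) -> dict:
    """Split a 'key:value;key:value;' payload into a dict of strings."""
    return dict(pair.split(":", 1) for pair in payload.split(";") if pair)


host = parse_key_values(
    "cputype:16777223;cpusubtype:8;ostype:macosx;"
    "watchpoint_exceptions_received:after;vendor:apple;os_version:10.15.7;"
    "maccatalyst_version:13.6;endian:little;ptrsize:8;"
)
proc = parse_key_values(
    "pid:179d1;parent-pid:179d4;real-uid:24069482;real-gid:6fd32dba;"
    "effective-uid:24069482;effective-gid:6fd32dba;cputype:1000007;"
    "cpusubtype:8;ptrsize:8;ostype:macosx;vendor:apple;endian:little;"
)

# Same CPU type in both replies once each is read in its own base.
same_cpu = int(host["cputype"]) == int(proc["cputype"], 16)
```

Any new uuid/address keys would need their encoding spelled out the same way so that stubs and lldb agree on it.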

With JTAG it is also very important to get information on what memory is accessible, as some CPUs have memory-mapped registers. These locations often should not be read or written at all, or, if they are accessed, must be accessed with specifically sized accesses (1, 2, 4 or 8 bytes at a time). The qHostInfo or qProcessInfo could be great for that data as well.

Hello Jason, everyone,

It sounds to me like, if the idea is to send a UUID through the link, (re)using qXfer:libraries-svr4 for this purpose will not help with anything, as this packet knows nothing about UUIDs. qXfer:libraries (without svr4) would be slightly better, as it does not encode details of the posix dynamic linkers, but it still contains no mention of the UUID, and it is actually not supposed to return the main executable (just the proper libraries).

To retrieve the main executable name, gdb uses `qXfer:exec-file:read`, but this also does not include the UUID, so it's not useful on its own. One could maybe complement it with something like qXfer:exec-uuid:read, but I'm not sure whether I actually like that idea.

As for a new packet vs. another mode to jGetLoadedDynamicLibrariesInfos -- I don't know. If this is supposed to be used on more systems, then I'd probably go with a new packet, as the existing one is pretty mach-o specific. If this is going to be an Apple thing, then maybe it does not matter so much.

pl