Hello!
This is a proposal for a new kind of packet for the gdbremote protocol, the
MultiBreakpoint packet. Similar to the batching in MultiMemRead1, this packet
combines multiple breakpoint requests.
The motivation is to reduce the packet traffic between the debugger and the
remote in scenarios where many breakpoints need to be toggled repeatedly. One
such scenario involves the following combination of factors:
- A fast but not “local” connection to the process being debugged,
- The debugger needs to stop multiple times for internal bookkeeping (e.g.
when a dylib is loaded), - Each stop involves toggling breakpoints.
To give a concrete example, suppose LLDB stops at about 100 dynamic “library
loaded notification” breakpoints. In some platforms, each stop involves
disabling four breakpoints, doing some bookkeeping, and re-enabling the
breakpoints; for a total of eight packets (four z and four Z) exchanged. On
a connection with 20ms of round-trip (packet + answer) latency, this is 100 * 8 * 20ms = 16000ms. If multiple breakpoints could be modified with a single
packet, this would be reduced to 100 * 2 * 20ms = 4000ms (one
MultiBreakpoint, bookkeeping, one MultiBreakpoint).
Addressing each point (1, 2, 3) would make the situation better, however each
poses a unique challenge. Improving the connection latency is hard due to
physical limitations, whereas improving LLDB’s internal bookkeeping is
something that we are looking at 2. Still, there are challenges in the
scale or characteristics of the software being debugged that would benefit from
more efficient breakpoint communication.
This RFC aims to reduce the packet traffic related to breakpoints by:
- Proposing a new packet.
- Proposing a way of using this new packet.
The MultiBreakpoint packet
I have a PR documenting the packet: [lldb] Propose MultiBreakpoint extension
to GDB Remote. However, for simplicity, here is the text:
EDIT: see updated text on the PR or on the comment later on: RFC: a new packet to set/remove multiple breakpoints - #10 by Felipe
MultiBreakpoint:
This packet allows setting and removing multiple breakpoints in one go. It
concatenates multiple Z and z packets, separating them with a ;.
Formally:
$MultiBreakpoint:breakpoint_request[;breakpoint_request]*
Where each breakpoint_request is one of:
* z0,addr,kind
* z1,addr,kind
* z2,addr,kind
* z3,addr,kind
* z4,addr,kind
* Z0,addr,kind[;cond_list…][;cmds:persist,cmd_list…]
* Z1,addr,kind[;cond_list…][;cmds:persist,cmd_list…]
* Z2,addr,kind
* Z3,addr,kind
* Z4,addr,kind
Each field has the same meaning as the corresponding packet in the GDB Remote
Protocol.
Note: there is no ambiguity in using ; as a separator between
breakpoint_requests. According to the GDB Remote Protocol specification, both
cond_list and cmd_list start with an X character and contain no
separators; it follows that a “;z” or “;Z” always indicate the start of a new
request.
The stub must execute the sequence of breakpoint_requests in the order they
appear in the MultiBreakpoint packet. This is not an atomic operation:
individual requests may fail, and the stub must process subsequent requests
upon failure.
The reply consists of a ;-separated sequence of OKs or E strings, one per
breakpoint_request, representing whether the breakpoint_request was
successful.
A stub that supports this packet must include MultiBreakpoint+ in the reply
to qSupported.
Using the MultiBreakpoint packet
Implementing support for this in debugserver and lldb-server is simple, as
it reuses the regular breakpoint processing:
- [debugserver][NFCI] Factor out logic handling breakpoint packets
- [debugserver] Implement MultiBreakpoint
- [lldb-server][NFC] Factor out code handling breakpoint packets
- [lldb-server] Implement support for MultiBreakpoint packet
The challenge is finding the right abstraction level in which to batch
breakpoint requests. One way of doing this is to batch at the Breakpoint
class level, so that it enables all of its BreakpointLocations at once. While
this can work, it is not as general as it could be.
A different idea, which is proposed here, is to have the generic Process
class delay enabling/disabling breakpoints until resume/detach/destroy. This
has two advantages: 1) a single packet is sent for each stop regardless of how
many BreakpointSites change; 2) if a BreakpointSite is both disabled and
enabled, no packet is sent for that site.
This requires a few API changes:
- The Process class becomes the source of truth for the logical “is this
BreakpointSiteenabled”. However, there are some places in the code that
query theBreakpointSitedirectly. We propose making the
BreakpointSite::{Set,Is}Enabledmethods private, and having users go through
Process. - For the same reasons, the virtual methods Enabling/Disabling breakpoints are
nowprotected; callers must go through Process. - Some derived classes from
Processactually need to know whether a
BreakpointSiteis physically enabled, as they are doing low level things
(e.g. when creating a stop reason, did the thread hit a breakpoint site that
was actually active?). This is addressed through a new method
IsBreakpointSitePhysicallyEnabled. - Error handling now needs to be delayed. Most call-sites today don’t do
anything with the error anyway. One exception is when aBreakpointSiteis
first created, as the correspondingBreakpointLocationneeds to have its site
updated in case of success. The other two instances are in
Process::{RemoveConstituentFromBreakpointSite, ClearBreakpointSiteByID}, for
the same reason.
This idea is implemented in the following stack of patches: