RFC: a new packet to set/remove multiple breakpoints

Hello!

This is a proposal for a new kind of packet for the gdbremote protocol, the
MultiBreakpoint packet. Similar to the batching in MultiMemRead1, this packet
combines multiple breakpoint requests.

The motivation is to reduce the packet traffic between the debugger and the
remote in scenarios where many breakpoints need to be toggled repeatedly. One
such scenario involves the following combination of factors:

  1. A fast but not “local” connection to the process being debugged,
  2. The debugger needs to stop multiple times for internal bookkeeping (e.g.
    when a dylib is loaded),
  3. Each stop involves toggling breakpoints.

To give a concrete example, suppose LLDB stops at about 100 dynamic “library
loaded notification” breakpoints. In some platforms, each stop involves
disabling four breakpoints, doing some bookkeeping, and re-enabling the
breakpoints; for a total of eight packets (four z and four Z) exchanged. On
a connection with 20ms of round-trip (packet + answer) latency, this is 100 * 8 * 20ms = 16000ms. If multiple breakpoints could be modified with a single
packet, this would be reduced to 100 * 2 * 20ms = 4000ms (one
MultiBreakpoint, bookkeeping, one MultiBreakpoint).

Addressing each point (1, 2, 3) would make the situation better, however each
poses a unique challenge. Improving the connection latency is hard due to
physical limitations, whereas improving LLDB’s internal bookkeeping is
something that we are looking at 2. Still, there are challenges in the
scale or characteristics of the software being debugged that would benefit from
more efficient breakpoint communication.

This RFC aims to reduce the packet traffic related to breakpoints by:

  • Proposing a new packet.
  • Proposing a way of using this new packet.

The MultiBreakpoint packet

I have a PR documenting the packet: [lldb] Propose MultiBreakpoint extension
to GDB Remote
. However, for simplicity, here is the text:

EDIT: see updated text on the PR or on the comment later on: RFC: a new packet to set/remove multiple breakpoints - #10 by Felipe

MultiBreakpoint:

This packet allows setting and removing multiple breakpoints in one go. It
concatenates multiple Z and z packets, separating them with a ;.
Formally:

$MultiBreakpoint:breakpoint_request[;breakpoint_request]*

Where each breakpoint_request is one of:

* z0,addr,kind
* z1,addr,kind
* z2,addr,kind
* z3,addr,kind
* z4,addr,kind
* Z0,addr,kind[;cond_list…][;cmds:persist,cmd_list…]
* Z1,addr,kind[;cond_list…][;cmds:persist,cmd_list…]
* Z2,addr,kind
* Z3,addr,kind
* Z4,addr,kind

Each field has the same meaning as the corresponding packet in the GDB Remote
Protocol.

Note: there is no ambiguity in using ; as a separator between
breakpoint_requests. According to the GDB Remote Protocol specification, both
cond_list and cmd_list start with an X character and contain no
separators; it follows that a “;z” or “;Z” always indicate the start of a new
request.

The stub must execute the sequence of breakpoint_requests in the order they
appear in the MultiBreakpoint packet. This is not an atomic operation:
individual requests may fail, and the stub must process subsequent requests
upon failure.

The reply consists of a ;-separated sequence of OKs or E strings, one per
breakpoint_request, representing whether the breakpoint_request was
successful.

A stub that supports this packet must include MultiBreakpoint+ in the reply
to qSupported.

Using the MultiBreakpoint packet

Implementing support for this in debugserver and lldb-server is simple, as
it reuses the regular breakpoint processing:

The challenge is finding the right abstraction level in which to batch
breakpoint requests. One way of doing this is to batch at the Breakpoint
class level, so that it enables all of its BreakpointLocations at once. While
this can work, it is not as general as it could be.

A different idea, which is proposed here, is to have the generic Process
class delay enabling/disabling breakpoints until resume/detach/destroy. This
has two advantages: 1) a single packet is sent for each stop regardless of how
many BreakpointSites change; 2) if a BreakpointSite is both disabled and
enabled, no packet is sent for that site.

This requires a few API changes:

  • The Process class becomes the source of truth for the logical “is this
    BreakpointSite enabled”. However, there are some places in the code that
    query the BreakpointSite directly. We propose making the
    BreakpointSite::{Set,Is}Enabled methods private, and having users go through
    Process.
  • For the same reasons, the virtual methods Enabling/Disabling breakpoints are
    now protected; callers must go through Process.
  • Some derived classes from Process actually need to know whether a
    BreakpointSite is physically enabled, as they are doing low level things
    (e.g. when creating a stop reason, did the thread hit a breakpoint site that
    was actually active?). This is addressed through a new method
    IsBreakpointSitePhysicallyEnabled.
  • Error handling now needs to be delayed. Most call-sites today don’t do
    anything with the error anyway. One exception is when a BreakpointSite is
    first created, as the corresponding BreakpointLocation needs to have its site
    updated in case of success. The other two instances are in
    Process::{RemoveConstituentFromBreakpointSite, ClearBreakpointSiteByID}, for
    the same reason.

This idea is implemented in the following stack of patches:

1 Like

I was going to say this is wrong but it’s just that the documentation is just very confusing to me. I can confirm it’s looking for a literal X character.

I do wonder how long “contain no separators” is going to last, if we end up with some other breakpoint packet. This is predicting the future but a more flexible format would be:

numeric_length<non-numberic-separator><N byte packet>

Then you could repeat that any number of times, making a “meta packet” that contains other packets (I think, I’ve not gone looking for counter examples).

In theory this format could be used to either make arbitrary bundles of operations, or to implement MultiMemWrite style things.

So without knowing your plans, I don’t think I can say “we should have a meta-packet instead”, but I do suggest considering using a container format that does not rely on details of today’s z/Z packets. Even if you never use it for another packet, I think it has some benefits.

(if I had thought of this for multimemread, I would have brought it up then)

What if the stub does not support part of the request? It would be OK;;OK; for example. Seems fine, parse it like a normal empty response.

These E strings, is there scope for actual error strings to be in here? Do we have some extended response we might send for these? Because as soon as you do that your ; is now ambiguous.

If the response follows the same “meta packet” format, we could guard against that.

I agree with doing this. You can still probe and look for an empty response if you really want to, and we will have to have error handling for that anyway.

I don’t have much to say on this level of things but I would be interested to know what situations are effected by this.

Certainly seems like bad user experience to seemingly place breakpoints and then later complain. For example if I have placed more breakpoints than there are hardware resources, I would only find out about it when I resume?

       -H ( --hardware )
            Require the breakpoint to use hardware breakpoints.

Of course we do have delayed breakpoints for shared libraries, so this isn’t unprecedented. Probably not as big of a deal as I think it would be, since I’ve never had to think about it before.

MultiPacket;<optional flags/mode>;<len>:<pkt>;<len>:<pkt>

Where flags is just in case we want to define an early return option later.

One concern I have with defining such a general packet is that it leaves many questions open unless we fully define it right now:

  1. Can we put any kinds of packets in there?
  2. If the answer to (1) changes, in the future, from No->Yes (or from No-> a different subset of packets), how does the remote express which version it supports?
  3. Is a server allowed to support a MultiPacket with only a subset of packets in there?

Where flags is just in case we want to define an early return option later.

This sort of illustrates my point: how can we tell if the remote supports those flags? Is it allowed to ignore them? This is usually fine for packets that communicate information (as opposed to making a request), as extra information is always “ignorable”.

These E strings, is there scope for actual error strings to be in here? Do we have some extended response we might send for these? Because as soon as you do that your ; is now ambiguous.

LLDB has an extension to enable error strings, but those are hex encoded ASCII: GDB Remote Protocol Extensions - 🐛 LLDB

So without knowing your plans

No other plans! :slight_smile: But that was also true when I did the work for MultiMemRead

What if the stub does not support part of the request? It would be OK;;OK; for example. Seems fine, parse it like a normal empty response.

Yup, that would be the syntax. I should make that explicit.

For example if I have placed more breakpoints than there are hardware resources

We can change the implementation so that breakpoints that require hardware resources are enabled/disabled eagerly.

To be clear: I don’t mind changing the syntax to make it less reliant on the contents of each breakpoint kind, my worry is about having the general meta packet that would potentially allow mixing arbitrary packets.

We don’t have to define it as a new packet but it can be a design pattern for other packets that have specific names. Even if we never define a second packet (third if you count multimemread), it dodges some assumptions about what z/Z looks like.

I do not think it’s likely that someone wants to throw C expressions in there and start using ;. However if it’s cheap to avoid, I think it’s worth considering.

If we added a MultiPacket then I would be going for a “play silly games win silly prizes” type specification. Where it is the same as sending many individual packets but with fewer opportunities to error handle. Then it’s up to the clients whether they want to give up flexibility.

So I don’t think we should add one, but just to illustrate…

Yes, it’s just like sending them individually but with less chance to error handle.

Stateful things like file reads? Sure, why not, as long as you are able to handle the possible errors.

In reality clients would need a fall back for this. Which increases the cost of adoption of course, and likely leads to only using specific subsets of packets…

Yes.

We start with the field completely empty and then specify that flags must be acted on or an error returned. Then when we add ret_early, the server errors out and we know we can’t trust it to do that. Again, needs a sequential fallback path.

So I don’t think we should add a MultiPacket but I do prefer the general format to relying on properties of the z packets.

We can ignore the issue of options for MutliBreakpoint because I don’t see a use case for dependent breakpoints in a way that we’d want an early return.

You might have hardware linked breakpoints but those would be set in sets. “set A and B and link them” rather than “set A, and if that succeeds, set B”.

This also applies to the list of z/Z packets you propose to allow in MultiBreakpoint. Except the chance of a new packet is far far lower than a new packet generally.

I presume the set of z/Z can expand later (I am thinking of complex hardware breakpoints for this). Which we can do by:

  • Updating the supported list in the packet description, probably in a “might be supported” list on top of the initial set.
  • Updating new versions of lldb-server.
  • Any client code using the new z/Z format will check for it being unsupported as it would if it was sent on its own.

So in summary:

  • I do not think we should actually add a generic multi-packet container packet.
  • The primary reason being that as you have argued, in reality only subsets of packets would end up being used because the error handling gets quite complex and it’s already hard to find opportunities for batching as it is.
  • I do think that relying on the form of the current z/Z is best avoided if it is cheap to do so.
  • I think it’s fine to list the supported forms of z/Z initially, because we do have a path for expansion.

I don’t have anything to really add to the conversation here, except that having some form of MultiPacket is interesting when the debugger has a sequence of packets it will send that don’t have dependencies, whether multiple of the same packet or different packets, is an interesting idea. You know how much I love JSON in gdb RSP packets, despite being the only one so enamored, but I would just throw them in an array of JSON strings – binary quoting protocol on the overall packet, JSON string quoting ("\”), otherwise it’s just the text of the packets that will be processed sequentially, with a response of the packet return strings in another JSON array. It’s not any better than David’s sizeofcmd;sizeofcmd; etc format, just less of a unique formatting.

David had one example of breakpoints A and B, where they are linked in some way, or B is only set if A succeeded. That’s a dependency in the MultiPacket (or MultiBreakpoint here), and so I would say the debugger must send those as separate packets, getting the result from the first before it can do the second. Or we end up needing a more complicated representation of the different packets encapsulated.

I’m not arguing for any of this over Felipe’s proposal, and don’t meant to detract from them. But I sometimes think some of the performance limitations we see with gdb remote serial protocol is from the back-and-forth sequential packet sequences that the protocol is designed for, and when we have a sequence of packets that can be sent without dependencies, it’s an interesting idea to find a way to improve this.

An example, Felipe says a common sequence is “disable breakpoints, instruction step, re-enable breakpoints, continue” - technically there are dependencies between all these operations (if we couldn’t disable the breakpoint, the instruction step will fail; if we couldn’t re-enable the breakpoint, should we continue) but realistically you could see a debugger sending

  1. DisableBreakpoints + instruction step
  2. Re-enableBreakpoints + continue

as two multiple-command bundle packets.

My justification was to not rely on details and JSON is even further down that path so I like the idea.

@Felipe perhaps you could prototype MultiBreakpoint in that JSON form? Enough to see if there’s any obvious problem with it in encoding or ergonomics or whatever else.

Whether we go with it for MultiBreakpoint or not, I think the experience will be useful. Or in other words: I think it’s a really cool idea and my bias is showing :slight_smile:

I was going to disagree but I now realise that I have only been thinking of “return early” rather than doing something else on failure. You’re right we would quickly want an “if then else” capability.

The idea of a JSON based DSL for packets sounds equal parts horrific and intriguing to me so maybe I’ll dig into that sometime. I wonder if I can mine the packet logs for common sequences.

Anyway, yes, a MultiPacket is certainly not something to do at this time.

And now I’ve finished derailing the conversation into protocol details, hopefully folks can discuss the changes in lldb too.

Yeah… I was actually doing it just now and I am slightly puzzled. The code in ProcessGDBRemote.cpp that sends other JSON packets assumes the only time a character in the set #$*} appears is at the end of a JSON dictionary. For example:

// ProcessGDBRemote::GetExtendedInfoForThread
    StreamString packet;
    packet << "jThreadExtendedInfo:";
    args_dict->Dump(packet, false);

    // FIXME the final character of a JSON dictionary, '}', is the escape
    // character in gdb-remote binary mode.  lldb currently doesn't escape
    // these characters in its packet output -- so we add the quoted version of
    // the } character here manually in case we talk to a debugserver which un-
    // escapes the characters at packet read time.
    packet << (char)(0x7d ^ 0x20);

In other words, it only bothers escaping the last character. If args_dict contains any of the special characters… Maybe I need to talk to Jason about JSON.

It would be ideal to avoid ascii-hex encoding the entire JSON payload, as it would make logs unreadable by humans.

Ok, how about this:

MultiBreakpoint

This packet allows setting and removing multiple breakpoints in one go. It
lists multiple Z and z packets using a JSON array of strings:
Formally:

$MultiBreakpoint:["breakpoint_request"[,"breakpoint_request"]*]

Where each breakpoint_request is one of:

* z0,addr,kind
* z1,addr,kind
* z2,addr,kind
* z3,addr,kind
* z4,addr,kind
* Z0,addr,kind[;cond_list…][;cmds:persist,cmd_list…]
* Z1,addr,kind[;cond_list…][;cmds:persist,cmd_list…]
* Z2,addr,kind
* Z3,addr,kind
* Z4,addr,kind

Each field has the same meaning as the corresponding packet in the GDB Remote
Protocol.

The stub must execute the sequence of breakpoint_requests in the order they
appear in the MultiBreakpoint packet. This is not an atomic operation:
individual requests may fail, and the stub must process subsequent requests
upon failure.

The reply consists of a JSON array of strings, with the same contents allowed
by a reply to a z or Z packet.

A stub that supports this packet must include MultiBreakpoint+ in the reply
to qSupported.

I’ve also updated all PRs:

Yeah… I was actually doing it just now and I am slightly puzzled. The code in ProcessGDBRemote.cpp that sends other JSON packets assumes the only time a character in the set #$*} appears is at the end of a JSON dictionary.

lldb’s SendPacket* methods don’t do any filtering of packets, altho we do it on all received packets. The j* packets that send their requests all have a simple top-level JSON dictionary with a few key-values in it, to make a request of debugserver/lldb-server, and that ends in a character that must be escaped, }, and someone (cough) just hacked in that one quoting.

Looking round, ProcessGDBRemote::DoWriteMemory doesn’t use the X packet which would send 8-bit data, but it does use vFlashWrite which sends raw binary data. It passes the data field through StreamGDBRemote::PutEscapedBytes in lldb/source/Utility/GDBRemote.cpp which does the binary escaping. That would be the more correct way for the packets that are escaping the closing } on the argument dictionary to do that.

> $MultiBreakpoint:["breakpoint_request"[,"breakpoint_request"]*]

I would put this in a dictionary to leave the door open for extensibility in the future ({”breakpoint_requests” : […]}' or whatever), and I’d name the packet starting with a j along with the other packets that send/receive in JSON format.

I agree with wrapping it in a dictionary, will work on that now.

I’ve updated the proposal, also made the reply itself be a dictionary in case something needs to change in the future.

I think at this point we can iterate on the PR itself for the text: [lldb] Propose MultiBreakpoint extension to GDB Remote

1 Like