RFC: a new vectorized memory read packet

I apologize if I was unclear. Jim and I were both talking about Exx codes not meaning anything specific in the protocol. I meant to say that we should standardize what they mean in this instance (if we include errors).

FWIW, on the topic of errors, the gdb remote serial protocol does have an extension to add a string after an Exx error code, though I think only certain packets adopt it / are expected to emit it.

In the case of a memory read packet like this, there are two situations. (1) The action cannot be completed at all: the process has exited, or the request packet is improperly formed. An Exx response is sent. (2) Not all of the memory regions requested could be read in full, or at all. This is not out of the ordinary if lldb tries to read beyond allocated pages, past memory mapped I/O, or something like that. All memory read packets are designed to potentially return fewer bytes than requested, and that would be the same behavior here: not all memory regions may be read completely for the requested amount. It’s not an error and shouldn’t be reported as such. (I think I saw some earlier examples where people were wondering if we might want to report an error code for one particular memory region that was unreadable or something; that’s not how the memory read packets normally work.)

Minor aside, but debugserver only uses each Exx value in one place in the source code (or at least it’s supposed to, lol), so if we ever see a particular Exx value in a packet log, we can look at the debugserver sources and see under what conditions it is sent. That’s normally enough to pinpoint the source of the problem.

I’m supportive of this proposal (excluding the offset idea).

That sounds like an entirely different packet to me, but really, whether you rename it or negotiate capabilities, it’s the same thing with different steps.

In a previous job we did have a register read-modify-write-read-restore packet though, which was along these lines and saved a lot of time probing MIPS registers.

Much like this packet, I’d want to see a few use cases first.

I wonder why not interleave them, but I don’t feel strongly one way or another. Which order is better for parsing will come down to how the packet is actually used.

One thing that comes to mind though.

The way you have it here means we need to know that there won’t be a set of bytes for the second read. That’s not that hard to deal with but you could have:

receive: 4:<binary>;0:;2:<binary>;

We could probably say that if the result is 0 then the separator is optional, i.e. 0; is the same as 0:;. That adds its own sort of corner case, but either way you have a trade-off depending on how the packet tends to be used.
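With that rule, the example reply above could equivalently be written as:

receive: 4:<binary>;0;2:<binary>;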

Also, in this current proposal, where does the error go? I presume there can be one error code per read?

Do we want more fidelity than this? I don’t expect we have much to say beyond the fact we failed to read.

Can you explain this in more detail? I don’t doubt your sincerity, I just wonder how many times we’re going to find this sort of pattern outside of this one case.

Maybe we could hack up some instrumentation to find other cases.

Or grep for any “MemoryRead” in the same function.

This name is just descriptive, right? There’s no MemRead that already exists.

Which is fine, just checking because in the past I’ve seen a ReadRegister → ReadRegisters pattern, where we realised that we should have just implemented ReadRegisters in the first place.

So skipping right to MultiMemRead sounds good to me.

I have asked a GDB colleague whether they recall any proposals for this there, but Google and AI didn’t find anything at all so I suspect it hasn’t come up.

What is the plan for backwards compatibility? I presume the client will check whether this new packet is supported and if not, fall back to the original one. This will be handled at some low layer in lldb so it won’t get in the way.

I think “Vector” has too many meanings outside of debugging to use it here. I think “ReadMemoryRanges” is better (making “ReadMemory” implicitly mean “ReadMemoryRange” singular).

This is not a Vectorised read because:

  • It’s not using vector instructions.
  • It’s not reading like a vector. If this took N addresses but only one size then ok maybe it’s like a vector gather load, but that’s not what this does.

I do see what you’re going for with “Vectorized” but I don’t think most people will take it that way.

I wonder why not interleave them, but I don’t feel strongly one way or another. Which order is better for parsing will come down to how the packet is actually used.

Besides what you described a bit later regarding the optional separator – which would create two ways of writing the same thing, a downside IMO – I think there are two advantages to keeping all lengths together in the reply: it’s easier for humans to read the log that way (Jason suggests this a few posts above) and the parsing code becomes simpler (just parse the sizes, then iterate over the sizes extracting bytes).
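As a rough sketch of that second point (illustrative only: the exact separators were still being tweaked later in the thread, the sizes are assumed to be hex here, and the function name is made up), the client already knows how many ranges it asked for, so parsing is just:

// Illustrative sketch: parse a reply of the form "<size>;<size>;...;<blob of binary data>".
// Real code would use a non-throwing integer parser (std::stoull throws on bad input).
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

static std::optional<std::vector<std::vector<uint8_t>>>
ParseMultiReadReply(const std::string &reply, size_t num_ranges) {
  std::vector<size_t> sizes;
  size_t pos = 0;
  for (size_t i = 0; i < num_ranges; ++i) {
    if (pos >= reply.size())
      return std::nullopt;                 // truncated size list
    size_t consumed = 0;
    sizes.push_back(std::stoull(reply.substr(pos), &consumed, 16));
    pos += consumed;
    if (pos >= reply.size() || reply[pos] != ';')
      return std::nullopt;                 // malformed size list
    ++pos;
  }
  // The sizes must account for exactly the bytes that follow, so a bogus size
  // can never send us past the end of the reply.
  size_t total = 0;
  for (size_t s : sizes)
    total += s;
  if (reply.size() - pos != total)
    return std::nullopt;
  // Slice the single blob according to the sizes.
  std::vector<std::vector<uint8_t>> result;
  for (size_t s : sizes) {
    result.emplace_back(reply.begin() + pos, reply.begin() + pos + s);
    pos += s;
  }
  return result;
}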

Also, in this current proposal, where does the error go? I presume there can be one error code per read?

I think the general opinion so far is that we don’t need more fidelity than “either the entire packet is ill-formed or it isn’t”.

I don’t expect we have much to say beyond the fact we failed to read.

That seems to be the opinion expressed in this thread so far, so returning “zero bytes were read” would suffice.

Can you explain this in more detail? I don’t doubt your sincerity, I just wonder how many times we’re going to find this sort of pattern outside of this one case.

I can’t think of other cases off the top of my head, but I’ll do some grepping today. That said, I do expect any “green thread” abstraction implemented through LLDB’s OS plugin mechanism to fall into this category. The OS plugin’s core functionality is to take a list of “real threads” created by the debugger and produce a new list of threads, which is what the debugger shows to the user. This can be accomplished by looking at some language-specific location in memory to answer the question “are you running some green thread? If so, what is its identifier?”. A MultiMemRead would allow an implementer to answer this question for all threads at once.

That’s roughly what happens with Swift’s concurrency model, where “Tasks” are the abstraction provided to programmers. A Task is just (I really mean it) a flow of execution without any of the other baggage normally associated with threads (i.e. no stack, no registers, etc.). A Task can be migrated from thread to thread by the runtime, which writes an identifier for the Task being executed by a thread into thread-local storage. LLDB, when debugging Swift code, finds the OS plugin we created for that language and, on every stop, does the thread-to-Task mapping in the virtual function OperatingSystem::UpdateThreadList. If you’re curious, that function is actually quite simple.
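To make the batching opportunity concrete, here is a very rough sketch of the pattern (every type and name below is invented for illustration; this is not the actual plugin code, which goes through LLDB’s OperatingSystem interface):

// Hypothetical sketch: gather the per-thread "which task are you running?" lookups
// into a single batched read instead of one read packet per thread.
#include <cstdint>
#include <cstring>
#include <vector>

struct Range {             // address/length pair, as in the proposed packet
  uint64_t addr;
  uint64_t size;
};

struct ThreadInfo {        // stand-in for a debugger-side thread object
  uint64_t tls_task_slot;  // where the runtime stores the task identifier
};

// Stand-in for the proposed batched read API; one packet on the wire.
std::vector<std::vector<uint8_t>> ReadMemoryRanges(const std::vector<Range> &ranges);

std::vector<uint64_t> GetTaskIdsForThreads(const std::vector<ThreadInfo> &threads) {
  std::vector<Range> ranges;
  ranges.reserve(threads.size());
  for (const ThreadInfo &t : threads)
    ranges.push_back({t.tls_task_slot, sizeof(uint64_t)});

  // One round trip for all threads instead of threads.size() round trips.
  std::vector<std::vector<uint8_t>> buffers = ReadMemoryRanges(ranges);

  std::vector<uint64_t> task_ids;
  for (const auto &buf : buffers) {
    uint64_t id = 0;                       // 0 meaning "no task" in this sketch
    if (buf.size() == sizeof(uint64_t))
      std::memcpy(&id, buf.data(), sizeof(id));
    task_ids.push_back(id);
  }
  return task_ids;
}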

What is the plan for backwards compatibility? I presume the client will check whether this new packet is supported and if not, fall back to the original one. This will be handled at some low layer in lldb so it won’t get in the way.

Yup. The plan is to provide a Process base class implementation that just calls the single-range ReadMemory function on every range. We would also change ProcessGDBRemote to override the implementation; if it is talking to a remote server that has the new packet capability, it will use that, otherwise it will call the base class implementation. Looking at GDBRemoteCommunicationClient.h, you can find a lot of m_supports_something variables that express the capabilities announced by the server.
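In pseudo-C++, the layering would look roughly like this (signatures heavily simplified, the Status/error plumbing omitted, and Range, m_supports_multi_mem_read and SendMultiMemReadPacket are invented names for illustration):

#include <cstddef>
#include <cstdint>
#include <vector>

struct Range { uint64_t addr; uint64_t size; };

class Process {
public:
  virtual ~Process() = default;

  // Existing single-range read (simplified; the real one also reports a Status).
  virtual size_t ReadMemory(uint64_t addr, void *buf, size_t size) = 0;

  // New API: the default implementation just loops over the ranges.
  virtual std::vector<std::vector<uint8_t>>
  ReadMemoryRanges(const std::vector<Range> &ranges) {
    std::vector<std::vector<uint8_t>> results;
    for (const Range &r : ranges) {
      std::vector<uint8_t> buf(r.size);
      buf.resize(ReadMemory(r.addr, buf.data(), buf.size())); // may be short, or zero
      results.push_back(std::move(buf));
    }
    return results;
  }
};

class ProcessGDBRemote : public Process {
public:
  std::vector<std::vector<uint8_t>>
  ReadMemoryRanges(const std::vector<Range> &ranges) override {
    // Only use the new packet if the server announced support for it.
    if (m_supports_multi_mem_read)
      return SendMultiMemReadPacket(ranges); // one packet on the wire (not shown)
    return Process::ReadMemoryRanges(ranges);
  }

private:
  bool m_supports_multi_mem_read = false;
  std::vector<std::vector<uint8_t>>
  SendMultiMemReadPacket(const std::vector<Range> &ranges);
};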

I think “Vector” has too many meanings outside of debugging to use it here. I think “ReadMemoryRanges

Yup, I agree with your arguments. Let’s adopt ReadMemoryRanges.

For whatever reason my brain defaults to size-data, size-data, etc., but I can’t explain why. I think I’m thinking of how you’d write a loop like:

for (auto [addr, data] : memory_range_data)
  // do stuff

Which is a higher level thing anyway and there’s no reason the packet layout has to match what C++ code ends up looking like.

Your way is a bit safer because we can assert that sum(sizes) == sizeof(bytes on the end) without having to trust the sizes as we read through it.

Another consideration is readability: interleaved size-data is kinda more readable in that you can see which data pairs with which size, but in another way it’s less readable because you have to scan across the data to find the next size. And if we’re honest, reading the contents of memory reads has never been easy for humans anyway, so let’s not optimise for it.

So I’m fine with sizes first with the justification that those are the things people would actually read anyway, and we can parse the packet more safely.

Thanks for explaining. So it’s something that reads bits of a struct but not all of it. You could do that by over-reading, but when that struct is a thread with N types of M registers, that’s a lot of wasted reads.

I thought of signal structures for this, though they are quite small, so the speed-up would be negligible.

It would be nice to find at least one use case that exercises this feature in upstream lldb, though. So I would be OK doing one because it’s a neat fit, even if, in reality, it doesn’t save much time.

I could try to make a counter-argument that adding a new packet just for this one use case (though the packet is generic in implementation) is too much. On the other hand, I don’t have a reasonable requirement of N use cases, and I can’t justify plucking one out of the air. Plus I know from experience that these batching packets do help overall, so I am biased to be positive here.

So I won’t take that line of argument but if you can think of any clever ways to find opportunities, that would be cool because it “feels” like there should be a few.

…I wonder if the expression evaluator could be taught to batch reads? It has to “materialise” variables sometimes and it could probably save all of them up for one big read.

They do not recall any proposals in this area.

And famous last words, but I think any such proposal would look a lot like this one: sizes and bytes in some order. If they diverged, we could do the usual “try GDB’s first then use ours” dance that we do for x (see “GDB Remote Protocol Extensions” in the LLDB docs).

Which reminds me, that page says this about x:

To test if this packet is available, send a addr/len of 0:
x0,0
You will get an OK response if it is supported.

I recall this being one of the bad parts of the design but not the details of why.

How would this work with this packet? Would we use the usual “empty response means not supported” (empty being an empty string)? I think that would be fine.

I’m thinking about future changes to this packet, whether that be more features, or fixes to mistakes we didn’t see here. In the spirit of “always leave a reserved field in API syscall structures”.

For the request:

  • If existing servers just ignore unknown fields, we just add a new foo: and it’s fine.
  • If not, we look for a new qSupported feature.

For the reply:

  • We know it’s <N sizes>;<one blob of data>. If we find a literal ; on the end, we could treat the following bits as whatever the new thing is.
  • Or again, qSupported.
  • Or, because the client knows it sent newthing:, it knows the reply will have a different format.

Which is good. We have a few ways out of the new box we are building here :slight_smile:

Speaking of future extensions, this is a great example:

I think there’s room to fit an address space ID in there somewhere. Maybe by making a range into address, size, address space ID if the older servers parse permissively.

If not, qSupported is fine. In fact, I suspect ASID support would need lots of changes, so we’d probably need an overall “I am ASID aware” flag from the server anyway.

(If you are wondering why I thought so much about this angle: in a previous job we ended up cramming ASIDs into unused bits of a packet structure, and it wasn’t pretty.)

What I tried to convey in the proposal is that, in this case, the server would explicitly return an Error, just like if a client sent a packet “somepacketthatdoesntexist”.

We could have the very first call to ReadMemoryRegions attempt to use the new packet, and then fall back to the old packet if it gets an error. But this is a bit of an implicit way of communicating (as you said, it was considered a weak point of the design of x), so I would favour a qSupported approach right away. In fact, I should include it in this proposal, shouldn’t I? By the way, all of the work for the x packets probably precedes my working life, so I appreciate the insights :slight_smile:

Regarding extensions, I think it’s fine to parse permissively, but I don’t believe a server should reply with anything other than an error if it sees a field that it does not recognize, because there is a chance its answer would not be what the client requested. To use the address space example, ignoring the address space field could mean reading the wrong memory region, and the client would not know. In that sense, I don’t believe we can escape the qSupported approach for fields added in the future.

Likewise, a reply that does not conform to the expected syntax probably should be treated as invalid, because it could indicate the server and client are not speaking the same language.

The options field was an attempt at keeping the design open for new requests. For example, I could see an options:address_spaces=[0, 2, 3] for a request containing 3 ranges.

The problem with an error, as in an error code, is that you can’t tell whether it’s “error: you asked for the wrong thing” or “error: I have no idea what this packet even is”.

An empty response (raw character sequence ‘$#00’) means the command is not supported by the stub. This way it is possible to extend the protocol. A newer GDB can tell if a command is supported based on that response (but see also qSupported).

So empty reply is the fallback and yeah, qSupported if you want more nuance. You will have to handle this response anyway since old servers will send it.

And with the design of this packet, even zero-length reads would not be empty replies; the reply would have to be at least 0:;. That was a problem with x: a zero-length read would be an empty reply.

And what are you going to do with zero-length reads? Unless there’s precedent elsewhere, I suggest we say they always succeed (and therefore cannot be used to tell if memory is mapped or not).

So empty reply is the fallback and yeah, qSupported if you want more nuance. You will have to handle this response anyway since old servers will send it.

Makes sense, I will update the proposal with more details about this case and describe the qSupported query for this packet.
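For illustration, the negotiation might end up looking something like this (the tag spelling is whatever the final documentation settles on; MultiMemRead is used as a placeholder here, and unrelated features are elided):

send:    qSupported:multiprocess+;swbreak+
receive: PacketSize=20000;MultiMemRead+

If the tag is absent from the reply, the client simply keeps using the existing single-range read packet.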

And what are you going to do with zero-length reads? Unless there’s precedent elsewhere, I suggest we say they always succeed (and therefore cannot be used to tell if memory is mapped or not).

I think they are “vacuously” successful, to borrow a mathematical term; the server was asked to read nothing, and it did just that.

I’ve posted a couple of PRs, one with the documentation update, and one with a debugserver implementation for the packet. I tweaked a couple of commas/semicolons in the protocol to make parsing easier and more consistent with other packets:

[gdbremote] Document MultiMemRead packet in protocol extensions - https://github.com/llvm/llvm-project/pull/162675

[debugserver] Implement MultiMemRead packet - https://github.com/llvm/llvm-project/pull/162670

I will soon post a couple more PRs with the Process API updates.

[lldb] Parse qSupported MultiMemRead tag in GDB Remote Client - https://github.com/llvm/llvm-project/pull/163249

Regarding this, can you restate why we need a qSupported tag?

Given that there is a standard way to say the stub does not support a packet, which is an empty response: Standard Replies (Debugging with GDB)

Which we can do with this packet due to its design, where we can’t with x.

But then this is undermined by GDB’s own description of qSupported:

Other features may describe packets which could be automatically probed for, but are not.

We could automatically probe for it, and we have an obvious fallback if it’s not supported.

These features must be reported before GDB will use them. This “default unsupported” behavior is not appropriate for all packets, but it helps to keep the initial connection time under control with new versions of GDB which support increasing numbers of packets.

This seems to be on the theme of this new packet.

So the choice is kinda arbitrary but I’d like to get on the record what specifically you want to get from making this choice. It may be useful for future extensions.

Regarding this, can you restate why we need a qSupported tag?

I think I led Felipe to this position.

As Felipe pointed out to me while implementing this, the gdb remote serial protocol doesn’t have any standard way of matching the packet name. So a stub that doesn’t support MultiMemRead but does support M will try to parse ultiMemRead as addr,len, fail, and return an Exx code. You’ll never get back an empty reply for MultiMemRead, because a packet whose name is a prefix of it already exists in the stub.

Any code that is testing for this packet would have to understand that an error return may indicate that the packet is not implemented. Or it could indicate some error from the stub in parsing the packet or completing the operation. It’s unknowable which it is.

Maybe I’m off (I never thought of this issue before, somehow), but I don’t think an empty response can work given the existence of m/M in every stub already.

This may not be an appropriate use of qSupported; we have alternates already in lldb like qStepPacketSupported and QThreadSuffixSupported. Maybe that’s the way to go.

Ugh, right. This is the biggest reason then.

It doesn’t say anything about this explicitly in Packets (Debugging with GDB), but:

‘v’
Packets starting with ‘v’ are identified by a multi-letter name, up to the first ‘;’ or ‘?’ (or the end of the packet).

‘v’ being the only one called out this way implies that this is an issue for the other packets.

Agreed. We’ll have to handle it if the client receives it, but we can’t draw a definitive conclusion from it.

Work for a new packet > work for a new qSupported tag, in my mind.

So the justification I would personally go with, in order of priority, is:

  • There is an existing packet whose name is a prefix of MultiMemRead, so probing by sending the new packet gets an ambiguous Exx error rather than the empty response that would signal “unsupported”.
  • We would also be taking a risk that the stub interprets this as a valid M request.
  • We send qSupported anyway, so using it saves us sending 1 extra packet in the case where the stub does not support MultiMemRead.
  • The amount of code to do the probing is basically the same as to look for a qSupported tag, and we can do it in a more convenient place, rather than right before trying to read memory.

Thank you Jason, David, for the input and summaries! I’ll add the justifications you described in the PRs.