This is an RFC describing a new type of gdb-remote protocol packet, which
enables a debugger to request multiple memory reads from the inferior at once.
The main motivation is performance: if a debugger knows it will need to read
memory from multiple addresses, it can request all reads at once, reducing the
number of packets exchanged with the remote server from 2 * num_reads to two
(the request and the reply). On high-bandwidth but high-latency networks, this
is a major speed-up.
We’ve found that, when debugging Swift programs using Swift’s concurrency
runtime, LLDB has to read up to two memory addresses per OS thread on every
stop. As such, on highly concurrent systems, latency for these reads can create
a visible slowdown in the debugging experience. This is particularly concerning
when doing step operations, where the debugger may privately stop the inferior
multiple times before a public stop is reached.
The new packet
Name: MultiMemRead
Request syntax: MultiMemRead:ranges:<base_addr1>,<num_bytes1>;<base_addr2>,<num_bytes2>;...;
Reply syntax: <num_bytes_read1>,<bytes1>;<num_bytes_read2>,<bytes2>;...;
Each request supplies a base address to read from, and a number of bytes to
read. For each such request, the server replies with the number of bytes that
it was able to read, followed by the bytes read as binary data. Escaping of
special characters (# $ } *) is handled as in the x packet.
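
To make the escaping rule concrete, here is a minimal sketch of the binary encoding used by the x packet, as described in the GDB remote protocol documentation: the bytes `#`, `$`, `}` and (in replies) `*` are sent as `}` followed by the original byte XORed with 0x20. The function names are hypothetical and are not part of LLDB or of this proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Bytes escaped by the gdb-remote binary encoding: '#', '$' and '}' always,
// plus '*' in replies sent by the stub.
static bool NeedsEscape(uint8_t byte) {
  return byte == '#' || byte == '$' || byte == '}' || byte == '*';
}

// Encode raw bytes for the wire: an escaped byte becomes '}' followed by the
// original byte XORed with 0x20.
std::string EscapeBinary(const std::vector<uint8_t> &raw) {
  std::string out;
  for (uint8_t byte : raw) {
    if (NeedsEscape(byte)) {
      out.push_back('}');
      out.push_back(static_cast<char>(byte ^ 0x20));
    } else {
      out.push_back(static_cast<char>(byte));
    }
  }
  assert(out.size() >= raw.size()); // escaping never shrinks the data
  return out;
}

// Decode wire bytes back into raw bytes. A trailing lone '}' is dropped in
// this sketch; a real implementation would likely reject it as malformed.
std::vector<uint8_t> UnescapeBinary(const std::string &wire) {
  std::vector<uint8_t> out;
  for (size_t i = 0; i < wire.size(); ++i) {
    uint8_t byte = static_cast<uint8_t>(wire[i]);
    if (byte == '}') {
      if (i + 1 == wire.size())
        break;
      byte = static_cast<uint8_t>(static_cast<uint8_t>(wire[++i]) ^ 0x20);
    }
    out.push_back(byte);
  }
  return out;
}
```

One consequence worth keeping in mind: escaping can up to double the size of a reply on the wire, which matters when deciding how many reads fit in one packet.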
There are two kinds of errors that must be handled by the packet:
- An error is returned for the entire packet; for example, the server does not
  support this packet.
- An error is returned for specific memory addresses; for example, the
  server was unable to read memory from the second address provided.
The first kind can be represented by returning an E packet.
The second kind can be represented by returning zero for the number of bytes
read, for each address whose read operation failed.
Example
send: MultiMemRead:ranges:0x100200,4;0x200200,2;0x400000,4;
receive: 4,<binary encoding of abcd1000>;0,;2,<binary encoding of eeff>;
The four bytes read from memory address 0x100200 were 0xabcd1000. No bytes
were read from 0x200200. Only two out of the four bytes requested from
0x400000 were read, and they were 0xeeff.
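
The exchange above can be sketched in code as follows. This is an illustration under stated assumptions, not the proposed implementation: `MemRange`, `MultiMemReadReply`, `BuildMultiMemReadRequest` and `ParseMultiMemReadReply` are hypothetical names; addresses are assumed to be written as 0x-prefixed hex and byte counts as decimal, which the RFC does not pin down; and the reply payload is assumed to have been unescaped already, with the `$...#checksum` framing stripped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// One requested range: base address and number of bytes (hypothetical type).
struct MemRange {
  uint64_t base_addr;
  size_t num_bytes;
};

// Build the request payload, following the request syntax line of the RFC.
std::string BuildMultiMemReadRequest(const std::vector<MemRange> &ranges) {
  std::string request = "MultiMemRead:ranges:";
  for (const MemRange &range : ranges) {
    char piece[64];
    int written = std::snprintf(
        piece, sizeof(piece), "0x%llx,%zu;",
        static_cast<unsigned long long>(range.base_addr), range.num_bytes);
    assert(written > 0 && static_cast<size_t>(written) < sizeof(piece));
    request += piece;
  }
  return request;
}

// Parsed reply: ok is false for a whole-packet failure or a malformed reply;
// otherwise reads holds one (possibly empty) byte vector per requested range.
struct MultiMemReadReply {
  bool ok = false;
  std::vector<std::vector<uint8_t>> reads;
};

MultiMemReadReply ParseMultiMemReadReply(const std::string &reply) {
  MultiMemReadReply result;
  // An E packet signals a whole-packet error; an empty reply is the usual
  // gdb-remote way of saying the packet is unsupported.
  if (reply.empty() || reply[0] == 'E')
    return result;
  size_t pos = 0;
  while (pos < reply.size()) {
    size_t comma = reply.find(',', pos);
    if (comma == std::string::npos || comma == pos)
      return result;
    size_t count = 0;
    for (size_t i = pos; i < comma; ++i) {
      if (reply[i] < '0' || reply[i] > '9')
        return result;
      count = count * 10 + static_cast<size_t>(reply[i] - '0');
    }
    pos = comma + 1;
    // The data is binary, so it is consumed by length, never by searching.
    if (reply.size() - pos < count)
      return result;
    result.reads.emplace_back(reply.begin() + pos,
                              reply.begin() + pos + count);
    pos += count;
    if (pos < reply.size()) {
      if (reply[pos] != ';')
        return result;
      ++pos; // the separator after the last entry is treated as optional
    }
  }
  result.ok = true;
  return result;
}
```

With the example reply, such a parser would yield three entries of sizes 4, 0 and 2; the caller treats the empty second entry as a failed read of 0x200200.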
Question: Should the response be given as hexadecimal characters?
This seems wasteful, but are there benefits?
Question: Does the packet need options?
Would there be any use for an options = <something> component to the request?
Question: Should the packet allow for more expressive errors per address?
Instead of returning “zero bytes were read”, should the response contain an E
code in place of the <num_bytes>,<bytes> component of the reply?
LLDB API changes
When the inferior process is stopped and LLDB has to perform a memory read, the
existing Process::ReadMemory virtual method is responsible for checking a
per-stop memory cache before calling the abstract Process::DoReadMemory
method. If this address has not been read before, a large chunk of memory
around the address is read at once and saved in the cache, with the expectation
that nearby addresses may be requested by the debugger in this same stop.
Because caching is done in 512-byte chunks, adopting a caching layer for the
MultiMemRead packet would make the reply packet exceedingly large. As such,
this RFC proposes bypassing the caching layer for such packets in an initial
implementation, leaving the door open for improvements in the future.
As a reminder, the signature of Process::ReadMemory is:
virtual size_t ReadMemory(
    lldb::addr_t vm_addr,
    void *buf,
    size_t size,
    Status &error);
The method is virtual because Process implementations can bypass memory
caching by overriding this method.
To support a MultiMemRead packet implementation, a new ReadMemoryVectorized
method is proposed, which would not query the memory cache:
virtual Expected<SmallVector<size_t>> ReadMemoryVectorized(
    ArrayRef<lldb::addr_t> vm_addrs,
    ArrayRef<size_t> read_sizes,
    MutableArrayRef<uint8_t> buf);
Where:
- The vm_addrs and read_sizes arguments are a one-to-one map to the
  base_addr and num_bytes arguments of MultiMemRead.
- The length of buffer buf must be at least the sum of the sizes in
  read_sizes.
- The return value is a sequence of lengths, corresponding to how many bytes
  were read per requested address. If the overall request failed, an error is
  returned instead.
This API reflects the previous design choice of not allowing per-address
errors, instead relying on the number of bytes read to convey failures at a
per-address level. In a way, this has similar semantics to the existing API for
ReadMemory, where programmers are expected to check the Status out-param
and only then check the number of bytes read. In the new API, programmers are
required to check the error before checking the size; this is enforced by
Expected semantics.
A default implementation of ReadMemoryVectorized is provided, which
constructs individual memory read requests without using the new packet kind.
If any request fails and produces an Error object (a Status), the default
implementation can either return an Error for the Expected component of the
API – effectively failing all reads – or it can drop the individual
Error, and set the number of bytes read for the failing address to zero,
while still allowing for the other addresses to be read successfully. The
latter seems more desirable, but feedback is requested.
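
A minimal sketch of the second option (drop the individual error and report zero bytes for that address) might look like the following. It uses standard-library types in place of the LLVM ones so that it compiles on its own. `ReadOneFn` is a hypothetical stand-in for a single-address read such as Process::ReadMemory, and the layout of results in buf (each result placed at the sum of the preceding requested sizes) is an assumption, since the RFC does not specify it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical single-address read: copies up to size bytes into dst and
// returns how many bytes were read, or 0 if the read failed.
using ReadOneFn =
    std::function<size_t(uint64_t addr, uint8_t *dst, size_t size)>;

// Fallback that issues one read per address. A failing read does not abort
// the others; it simply records zero bytes for that address.
std::vector<size_t> ReadMemoryVectorizedFallback(
    const std::vector<uint64_t> &vm_addrs,
    const std::vector<size_t> &read_sizes, std::vector<uint8_t> &buf,
    const ReadOneFn &read_one) {
  assert(vm_addrs.size() == read_sizes.size());
  std::vector<size_t> bytes_read;
  bytes_read.reserve(vm_addrs.size());
  size_t offset = 0;
  for (size_t i = 0; i < vm_addrs.size(); ++i) {
    // buf must hold at least the sum of all requested sizes.
    assert(offset + read_sizes[i] <= buf.size());
    bytes_read.push_back(
        read_one(vm_addrs[i], buf.data() + offset, read_sizes[i]));
    // Assumed layout: advance by the requested size, not the size read, so
    // every result starts at a position the caller can compute up front.
    offset += read_sizes[i];
  }
  return bytes_read;
}
```

Under this approach the only information lost is the reason for each individual failure, which matches what the packet itself can express.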
Concrete Process implementations should override this method if they may
support MultiMemRead. For example, a GDBRemote implementation could look like
this:
Expected<SmallVector<size_t>> ProcessGDBRemote::ReadMemoryVectorized(...) {
  if (!m_gdb_comm.GetMultiMemReadSupported()) // a new query
    return Process::ReadMemoryVectorized(...);
  // use the new packet
}
The implementation should be responsible for breaking up requests that would
otherwise generate packets that are considered too large.
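
One possible way to break up a request is a greedy split, sketched below. The names `MemRange` and `SplitIntoBatches` and the `max_reply_bytes` limit are hypothetical; how the limit would be derived (for example, from the negotiated maximum packet size) is left open here, and the sketch counts only requested data bytes, ignoring length prefixes, separators and escaping overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One requested range: base address and number of bytes (hypothetical type).
struct MemRange {
  uint64_t base_addr;
  size_t num_bytes;
};

// Greedily group ranges so that the bytes requested by each batch sum to at
// most max_reply_bytes. Each batch would become one MultiMemRead packet.
// A single range larger than the limit gets a batch of its own; this sketch
// does not split individual ranges.
std::vector<std::vector<MemRange>>
SplitIntoBatches(const std::vector<MemRange> &ranges,
                 size_t max_reply_bytes) {
  assert(max_reply_bytes > 0);
  std::vector<std::vector<MemRange>> batches;
  size_t batch_bytes = 0;
  for (const MemRange &range : ranges) {
    // Start a new batch when there is none yet, or when adding this range
    // would push the current batch over the limit.
    if (batches.empty() || batch_bytes + range.num_bytes > max_reply_bytes) {
      batches.emplace_back();
      batch_bytes = 0;
    }
    batches.back().push_back(range);
    batch_bytes += range.num_bytes;
  }
  return batches;
}
```

Because the order of ranges is preserved, the per-batch results can be concatenated to produce the same sequence of lengths that a single packet would have returned.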
Question: a single buffer, or multiple buffers?
In the API for ReadMemoryVectorized, the proposal uses a single output buffer
for all the results: MutableArrayRef<uint8_t>. Should it instead use one
buffer per memory read, e.g., ArrayRef<MutableArrayRef<uint8_t>>?
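
To illustrate what the single-buffer option asks of callers, the sketch below recovers per-read views from one shared buffer. `ByteView` and `SliceResults` are hypothetical names, and the layout is an assumption (each result starts at the sum of the preceding requested sizes); with one buffer per read, this bookkeeping would move from the caller into the API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Non-owning view of the bytes obtained by one read (hypothetical type).
struct ByteView {
  const uint8_t *data;
  size_t size;
};

// Split a shared result buffer into one view per requested read.
// read_sizes are the sizes that were requested; bytes_read are the lengths
// that came back (0 for a failed read).
std::vector<ByteView> SliceResults(const std::vector<uint8_t> &buf,
                                   const std::vector<size_t> &read_sizes,
                                   const std::vector<size_t> &bytes_read) {
  assert(read_sizes.size() == bytes_read.size());
  std::vector<ByteView> views;
  views.reserve(read_sizes.size());
  size_t offset = 0;
  for (size_t i = 0; i < read_sizes.size(); ++i) {
    assert(bytes_read[i] <= read_sizes[i]);
    assert(offset + read_sizes[i] <= buf.size());
    views.push_back(ByteView{buf.data() + offset, bytes_read[i]});
    offset += read_sizes[i]; // assumed layout: slots sized by the request
  }
  return views;
}
```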