[RFC] Improving protocol-level compatibility between LLDB and GDB

Hi, everyone.

I'm considering running some effort to improve protocol-level
compatibility between LLDB and GDB. I'd like to hear if there's
interest in such patches being accepted into LLDB.

My goal would be to make it possible to use LLDB to connect to gdbserver
(and other servers implementing the same protocol), and to be able to
use the full set of the debugger's features while doing that. Ideally,
connecting to lldb-server from GDB would be supported as well.

I think the first blockers for this project are the existing
implementation bugs in LLDB. For example, the vFile implementation is
documented as using incorrect data encoding and open flags. This is not
something that can be trivially fixed without breaking compatibility
between different versions of LLDB.

My current idea would be to add some logic to distinguish the current
(i.e. 'old') versions of LLDB from GDB, and to have new versions of LLDB
indicate GDB protocol fixes via qSupported.

For example, unless I'm mistaken, 'QThreadSuffixSupported' is purely
an LLDB extension. Let's say we implement GDB-compatible vFile packets
as a 'gdb-compat:vFile' feature.

The client would:

1. Send 'gdb-compat:vFile' in qSupported to indicate that it's ready to
use correct GDB-style packets.

2. Check the server's qSupported response. Now:

- if it contains 'gdb-compat:vFile+', then we're dealing with a new
version of lldb-server and we use GDB-style vFile packets,

- otherwise, if it contains 'QThreadSuffixSupported+', then we're
dealing with an old version of lldb-server and we use LLDB-style vFile
packets,

- otherwise, we assume we're dealing with real GDB, and we use GDB-style
packets.
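
Putting the client-side checks together, here is a minimal sketch of the decision in C++; the qSupported parsing and the feature map are assumed helpers, not actual LLDB API:

    #include <map>
    #include <string>

    enum class VFileStyle { GDB, LLDB };

    // Pick the vFile packet encoding based on the server's qSupported
    // reply, pre-parsed into feature-name -> advertised-with-'+' pairs.
    VFileStyle ChooseVFileStyle(const std::map<std::string, bool> &features) {
      auto has = [&](const char *name) {
        auto it = features.find(name);
        return it != features.end() && it->second;
      };
      if (has("gdb-compat:vFile"))
        return VFileStyle::GDB;  // new lldb-server with the fix
      if (has("QThreadSuffixSupported"))
        return VFileStyle::LLDB; // old lldb-server, use the legacy encoding
      return VFileStyle::GDB;    // anything else: assume real GDB
    }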

On the server side, we would similarly check for 'gdb-compat:vFile+' in
the client's qSupported, and for a call to 'QThreadSuffixSupported', to
determine whether we're dealing with a GDB or an LLDB client.

What do you think?

> Hi, everyone.
>
> I'm considering running some effort to improve protocol-level
> compatibility between LLDB and GDB. I'd like to hear if there's
> interest in such patches being accepted into LLDB.

Yes!

> My goal would be to make it possible to use LLDB to connect to gdbserver
> (and other servers implementing the same protocol), and to be able to
> use the full set of the debugger's features while doing that. Ideally,
> connecting to lldb-server from GDB would be supported as well.

That would be great. Anything we can do to make GDB and LLDB play well with any gdbserver or lldb-server would be very nice.

> I think the first blockers for this project are the existing
> implementation bugs in LLDB. For example, the vFile implementation is
> documented as using incorrect data encoding and open flags. This is not
> something that can be trivially fixed without breaking compatibility
> between different versions of LLDB.

We should just fix this bug in both LLDB's client logic and lldb-server, IMHO. We typically distribute "lldb" and "lldb-server" together, so this shouldn't be a huge problem.

> My current idea would be to add some logic to distinguish the current
> (i.e. 'old') versions of LLDB from GDB, and to have new versions of LLDB
> indicate GDB protocol fixes via qSupported.
>
> For example, unless I'm mistaken, 'QThreadSuffixSupported' is purely
> an LLDB extension.

I believe it. We did this because having to first select a thread and then read a register means you have to send two packets and make sure no other packets are sent in between. We did a lot of work to reduce the number of packets between the debugger and the GDB server, as latency is what slows down debug sessions; if you have high latency and send a lot more packets, then things slow down quite a bit.
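
To illustrate the saving (using the thread ID that shows up in the packet logs later in this thread), a plain register read needs a thread selection first:

    Hg183b166          (select thread 0x183b166 for subsequent operations)
    p0                 (read register 0 of the selected thread)

while with the thread-suffix extension it is a single packet:

    p0;thread:183b166;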

> Let's say we implement GDB-compatible vFile packets
> as a 'gdb-compat:vFile' feature.
>
> The client would:
>
> 1. Send 'gdb-compat:vFile' in qSupported to indicate that it's ready to
> use correct GDB-style packets.
>
> 2. Check the server's qSupported response. Now:
>
> - if it contains 'gdb-compat:vFile+', then we're dealing with a new
> version of lldb-server and we use GDB-style vFile packets,
>
> - otherwise, if it contains 'QThreadSuffixSupported+', then we're
> dealing with an old version of lldb-server and we use LLDB-style vFile
> packets,
>
> - otherwise, we assume we're dealing with real GDB, and we use GDB-style
> packets.
>
> On the server side, we would similarly check for 'gdb-compat:vFile+' in
> the client's qSupported, and for a call to 'QThreadSuffixSupported', to
> determine whether we're dealing with a GDB or an LLDB client.
>
> What do you think?

I would be fine just fixing the bugs in the LLDB implementation and moving forward. Happy to hear others chime in, though, if they feel differently.

The other main issue LLDB has when using other GDB servers is that the dynamic register information is not enough for the debugger to live on, unless there is some hard-coded support in the debugger that can help fill in the register numberings. The GDB server has its own numbers, and that is great, but in order to be truly dynamic, we need to know the compiler register numbers (such as the reg numbers used for .eh_frame) and the DWARF register numbers for debug info that uses register numbers (these are usually the same as the compiler register numbers, but they do sometimes differ, like on x86). LLDB also likes to know "generic" register numbers, like which register is the PC (RIP for x86_64, EIP for x86, etc.), SP, FP and a few more. lldb-server has extensions for this so that the dynamic register info it emits is enough for LLDB. We have added extra key/value pairs to the XML that is retrieved via "target.xml" so that it can be complete. See the function in lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:

bool ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
                    GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP abi_sp,
                    uint32_t &reg_num_remote, uint32_t &reg_num_local);

There are many keys we added: "encoding", "format", "gcc_regnum", "ehframe_regnum", "dwarf_regnum", "generic", "value_regnums", "invalidate_regnums", "dynamic_size_dwarf_expr_bytes"
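
For instance, a single <reg> element in target.xml carrying some of these extra keys could look like this (the numbers here are illustrative, not taken from a real stub):

    <reg name="rip" regnum="16" bitsize="64" encoding="uint"
         format="hex" group="general" ehframe_regnum="16"
         dwarf_regnum="16" generic="pc"/>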

I look forward to seeing any patches that help move this forward!

Greg Clayton

> I think the first blockers for this project are the existing
> implementation bugs in LLDB. For example, the vFile implementation is
> documented as using incorrect data encoding and open flags. This is not
> something that can be trivially fixed without breaking compatibility
> between different versions of LLDB.

> We should just fix this bug in both LLDB's client logic and lldb-server, IMHO. We typically distribute "lldb" and "lldb-server" together, so this shouldn't be a huge problem.

Hmm, I've focused on this because I recall hearing that OSX users
sometimes run a new client against the system server... but now I realized
this isn't relevant to LLGS ;-). Still, I'm happy to do things
the right way if people feel like it's needed, or the easy way if it's
not.

> The other main issue LLDB has when using other GDB servers is that the dynamic register information is not enough for the debugger to live on, unless there is some hard-coded support in the debugger that can help fill in the register numberings. The GDB server has its own numbers, and that is great, but in order to be truly dynamic, we need to know the compiler register numbers (such as the reg numbers used for .eh_frame) and the DWARF register numbers for debug info that uses register numbers (these are usually the same as the compiler register numbers, but they do sometimes differ, like on x86). LLDB also likes to know "generic" register numbers, like which register is the PC (RIP for x86_64, EIP for x86, etc.), SP, FP and a few more. lldb-server has extensions for this so that the dynamic register info it emits is enough for LLDB. We have added extra key/value pairs to the XML that is retrieved via "target.xml" so that it can be complete. See the function in lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:
>
> bool ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
>                     GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP abi_sp,
>                     uint32_t &reg_num_remote, uint32_t &reg_num_local);
>
> There are many keys we added: "encoding", "format", "gcc_regnum", "ehframe_regnum", "dwarf_regnum", "generic", "value_regnums", "invalidate_regnums", "dynamic_size_dwarf_expr_bytes"

Yes, this is probably going to be the hardest part. While working
on plugins, I've found LLDB's register implementation very hard to figure
out, especially since the plugins seem to be a mix of new, old and older
solutions to the same problem.

We will probably need more ground-level design changes too. IIRC lldb
sends YMM registers as a whole (i.e. overlapping with the XMM
registers) while GDB sends them split, as in XSAVE. I'm not yet sure
how to handle this best -- if we don't want to push the extra complexity
on plugins, it might make sense to decouple the packet format from
the data passed to plugins.
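
For reference, GDB's target descriptions model an AVX register as the 128-bit xmmN register from the SSE feature plus a separate ymmNh upper half, roughly like this (a paraphrase of the org.gnu.gdb.i386 features, not a verbatim copy):

    <feature name="org.gnu.gdb.i386.sse">
      <reg name="xmm0" bitsize="128" type="vec128"/>
    </feature>
    <feature name="org.gnu.gdb.i386.avx">
      <reg name="ymm0h" bitsize="128" type="uint128"/>
    </feature>

GDB then composes the full ymm0 value from the two halves, whereas lldb-server transfers ymm0 whole, overlapping xmm0.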

I am very happy to see this effort and I fully encourage it.

> I think the first blockers for this project are the existing
> implementation bugs in LLDB. For example, the vFile implementation is
> documented as using incorrect data encoding and open flags. This is not
> something that can be trivially fixed without breaking compatibility
> between different versions of LLDB.

> We should just fix this bug in both LLDB's client logic and lldb-server, IMHO. We typically distribute "lldb" and "lldb-server" together, so this shouldn't be a huge problem.

> Hmm, I've focused on this because I recall hearing that OSX users
> sometimes run a new client against the system server... but now I realized
> this isn't relevant to LLGS ;-). Still, I'm happy to do things
> the right way if people feel like it's needed, or the easy way if it's
> not.

The vFile packets are used in the "platform" mode of the connection (which, btw, is also something that gdb does not have), and that is implemented by lldb-server on all hosts (although I think Apple may have some custom platform implementations as well). In any case, though, changing flag values on the client will affect all servers that it communicates with, regardless of the platform.

At one point, Jason cared enough about this to add a warning to the code about not changing these constants. I'd suggest checking with him whether this is still relevant.

Or just going with your proposed solution, which sounds perfectly reasonable to me...

> The other main issue LLDB has when using other GDB servers is that the dynamic register information is not enough for the debugger to live on, unless there is some hard-coded support in the debugger that can help fill in the register numberings. The GDB server has its own numbers, and that is great, but in order to be truly dynamic, we need to know the compiler register numbers (such as the reg numbers used for .eh_frame) and the DWARF register numbers for debug info that uses register numbers (these are usually the same as the compiler register numbers, but they do sometimes differ, like on x86). LLDB also likes to know "generic" register numbers, like which register is the PC (RIP for x86_64, EIP for x86, etc.), SP, FP and a few more. lldb-server has extensions for this so that the dynamic register info it emits is enough for LLDB. We have added extra key/value pairs to the XML that is retrieved via "target.xml" so that it can be complete. See the function in lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:
>
> bool ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
>                     GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP abi_sp,
>                     uint32_t &reg_num_remote, uint32_t &reg_num_local);
>
> There are many keys we added: "encoding", "format", "gcc_regnum", "ehframe_regnum", "dwarf_regnum", "generic", "value_regnums", "invalidate_regnums", "dynamic_size_dwarf_expr_bytes"

> Yes, this is probably going to be the hardest part. While working
> on plugins, I've found LLDB's register implementation very hard to figure
> out, especially since the plugins seem to be a mix of new, old and older
> solutions to the same problem.
>
> We will probably need more ground-level design changes too. IIRC lldb
> sends YMM registers as a whole (i.e. overlapping with the XMM
> registers) while GDB sends them split, as in XSAVE. I'm not yet sure
> how to handle this best -- if we don't want to push the extra complexity
> on plugins, it might make sense to decouple the packet format from
> the data passed to plugins.

Yes, this is definitely going to be the trickiest part, and probably deserves its own RFC. However, I want to note that in the past discussions, the consensus (between Jason and me, at least) has been to move away from this "rich" register information transfer. For one, because we have this information coded into the client anyway (as people want to communicate with gdb-like stubs).

So, the first, and hopefully not too hard, step towards that could be to get lldb-server to stop sending these extra fields (and fix anything that breaks as a result).

pl

> I am very happy to see this effort and I fully encourage it.

Completely agree. Thanks for Cc'ing me, Pavel; I hadn't seen Michał's thread.

> I think the first blockers for this project are the existing
> implementation bugs in LLDB. For example, the vFile implementation is
> documented as using incorrect data encoding and open flags. This is not
> something that can be trivially fixed without breaking compatibility
> between different versions of LLDB.

> We should just fix this bug in both LLDB's client logic and lldb-server, IMHO. We typically distribute "lldb" and "lldb-server" together, so this shouldn't be a huge problem.

> Hmm, I've focused on this because I recall hearing that OSX users
> sometimes run a new client against the system server... but now I realized
> this isn't relevant to LLGS ;-). Still, I'm happy to do things
> the right way if people feel like it's needed, or the easy way if it's
> not.

> The vFile packets are used in the "platform" mode of the connection (which, btw, is also something that gdb does not have), and that is implemented by lldb-server on all hosts (although I think Apple may have some custom platform implementations as well). In any case, though, changing flag values on the client will affect all servers that it communicates with, regardless of the platform.
>
> At one point, Jason cared enough about this to add a warning to the code about not changing these constants. I'd suggest checking with him whether this is still relevant.
>
> Or just going with your proposed solution, which sounds perfectly reasonable to me...

The main backwards compatibility issue for Apple is that lldb needs to talk to old debugservers on iOS devices, where debugserver can't be updated. I know of three protocol bugs we have today:

- vFile:open flags
- vFile:pread/pwrite base (https://bugs.llvm.org/show_bug.cgi?id=47820)
- A packet base (https://bugs.llvm.org/show_bug.cgi?id=42471)

debugserver doesn't implement vFile packets. So for those, we only need to worry about lldb/lldb-server/lldb-platform.

lldb-platform is a freestanding platform-packets stub I wrote for Darwin systems a while back. Real smol, it doesn't link to/use any llvm/lldb code. I never upstreamed it because it doesn't really fit in with the llvm/lldb projects in any way and it's not super interesting. I was tired of tracking down complicated bugs and wanted easier bugs. It implements the vFile packets; it only does the platform packets and runs debugserver for everything else.

Technically a modern lldb could need to communicate with an old lldb-platform, but it's much more of a corner case and I'm not super worried about it; we can deal with that inside Apple (that is, I can be responsible for worrying about it).

For vFile:open and vFile:pread/pwrite, I say we just change them in lldb/lldb-server and it's up to me to change them in lldb-platform at the same time.

For the A packet, debugserver is using base 10,

    errno = 0;
    arglen = strtoul(buf, &c, 10);
    if (errno != 0 && arglen == 0) {
      return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
                                    "arglen not a number on 'A' pkt");
    }
[..]
    errno = 0;
    argnum = strtoul(buf, &c, 10);
    if (errno != 0 && argnum == 0) {
      return HandlePacket_ILLFORMED(__FILE__, __LINE__, p,
                                    "argnum not a number on 'A' pkt");
    }

as does lldb,

    packet.PutChar('A');
    for (size_t i = 0, n = argv.size(); i < n; ++i) {
      arg = argv[i];
      const int arg_len = strlen(arg);
      if (i > 0)
        packet.PutChar(',');
      packet.Printf("%i,%i,", arg_len * 2, (int)i);
      packet.PutBytesAsRawHex8(arg, arg_len);

and lldb-server,

    // Decode the decimal argument string length. This length is the number of
    // hex nibbles in the argument string value.
    const uint32_t arg_len = packet.GetU32(UINT32_MAX);
    if (arg_len == UINT32_MAX)
      success = false;
    else {
      // Make sure the argument hex string length is followed by a comma
      if (packet.GetChar() != ',')
        success = false;
      else {
        // Decode the argument index. We ignore this really because who would
        // really send down the arguments in a random order???
        const uint32_t arg_idx = packet.GetU32(UINT32_MAX);

uint32_t StringExtractor::GetU32(uint32_t fail_value, int base) {
  if (m_index < m_packet.size()) {
    char *end = nullptr;
    const char *start = m_packet.c_str();
    const char *cstr = start + m_index;
    uint32_t result = static_cast<uint32_t>(::strtoul(cstr, &end, base));

where 'base' defaults to 0, which strtoul treats as base 10 unless the number starts with 0x.
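
That is, strtoul's radix inference works like this:

    strtoul("10",   NULL, 0);   // == 10, no prefix means decimal
    strtoul("0x10", NULL, 0);   // == 16, the "0x" prefix switches to hex
    strtoul("10",   NULL, 16);  // == 16, explicit base, no prefix needed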

The A packet one is the trickiest to clean up, IMO. We have two signals that can be useful. The first is debugserver's response to the qGDBServerVersion packet,

(lldb) process plugin packet send qGDBServerVersion
  packet: qGDBServerVersion
response: name:debugserver;version:1205.2;

which hilariously no one else does. This can tell us definitively that we're talking to debugserver. And we can add a feature request to qSupported, like

send packet: "qSupported:xmlRegisters=i386,arm,mips,arc;a-packet-base16;"
read packet: "qXfer:features:read+;PacketSize=20000;qEcho+;a-packet-base16+"

This tells us that we're talking to a debugserver that can handle base-16 numbers in the A packet, and will expect them. And we can test whether the remote stub is debugserver: if it's debugserver and it did not say it supports this, then we need to send base 10.
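
A minimal sketch of the resulting client-side decision (the two inputs are derived from qSupported and qGDBServerVersion as described above; the function is hypothetical, not actual LLDB API):

    // Radix for the arglen/argnum fields of the A packet.
    int ChooseAPacketBase(bool stub_advertised_a_packet_base16,
                          bool stub_is_debugserver) {
      if (stub_advertised_a_packet_base16)
        return 16; // the stub opted into the fixed encoding
      if (stub_is_debugserver)
        return 10; // old debugserver expects the historical base 10
      return 16;   // otherwise assume a stub that follows the GDB spec
    }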

> The other main issue LLDB has when using other GDB servers is that the dynamic register information is not enough for the debugger to live on, unless there is some hard-coded support in the debugger that can help fill in the register numberings. The GDB server has its own numbers, and that is great, but in order to be truly dynamic, we need to know the compiler register numbers (such as the reg numbers used for .eh_frame) and the DWARF register numbers for debug info that uses register numbers (these are usually the same as the compiler register numbers, but they do sometimes differ, like on x86). LLDB also likes to know "generic" register numbers, like which register is the PC (RIP for x86_64, EIP for x86, etc.), SP, FP and a few more. lldb-server has extensions for this so that the dynamic register info it emits is enough for LLDB. We have added extra key/value pairs to the XML that is retrieved via "target.xml" so that it can be complete. See the function in lldb/source/Plugins/Process/gdb-remote/ProcessGDBRemote.cpp:
>
> bool ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
>                     GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP abi_sp,
>                     uint32_t &reg_num_remote, uint32_t &reg_num_local);
>
> There are many keys we added: "encoding", "format", "gcc_regnum", "ehframe_regnum", "dwarf_regnum", "generic", "value_regnums", "invalidate_regnums", "dynamic_size_dwarf_expr_bytes"

> Yes, this is probably going to be the hardest part. While working
> on plugins, I've found LLDB's register implementation very hard to figure
> out, especially since the plugins seem to be a mix of new, old and older
> solutions to the same problem.
>
> We will probably need more ground-level design changes too. IIRC lldb
> sends YMM registers as a whole (i.e. overlapping with the XMM
> registers) while GDB sends them split, as in XSAVE. I'm not yet sure
> how to handle this best -- if we don't want to push the extra complexity
> on plugins, it might make sense to decouple the packet format from
> the data passed to plugins.

> Yes, this is definitely going to be the trickiest part, and probably deserves its own RFC. However, I want to note that in the past discussions, the consensus (between Jason and me, at least) has been to move away from this "rich" register information transfer. For one, because we have this information coded into the client anyway (as people want to communicate with gdb-like stubs).

Yes, agreed. The remote stub should tell us a register name and what register number it wants to use to refer to it. Everything else is gravy (unnecessary). If we want to support the g/G packets, I would like to get the offset into the g/G packet.

The rest of the register numbers -- eh_frame, dwarf, ABI argument register convenience names -- come from the ABI. We can use the register names to match these up -- the stub says "I've got an 'r12' and I will refer to it as register number 53" and lldb looks up r12 and gets the rest of the register information from that. It assumes we can all agree on register names, but otherwise I think it's fine.
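
A sketch of that name-based matching, assuming a hypothetical ABI-side table keyed by register name (the x86_64 numbers are the usual System V DWARF ones):

    #include <cstdint>
    #include <map>
    #include <string>

    // Metadata the ABI could supply for a known register name.
    struct AbiRegInfo {
      uint32_t ehframe_regnum;
      uint32_t dwarf_regnum;
      const char *generic; // "pc", "sp", "fp", ... or nullptr
    };

    // Illustrative x86_64 entries; a real table would cover the full set.
    static const std::map<std::string, AbiRegInfo> g_abi_regs = {
        {"r12", {12, 12, nullptr}},
        {"rsp", {7, 7, "sp"}},
        {"rip", {16, 16, "pc"}},
    };

    // The stub said "I've got an 'r12' and I will refer to it as
    // register number 53"; we keep 53 for the wire protocol and fill
    // in everything else from the ABI by name.
    bool LookUpAbiInfo(const std::string &name, AbiRegInfo &info) {
      auto it = g_abi_regs.find(name);
      if (it == g_abi_regs.end())
        return false; // unknown name: live with the minimal info
      info = it->second;
      return true;
    }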

As for xmm/ymm/zmm, Greg has a scheme where we can specify registers that overlap in the returned target.xml. This allows us to say we have registers al/ah/ax/eax/rax and that they're all the same actual register, so if any one of them is modified, they're all modified, e.g.

  <reg name="rax" regnum="0" offset="0" bitsize="64" group="general" group_id="1" ehframe_regnum="0" dwarf_regnum="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="eax" regnum="21" offset="0" bitsize="32" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="ax" regnum="37" offset="0" bitsize="16" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="ah" regnum="53" offset="1" bitsize="8" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

  <reg name="al" regnum="57" offset="0" bitsize="8" group="general" group_id="1" value_regnums="0" invalidate_regnums="0,21,37,53,57"/>

debugserver sends up the value of rax at a stop along with all the GPRs,

< 19> send packet: $vCont;s:183b166#8d
< 626> read packet: $T05thread:183b166;threads:183b166;thread-pcs:100003502;00:f034000001000000;...

and I can print any of these variants without any more packets being sent, because lldb knows it already has all of them.

(lldb) reg read al ah ax eax rax
      al = 0xf0
      ah = 0x34
      ax = 0x34f0
     eax = 0x000034f0
     rax = 0x00000001000034f0 a.out`main at b.cc:3
(lldb)

(I had packet logging turned on here; you'll have to take my word that no packets were sent :wink:)

debugserver describes xmm/ymm/zmm the same way, so when I go to read one, it gets the full register contents -

(lldb) reg read xmm0
< 23> send packet: $p63;thread:183b166;#9c
< 132> read packet: $ffff0000000000000000000000ff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000#00
    xmm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xff 0x00 0x00}

(lldb) reg read ymm0
    ymm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}

(lldb) reg read zmm0
    zmm0 = {0xff 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0xff 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00}
(lldb)

So my request to read xmm0 actually fetched zmm0.

J

I was looking at lldb-platform and I noticed I implemented the A packet in it, and I was worried I might have the same radix error as lldb in there, but this code I wrote made me laugh:

    const char *p = pkt.c_str() + 1; // skip the 'A'
    std::vector<std::string> packet_contents = get_fields_from_delimited_string (p, ',');
    std::vector<std::string> inferior_arguments;
    std::string executable_filename;

    if (packet_contents.size() % 3 != 0)
    {
        log_error ("A packet received with fields that are not a multiple of 3: %s\n", pkt.c_str());
    }

    unsigned long tuples = packet_contents.size() / 3;
    for (int i = 0; i < tuples; i++)
    {
        std::string length_of_argument_str = packet_contents[i * 3];
        std::string argument_number_str = packet_contents[(i * 3) + 1];
        std::string argument = decode_asciihex (packet_contents[(i * 3) + 2].c_str());

        int len_of_argument;
        if (ascii_to_i (length_of_argument_str, 16, len_of_argument) == false)
            log_error ("Unable to parse length-of-argument field of A packet: %s in full packet %s\n",
                       length_of_argument_str.c_str(), pkt.c_str());

        int argument_number;
        if (ascii_to_i (argument_number_str, 16, argument_number) == false)
            log_error ("Unable to parse argument-number field of A packet: %s in full packet %s\n",
                       argument_number_str.c_str(), pkt.c_str());

        if (argument_number == 0)
        {
            executable_filename = argument;
        }
        inferior_arguments.push_back (argument);
    }

These A packet fields give you the name of the binary and the arguments to pass on the cmdline. My guess is that at some point in the past the arguments were not asciihex encoded, so you genuinely needed to know the length of each argument. But now, of course, you could write a perfectly fine client that mostly ignores argnum and arglen altogether.

I wrote a fix for the A packet for debugserver using an 'a-packet-base16' feature in qSupported to activate it, and tested it by hand; it works correctly. If we're all agreed that this is how we'll request/indicate these protocol fixes, I can put up a phab etc. and get this started.

debugserver-patch.txt (3.59 KB)

> I was looking at lldb-platform and I noticed I implemented the A packet in it, and I was worried I might have the same radix error as lldb in there, but this code I wrote made me laugh:
>
> [..]
>
> These A packet fields give you the name of the binary and the arguments to pass on the cmdline. My guess is that at some point in the past the arguments were not asciihex encoded, so you genuinely needed to know the length of each argument. But now, of course, you could write a perfectly fine client that mostly ignores argnum and arglen altogether.

That's quite clever, actually. I like it. :slight_smile:

> I wrote a fix for the A packet for debugserver using an 'a-packet-base16' feature in qSupported to activate it, and tested it by hand; it works correctly. If we're all agreed that this is how we'll request/indicate these protocol fixes, I can put up a phab etc. and get this started.

I think that's fine, though possibly changing the servers to just ignore the length fields, like you did above, might be even better, as then they will work fine regardless of which client they are talking to. They should still advertise their non-brokenness so that the client can form the right packet, but this will be just a formality to satisfy protocol purists (or pickier servers), and not make a functional difference.

pl

Ah, good point. Let me rework the debugserver patch and look at lldb-server. I wrote lldb-platform to spec and hadn't even noticed at the time that it was expecting (and ignoring) base 16 here when lldb was using base 10.

The only possible wrinkle I can imagine is if someone took advantage of the argnum to specify a zero-length string argument. Like they specify args 0, 1, 3, and expect the remote stub to pass an empty string as arg 2. It's weird that the packet even includes argnum, tbh; I can't think of any other reason why you would do it except this.
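
For example, a hypothetical A packet doing exactly that (arguments hex-encoded, lengths counted in hex nibbles) might be:

    Ae,0,2f62696e2f6c73,4,1,2d6c,2,3,2f

i.e. arg 0 = "/bin/ls", arg 1 = "-l", arg 3 = "/", with arg 2 never mentioned -- does the stub insert an empty string in its place, or silently shift the arguments?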

J

Thank you all for your encouraging replies. Moritz Systems has received
funding for enabling KGDB compatibility in LLDB, and as part of this
contract we'd like to work on improving the compatibility between LLDB
and the gdbserver protocol as much as time permits. I've started comparing
LLDB's code against the protocol documentation and/or the gdb sources (where
the docs are lacking) for potential incompatibilities, and I should be able
to submit the first patches today.

Considering your replies, I am going to focus on changing things
in place. However, if during review we decide that backwards
compatibility is desirable, I can do that as well.

We will discuss the individual changes on Phabricator.