LLDB use of G/g packets

Hi,

I’m implementing a debugger backend that speaks the GDB remote protocol, and I’m adding extensions to support LLDB as well.

I’ve added support for the qRegisterInfo packet and I’ve noticed in the logs that LLDB uses the p/P packets instead of the G/g packets.

Is there a special reason why it uses multiple packets instead of one?

Maybe I didn’t implement enough “information” packets and LLDB doesn’t know how to read g/G packets? If so, what packets are missing?

Thanks,
Benjamin.

I noticed something I missed when sending the previous email. Looking at the logs, I see LLDB sending a “kill” command for no reason:

<lldb.driver.main-thread> < 18> send packet: $m1039ede00,200#86
1 <lldb.driver.main-thread> < 1> read packet: +
0 <lldb.driver.main-thread> <1028> read packet: $4b4c5752000000000000000000000000c04e0000014e0000004e000000000000000000000000000010de9e030100000014de9e030100000018de9e030100000020
1 <lldb.driver.main-thread> < 1> send packet: +
2 <lldb.driver.main-thread> < 18> send packet: $m1038d4200,200#21
3 <lldb.driver.main-thread> < 1> read packet: +
4 <lldb.driver.main-thread> <1028> read packet: $e97bf4ffff6666666666662e0f1f8400000000009090909090909090909090909090909090909090909090909090909090909090909090909090909090909090e9
5 <lldb.driver.main-thread> < 1> send packet: +
6 <lldb.driver.main-thread> < 21> send packet: $Z0,7fff5fc0d6e5,1#de
7 <lldb.driver.main-thread> < 1> read packet: +
8 <lldb.driver.main-thread> < 6> read packet: $OK#9a
9 <lldb.driver.main-thread> < 1> send packet: +
10 <lldb.driver.main-thread> < 16> send packet: $qfThreadInfo#bb
11 <lldb.driver.main-thread> < 1> read packet: +
12 <lldb.driver.main-thread> < 8> read packet: $m707#0b
13 <lldb.driver.main-thread> < 1> send packet: +
14 <lldb.driver.main-thread> < 16> send packet: $qsThreadInfo#c8
15 <lldb.driver.main-thread> < 1> read packet: +
16 <lldb.driver.main-thread> < 5> read packet: $l#6c
17 <lldb.driver.main-thread> < 1> send packet: +
18 <lldb.driver.main-thread> < 18> send packet: $z0,1005f00d7,1#5a
19 <lldb.driver.main-thread> < 1> read packet: +
20 <lldb.driver.main-thread> < 6> read packet: $OK#9a
21 <lldb.driver.main-thread> < 1> send packet: +
22 <lldb.driver.main-thread> < 21> send packet: $z0,7fff5fc0d6e5,1#fe
23 <lldb.driver.main-thread> < 1> read packet: +
24 <lldb.driver.main-thread> < 6> read packet: $OK#9a
25 <lldb.driver.main-thread> < 1> send packet: +
26 <lldb.driver.main-thread> < 5> send packet: $k#6b
27 <lldb.driver.main-thread> < 1> read packet: +
28 <lldb.driver.main-thread> < 6> read packet: $OK#9a
29 <lldb.driver.main-thread> < 1> send packet: +
30 <lldb.driver.main-thread> < 6> send packet: $p2#a2
31 <lldb.driver.main-thread> < 1> read packet: +
32 <lldb.driver.main-thread> < 4> read packet: $#00
33 <lldb.driver.main-thread> < 1> send packet: +
34 <lldb.driver.main-thread> < 6> send packet: $p3#a3

Here are my commands in the lldb console:

➜ build lldb /bin/ls
Current executable set to '/bin/ls' (x86_64).
(lldb) log enable -f /tmp/packets.txt gdb-remote packets
(lldb) process connect -p gdb-remote connect://localhost:58985
Process 1799 stopped

* thread #1: tid = 0x0707, , stop reason = signal SIGINT
    frame #0:
(lldb)

Why might LLDB send the ‘kill’ command without “permission”?

The weird thing is that after it sends the kill command, it then starts requesting register values using the ‘$p’ command…

Any ideas why this is happening?

Regards,
Benjamin.

Hi,

I'm implementing a debugger backend that speaks the GDB remote protocol, and I'm adding extensions to support LLDB as well.

I've added support for the qRegisterInfo packet and I've noticed in the logs that LLDB uses the p/P packets instead of the G/g packets.

We normally don't need all registers, we just need a few. And all GDB servers we currently communicate with send most of these needed registers in the stop reply packets as "expedited" registers. So if you can afford to, try to expedite all registers in the stop reply packets. Why? Because the p/P and g/G packets are lame in that you have to send the "Hg<tid>" packet to select the current thread prior to sending them (unless you also support the LLDB thread suffix; respond with "OK" to "QThreadSuffixSupported"). So each register read/write turns into two packets that have to be sent.
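For example, a stop reply that expedites the frame pointer, stack pointer, and pc would look something like this (illustrative values; the register numbers come from your own qRegisterInfo numbering, and checksums are elided):

read packet: $T0506:28df8e52ff7f0000;07:c8de8e52ff7f0000;10:28e09a67ff7f0000;thread:707;#..

With those expedited, LLDB usually doesn't have to ask for the most commonly used registers at every stop.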

Is there a special reason why it uses multiple packets instead of one?

The main reason is that when we make function calls, we read the current registers and replace the registers required for function arguments one at a time. The GDB log is much more readable when we see individual registers being modified and read instead of seeing multiple writes of all registers when only one register is changing. We subscribe to the mode where if you write a register and then read it back again, you will see what is actually in the register, not just some temporary copy of the registers that will eventually be written back down to the remote target when you resume. GDB does this caching, but you often won't see that your register write failed until you run and see a weird value come back after your program executes and stops.

So the main reason is that it makes the logs much clearer when debugging issues, since we see the individual reads/writes.
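For example (illustrative register number and value, checksums elided), a single register write followed by a read-back looks like:

send packet: $P5=2a00000000000000#..
read packet: $OK#..
send packet: $p5#..
read packet: $2a00000000000000#..

If the write silently failed, the read-back makes it obvious right away.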

LLDB could be made smarter so that if "p"/"P" is not supported, it falls back to using "g"/"G". The GDBRemoteRegisterContext.cpp file currently has a bool member variable called "m_read_all_at_once" which could be set to true if the "p"/"P" packets aren't supported.
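A hypothetical sketch of that fallback (this is not the actual LLDB code; ReadViaP and ReadViaG are made-up stand-ins for the real packet plumbing):

#include <stdint.h>
#include <stdbool.h>

static bool m_read_all_at_once = false;  // mirrors the member in GDBRemoteRegisterContext.cpp

// Hypothetical stand-ins for the real packet plumbing:
static bool ReadViaP (uint32_t reg_num) { (void)reg_num; return false; }  // sends "p<regnum>"; false on an empty "$#00" reply
static bool ReadViaG (uint32_t reg_num) { (void)reg_num; return true; }   // sends "g" and slices reg_num out of the register blob

bool ReadRegister (uint32_t reg_num)
{
    if (!m_read_all_at_once)
    {
        if (ReadViaP (reg_num))
            return true;
        m_read_all_at_once = true;  // an empty reply means p/P is unsupported; use g/G from now on
    }
    return ReadViaG (reg_num);
}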

Maybe I didn't implement enough "information" packets and LLDB doesn't know how to read g/G packets? If so, what packets are missing?

No, the info you have should be enough. We won't use the g/G packets unless we are going to run an expression and need to back up all registers and then restore them. Expedite as many register values as you can in your stop reply packets and in your "qThreadStopInfo" replies (if you haven't implemented this and are going to be debugging more than one thread, I highly suggest you implement the "qThreadStopInfo" packet soon).
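For example (using the tid from your log; register values and checksums elided), a qThreadStopInfo reply has the same format as a stop reply packet, just for the requested thread:

send packet: $qThreadStopInfo707#..
read packet: $T0b06:...;07:...;10:...;thread:707;#..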

I noticed something I missed when sending the previous email. Looking at the logs, I see LLDB sending a "kill" command for no reason:
<lldb.driver.main-thread> < 18> send packet: $m1039ede00,200#86
  1 <lldb.driver.main-thread> < 1> read packet: +
  0 <lldb.driver.main-thread> <1028> read packet: $4b4c5752000000000000000000000000c04e0000014e0000004e000000000000000000000000000010de9e030100000014de9e030100000018de9e030100000020
  1 <lldb.driver.main-thread> < 1> send packet: +
  2 <lldb.driver.main-thread> < 18> send packet: $m1038d4200,200#21
  3 <lldb.driver.main-thread> < 1> read packet: +
  4 <lldb.driver.main-thread> <1028> read packet: $e97bf4ffff6666666666662e0f1f8400000000009090909090909090909090909090909090909090909090909090909090909090909090909090909090909090e9
  5 <lldb.driver.main-thread> < 1> send packet: +
  6 <lldb.driver.main-thread> < 21> send packet: $Z0,7fff5fc0d6e5,1#de
  7 <lldb.driver.main-thread> < 1> read packet: +
  8 <lldb.driver.main-thread> < 6> read packet: $OK#9a
  9 <lldb.driver.main-thread> < 1> send packet: +
10 <lldb.driver.main-thread> < 16> send packet: $qfThreadInfo#bb
11 <lldb.driver.main-thread> < 1> read packet: +
12 <lldb.driver.main-thread> < 8> read packet: $m707#0b
13 <lldb.driver.main-thread> < 1> send packet: +
14 <lldb.driver.main-thread> < 16> send packet: $qsThreadInfo#c8
15 <lldb.driver.main-thread> < 1> read packet: +
16 <lldb.driver.main-thread> < 5> read packet: $l#6c
17 <lldb.driver.main-thread> < 1> send packet: +
18 <lldb.driver.main-thread> < 18> send packet: $z0,1005f00d7,1#5a
19 <lldb.driver.main-thread> < 1> read packet: +
20 <lldb.driver.main-thread> < 6> read packet: $OK#9a
21 <lldb.driver.main-thread> < 1> send packet: +
22 <lldb.driver.main-thread> < 21> send packet: $z0,7fff5fc0d6e5,1#fe
23 <lldb.driver.main-thread> < 1> read packet: +
24 <lldb.driver.main-thread> < 6> read packet: $OK#9a
25 <lldb.driver.main-thread> < 1> send packet: +
26 <lldb.driver.main-thread> < 5> send packet: $k#6b
27 <lldb.driver.main-thread> < 1> read packet: +
28 <lldb.driver.main-thread> < 6> read packet: $OK#9a
29 <lldb.driver.main-thread> < 1> send packet: +
30 <lldb.driver.main-thread> < 6> send packet: $p2#a2
31 <lldb.driver.main-thread> < 1> read packet: +
32 <lldb.driver.main-thread> < 4> read packet: $#00
33 <lldb.driver.main-thread> < 1> send packet: +
34 <lldb.driver.main-thread> < 6> send packet: $p3#a3

Here are my commands in the lldb console:
➜ build lldb /bin/ls
Current executable set to '/bin/ls' (x86_64).
(lldb) log enable -f /tmp/packets.txt gdb-remote packets
(lldb) process connect -p gdb-remote connect://localhost:58985
Process 1799 stopped
* thread #1: tid = 0x0707, , stop reason = signal SIGINT
    frame #0:
(lldb)

Why might LLDB send the 'kill' command without "permission"?

You will need to debug the LLDB sources to see why this is being sent. I know of no reason why this should be happening, so this is probably a bug.

The weird thing is that after it sends the kill command, it then starts requesting register values using the '$p' command...

Any ideas why this is happening?

No, none at all. You will need to debug this. The only place the "k" packet is sent is from inside:

ProcessGDBRemote::DoDestroy ()

There is logging that can be enabled. Add the "process" category to the "gdb-remote" logging and use the "--stack" option to print out a backtrace:

(lldb) log enable --stack -f /tmp/process.txt gdb-remote process

You should then see a stack backtrace to see who is calling DoDestroy()...

Greg,

First of all thanks for the highly informative answers!

Regarding the 'kill' command that was sent, I also have no idea why it happened, but somewhere during the work day (after a few hours of going through LLDB code) it stopped happening...

So now LLDB reads all the register values correctly using the 'p' packets. Regarding the expedited registers, I believe I already pass them with the stop reply; see this example:
<lldb.process.gdb-remote.async> < 78> read packet: $T0b06:cce2f16cff7f0000;07:28df8e52ff7f0000;10:0000000000000000;thread:707;#15

I hoped that once LLDB got all the registers it would be able to construct the stack frame, but it fails to do so. I thought maybe I was missing the qShlibInfoAddr packet, so I implemented it as well, but that didn't help either.

Have you implemented the qHostInfo packet?

send packet: $qHostInfo#00
read packet: $cputype:16777223;cpusubtype:3;ostype:macosx;watchpoint_exceptions_received:after;vendor:apple;endian:little;ptrsize:8;#00

Maybe we just don't know the target triple (specified by "ostype", "vendor" and here the cpu type + subtype) that we are debugging? You can get around this by specifying a full triple when you make your target:

(lldb) target create --arch x86_64-apple-macosx <EXE>

But it is always better to specify it exactly through the qHostInfo if you can.

The command I didn't implement yet is "qThreadStopInfo", could this be the reason why the stack trace is not constructed or am I missing something else?

No, that wouldn't do it. It could be because you didn't fully specify your generic registers in the responses to the qRegisterInfo packets. It could be due to an unknown target triple that caused no dynamic loader plug-in to be selected. If you can send me a full transcript of all gdb-remote packets offline, I will take a look and see what I can figure out.
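For reference, a qRegisterInfo response that fully specifies a generic register looks something like this (illustrative offset; the generic values LLDB understands include pc, sp, fp, ra, flags, and arg1 through arg8):

send packet: $qRegisterInfo10#..
read packet: $name:rip;bitsize:64;offset:128;encoding:uint;format:hex;set:General Purpose Registers;gcc:16;dwarf:16;generic:pc;#..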

Also, to implement the "qShlibInfoAddr" packet I've used the following snippet:
#include <mach/mach.h>           // task_info(), mach_task_self()
#include <mach-o/dyld_images.h>  // struct dyld_all_image_infos
#include <stdlib.h>              // exit()

struct dyld_all_image_infos* getImageInfosFromKernel ()
{
  task_dyld_info_data_t task_dyld_info;
  mach_msg_type_number_t count = TASK_DYLD_INFO_COUNT;

  if (task_info (mach_task_self (), TASK_DYLD_INFO, (task_info_t)&task_dyld_info, &count)) {
    FAIL ("all_image_infos: task_info() failed");  // FAIL() is our own error macro
    exit (0);
  }

  return (struct dyld_all_image_infos*)(uintptr_t)task_dyld_info.all_image_info_addr;
}

Does it get the required address for the result packet?

Yes, but you wouldn't want to exit(0) if you failed to get the info right? Just return an invalid address.
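That is, the failure branch could become something like this (sketch; your backend would then reply to qShlibInfoAddr with an error):

  if (task_info (mach_task_self (), TASK_DYLD_INFO, (task_info_t)&task_dyld_info, &count) != KERN_SUCCESS)
    return NULL;  // no dyld info available; report an error instead of killing the process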

A few questions for you: what are you trying to debug? What arch? It sounds like this is for a user space MacOSX program? The triple specified when creating the target and/or from the qHostInfo will help determine which shared libraries are loaded and where they are loaded. This will then allow LLDB to resolve "load" addresses back into "file" relative addresses so we can backtrace.

Everything works! It was the “generic:” pair that was missing… I missed it somehow when implementing the “qRegisterInfo” packet response.

I have two last questions (hopefully…):

1. In our backend we currently place the initial breakpoint at:
* thread #1: tid = 0x0707, 0x00007fff679ae028 dyld`_dyld_start, stop reason = signal SIGINT
    frame #0: 0x00007fff679ae028 dyld`_dyld_start

Which doesn’t seem right on OSX since when I try to continue it stops again at:

* thread #1: tid = 0x0707, 0x00007fff679ba6e6 dyld`gdb_image_notifier(dyld_image_mode, unsigned int, dyld_image_info const*) + 1, stop reason = signal SIGTRAP
    frame #0: 0x00007fff679ba6e6 dyld`gdb_image_notifier(dyld_image_mode, unsigned int, dyld_image_info const*) + 1

And on the next continue it crashes. We have encountered something similar on Windows, where there is a “safe” point from which the debugger is allowed to connect, and I’m guessing that this is also the case on OSX. Where does LLDB insert its first breakpoint?

2. It seems as if LLDB doesn’t find the correct PID of the process it connects to; I always get PID 1799, which doesn’t seem right. I didn’t see any packet where this information is communicated, so how does LLDB get the target process PID?

Thanks again for all the great assistance,
Benjamin.

Everything works! It was the "generic:<reg>" pair that was missing... I missed it somehow when implementing the "qRegisterInfo" packet response.

Great!

I have two last questions (hopefully...):
1. In our backend we currently place the initial breakpoint at:
* thread #1: tid = 0x0707, 0x00007fff679ae028 dyld`_dyld_start, stop reason = signal SIGINT
    frame #0: 0x00007fff679ae028 dyld`_dyld_start

This is where all processes stop when started with posix_spawn with the extra attribute flag that waits for debugger.

You might want to switch to SIGSTOP instead of SIGINT to be just like all other MacOSX processes.
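For reference, launching a process stopped at its entry point looks roughly like this (sketch; assuming the Apple-specific POSIX_SPAWN_START_SUSPENDED flag is the "wait for debugger" attribute meant here):

#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>

// Launch 'path' suspended so a debugger can attach before anything runs.
pid_t launch_suspended (const char *path, char *const argv[], char *const envp[])
{
    posix_spawnattr_t attr;
    posix_spawnattr_init (&attr);
    posix_spawnattr_setflags (&attr, POSIX_SPAWN_START_SUSPENDED);

    pid_t pid = -1;
    int err = posix_spawn (&pid, path, NULL, &attr, argv, envp);
    posix_spawnattr_destroy (&attr);
    return (err == 0) ? pid : -1;
}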

Which doesn't seem right on OSX since when I try to continue it stops again at:
* thread #1: tid = 0x0707, 0x00007fff679ba6e6 dyld`gdb_image_notifier(dyld_image_mode, unsigned int, dyld_image_info const*) + 1, stop reason = signal SIGTRAP
    frame #0: 0x00007fff679ba6e6 dyld`gdb_image_notifier(dyld_image_mode, unsigned int, dyld_image_info const*) + 1

LLDB sets a breakpoint here so it can figure out when shared libraries are loaded/unloaded. Since the offset is "+ 1", you are probably having trouble with the following sequence:

1 - run and hit breakpoint 0x00007fff679ba6e5
2 - disable BP at 0x00007fff679ba6e5
3 - single step one thread only to get past the current instruction
4 - re-enable BP at 0x00007fff679ba6e5
5 - continue
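In packet terms (illustrative; 707 is the thread ID from your log, checksums are elided, and $vCont;s is one way to step a single thread), that sequence would look something like:

send packet: $z0,7fff679ba6e5,1#..   (remove breakpoint)
send packet: $vCont;s:707#..         (single step only thread 707)
read packet: $T05...thread:707;#..   (step completes with SIGTRAP)
send packet: $Z0,7fff679ba6e5,1#..   (re-insert breakpoint)
send packet: $c#..                   (continue)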

So keep a close eye on your packet log around this time. Also note we have a packet disassembler in:

lldb/scripts/disasm-gdb-remote.pl

Just run it on your packet log and you will get more human readable output:

./lldb/scripts/disasm-gdb-remote.pl packet-log.txt

And on the next continue it crashes. We have encountered something similar on Windows, where there is a "safe" point from which the debugger is allowed to connect, and I'm guessing that this is also the case on OSX. Where does LLDB insert its first breakpoint?

Our first breakpoint is the DYLD breakpoint on gdb_image_notifier, plus whatever breakpoints the user sets.

2. It seems as if LLDB doesn't find the correct PID of the process it connects to; I always get PID 1799, which doesn't seem right. I didn't see any packet where this information is communicated, so how does LLDB get the target process PID?

I don't believe we get this correct right now, and there isn't a packet in the GDB remote protocol that gets this as far as I know. We will probably add a "qProcessInfo" packet that can return a set of key:value pairs just like the thread stop info packets.
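Hypothetically, such a packet might look like this (keys invented for illustration, following the qHostInfo style):

send packet: $qProcessInfo#..
read packet: $pid:707;parent-pid:1;ostype:macosx;#..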

Let me know if the above hints help at all. Feel free to send me a packet dump offline, as I am pretty good at following the flow of the program from the GDB packet logs and figuring out what is going wrong.

Greg