LLDB single-stepping problem on remote debugging

Hi LLDB devs,

I’m porting LLDB to work with our existing simulator (an embedded system, debugged remotely through a GDB stub).

Regarding my previous problem with loading a binary to the remote simulator: it has been solved. The problem was mainly in our compiler, which did not generate an executable ELF but only a relocatable file, so it had no PT_LOAD segments. The compiler also didn’t generate proper DWARF sections. Thanks for your great advice!

Breakpoints and continue now work well, but single-stepping does not. When I press ‘s’ on the lldb side, the program (1 process, 1 thread) won’t stop: lldb keeps sending ‘s’ packets to the stub. Like this (listing all RSP packet exchanges):

I found that lldb creates several threads:
[screenshot of the lldb thread list]
The main ‘lldb’ thread sends only a single ‘s’ packet, but another thread (it seems to be ‘intern-state’) keeps sending ‘s’.

I used GDB to debug my lldb and found the code that handles single-stepping: ProcessGDBRemote::DoResume() in ProcessGDBRemote.cpp. I set a breakpoint there, pressed ‘s’ in lldb, and let lldb stop there twice. Each time the breakpoint was hit, I ran ‘bt’ to print the call stack. The picture:

As you can see, the call stacks of the first and second hits differ (later hits are the same as the second). There is a thread switch between the first and second hit: the first hit is in the main lldb thread, and the later hits happen in another thread.

I have actually found the “stuck point”: lldb gets stuck in the while (!exit_now) loop in Process::RunPrivateStateThread in Process.cpp. It cannot get out of that loop, so it keeps sending ‘s’ packets. I have also found that after the main lldb thread finishes sending the first ‘s’, it returns to IOHandlerEditline::Run() in IOHandler.cpp and goes into ‘GetLine’ to wait for the next lldb command. That seems fine, but the other thread keeps sending ‘s’ at the same time…

My questions:

  1. Why is this happening? Why does lldb keep sending ‘s’? Any solutions or hints? (Now, when the remote program stops, my lldb can’t return to “normal”: when I type an lldb command, no “(lldb)” prompt is shown on the left. You can see that in the pictures above, like this: 3106EE0F@130DAC6F.21F8C45E.png, ignoring the frame, which I haven’t implemented yet. Perhaps this is the cause?)
  2. A single-stepping question that I have been curious about since I started working on debug systems, but which is not related to the problem above: what is the implementation difference between the commands ‘n’ (stepping over functions) and ‘s’ (stepping into functions), given that there is only one RSP single-step packet, ‘s’?

Kind regards,
Rui

I believe, as you suspected, that things are falling down because LLDB doesn't know anything about your registers. The first thing you need to do is make sure LLDB knows about your registers. LLDB will send some packets to detect registers from the GDB server and you must respond. So as soon as you stop the first time after connecting, run a "register read" command and make sure that you see all of the registers. LLDB also needs some additional information about the registers, such as which registers in your architecture map to generic registers like PC, SP, FP, RA (return address register, if you have one), and ARG1-ARG8 (the first through eighth arguments of an ABI-compliant function call).

The register information is discovered on the first stop by calling ProcessGDBRemote::BuildDynamicRegisterInfo() in ProcessGDBRemote.cpp.

This function will try 3 ways:
1 - load a target definition file if one is specified in the GDB remote LLDB settings
2 - load the target.xml file through the GDB server (see "Target Description Format" in "Debugging with GDB")
3 - call an LLDB custom register detection function once for each register using a custom "qRegisterInfo%u" packet where %u is a zero based register number.

For solution #1, you define a Python file that contains a full definition of your registers and then set a setting in LLDB prior to attaching:

(lldb) settings set plugin.process.gdb-remote.target-definition-file /path/to/my_target_definition.py

Example files can be found in:

lldb/examples/python/*_target_definition.py

For solution #2, you will need to follow the GDB server method and return XML that describes your registers. You will also want to fill in the extra information about each register in the XML. See the function:

ParseRegisters(XMLNode feature_node, GdbServerTargetInfo &target_info,
                    GDBRemoteDynamicRegisterInfo &dyn_reg_info, ABISP abi_sp,
                    uint32_t &cur_reg_num, uint32_t &reg_offset);

in ProcessGDBRemote.cpp around line 4290. Make sure to fill in the register numbering info key/value pairs:
  "gcc_regnum" which is the register number as the compiler uses
  "dwarf_regnum" which is the register number as registers are known in DWARF debug info (usually the same as "gcc_regnum", but some architectures have different numbers)
  "generic" whose value is a string that is one of: "pc", "sp", "fp", "ra" or "lr", "flags", "arg1", "arg2", "arg3" etc

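To make the key/value pairs above concrete, here is a minimal Python sketch that generates a target.xml for a made-up 32-bit register file. The register names, the DWARF numbers, and the feature name org.example.sim.core are all hypothetical. Only the gcc_regnum, dwarf_regnum, and generic attribute names come from the text above; name, bitsize, regnum, and group are the standard GDB target description attributes. Check the result against ParseRegisters() in your LLDB version before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical register file: (name, bitsize, DWARF number, generic role or None).
# Replace with the real registers of your architecture.
REGISTERS = [
    ("r0", 32, 0, "arg1"),
    ("r1", 32, 1, "arg2"),
    ("sp", 32, 13, "sp"),
    ("lr", 32, 14, "ra"),
    ("pc", 32, 15, "pc"),
]


def build_target_xml(registers, feature_name="org.example.sim.core"):
    """Return a target description string with one <reg> element per register."""
    target = ET.Element("target", version="1.0")
    feature = ET.SubElement(target, "feature", name=feature_name)
    for regnum, (name, bitsize, dwarf, generic) in enumerate(registers):
        attrs = {
            "name": name,
            "bitsize": str(bitsize),
            "regnum": str(regnum),
            "group": "general",
            # Assumption: compiler and DWARF numbering are the same here.
            "gcc_regnum": str(dwarf),
            "dwarf_regnum": str(dwarf),
        }
        if generic is not None:
            attrs["generic"] = generic
        ET.SubElement(feature, "reg", attrs)
    return ET.tostring(target, encoding="unicode")


if __name__ == "__main__":
    print(build_target_xml(REGISTERS))
```

Your stub would return the resulting string when the debugger asks for the description through the GDB qXfer:features:read:target.xml request.
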
For solution #3, you return a GDB packet for each register. If neither #1 nor #2 was used, LLDB will call your stub with a series of qRegisterInfo packets like:

qRegisterInfo0
<response for qRegisterInfo0>
qRegisterInfo1
<response for qRegisterInfo1>
...
Until you return an error. Details for the format of this packet are in:

lldb/docs/lldb-gdb-remote.txt

Search for "qRegisterInfo" and you will find a complete example of the packets that are sent and all of the possible key/value pairs you can respond with.
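
As a rough illustration of the stub side of solution #3, here is a small Python sketch that answers qRegisterInfo packets from a register table. The register table, the helper names, and the E45 error code are hypothetical. The key names (name, bitsize, offset, encoding, format, set, dwarf, generic) are the ones I recall from lldb-gdb-remote.txt, and I am assuming the register index in the packet is written in hex, so verify both against the document for your LLDB version.

```python
# Hypothetical register table for a 32-bit target; replace with your own.
REGISTERS = [
    {"name": "r0", "bitsize": 32, "dwarf": 0, "generic": "arg1"},
    {"name": "r1", "bitsize": 32, "dwarf": 1, "generic": "arg2"},
    {"name": "sp", "bitsize": 32, "dwarf": 13, "generic": "sp"},
    {"name": "lr", "bitsize": 32, "dwarf": 14, "generic": "ra"},
    {"name": "pc", "bitsize": 32, "dwarf": 15, "generic": "pc"},
]


def rsp_frame(payload: str) -> str:
    """Wrap a payload as $payload#cs, where cs is the byte sum modulo 256 in hex."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def register_info_payload(index: int, registers) -> str:
    """Build the key:value; reply for one register, or an error past the end."""
    if index >= len(registers):
        return "E45"  # hypothetical error code; any error ends LLDB's enumeration
    reg = registers[index]
    # Byte offset of this register in the register block: sizes of all earlier ones.
    offset = sum(r["bitsize"] // 8 for r in registers[:index])
    fields = [
        ("name", reg["name"]),
        ("bitsize", reg["bitsize"]),
        ("offset", offset),
        ("encoding", "uint"),
        ("format", "hex"),
        ("set", "General Purpose Registers"),
        ("dwarf", reg["dwarf"]),
    ]
    if reg.get("generic"):
        fields.append(("generic", reg["generic"]))
    return "".join(f"{key}:{value};" for key, value in fields)


def handle_packet(payload: str, registers) -> str:
    """Answer a qRegisterInfoN payload; reply empty (unsupported) to anything else."""
    prefix = "qRegisterInfo"
    if payload.startswith(prefix):
        index = int(payload[len(prefix):], 16)  # assumption: index is hex
        return rsp_frame(register_info_payload(index, registers))
    return rsp_frame("")


if __name__ == "__main__":
    for n in range(len(REGISTERS) + 1):
        print(handle_packet(f"qRegisterInfo{n:x}", REGISTERS))
```

With this table, the reply to qRegisterInfo0 starts with $name:r0;bitsize:32;offset:0; and the sixth request gets the error reply, which is what stops the enumeration.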

Once you have correct registers that LLDB can dynamically build a register context with, you will be able to proceed with stepping and expect to have better results.

Greg