How to test reverse-execution support in LLDB

I’m an rr maintainer and I’d like to add reverse-execution commands to LLDB (like GDB’s “reverse-continue”). rr’s gdbserver already works with LLDB and we have started testing against LLDB in rr’s CI, but to get the full benefits of rr, users need LLDB commands that perform reverse execution.

Right now I’m focused on supporting the simplest but also most useful command: plain “reverse continue”. I have a proof-of-concept implementation; it’s not complete (it doesn’t handle watchpoints yet, and needs scrutiny), but it doesn’t seem very difficult. The hard part is finding a way to test this. You definitely don’t want to require rr to be available to run reverse-execution tests! I have a plan that I’d like feedback on.

My current idea is to implement the simplest possible reverse-execution engine on top of LLDB’s gdbserver implementations (lldb-server/debugserver). A reverse-execution test would do the following:

  1. Run a test subject normally to break at the start of a section of code we want to test reverse-execution in.
  2. Trigger “recording” of that section of code: single-step forward until we break at the end of the code section. Between single-steps, take a snapshot of the registers and of the memory regions specified by the test (e.g. the stack VMA). The memory regions would be required to contain all memory writes performed by the code in the reverse-execution region.
  3. The test can then request reverse-execution. This generates bc and bs packets. These packets are implemented by reverse-stepping; each reverse-step pops one register+memory snapshot off the list and restores it, followed by checking whether we should generate a breakpoint or watchpoint stop.
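To make steps 2 and 3 concrete, here is a rough sketch of the snapshot list in Python. Everything in it is made up for illustration: `ToyTarget` stands in for the real process behind lldb-server/debugserver, and `SnapshotRecorder`, `record_until`, `reverse_step` and `reverse_continue` are hypothetical names, not existing LLDB code. It only checks breakpoints, not watchpoints.

```python
from dataclasses import dataclass


@dataclass
class ToyTarget:
    """Hypothetical stand-in for the live process behind lldb-server."""
    regs: dict        # register name -> value; "pc" indexes into program
    mem: bytearray    # flat toy address space
    program: list     # one callable per "instruction"

    def single_step(self):
        self.program[self.regs["pc"]](self)
        self.regs["pc"] += 1


class SnapshotRecorder:
    def __init__(self, target, regions):
        self.target = target
        self.regions = regions    # list of (start, length) chosen by the test
        self.snapshots = []       # oldest first; reverse-step pops from the end

    def _take_snapshot(self):
        t = self.target
        mem = {start: bytes(t.mem[start:start + length])
               for start, length in self.regions}
        return dict(t.regs), mem

    def record_until(self, end_pc):
        # Step 2: snapshot the state, then single-step, until the end is hit.
        while self.target.regs["pc"] != end_pc:
            self.snapshots.append(self._take_snapshot())
            self.target.single_step()

    def reverse_step(self):
        # Step 3: restore the most recent snapshot; False if history is empty.
        if not self.snapshots:
            return False
        regs, mem = self.snapshots.pop()
        self.target.regs = dict(regs)
        for start, data in mem.items():
            self.target.mem[start:start + len(data)] = data
        return True

    def reverse_continue(self, breakpoints):
        # Reverse-step until a breakpoint address is hit or history runs out.
        while self.reverse_step():
            if self.target.regs["pc"] in breakpoints:
                return "breakpoint"
        return "history-start"


def store(addr, value):
    """Build a toy instruction that writes one byte."""
    def op(target):
        target.mem[addr] = value
    return op


target = ToyTarget(regs={"pc": 0}, mem=bytearray(4),
                   program=[store(0, 1), store(1, 2), store(0, 3)])
recorder = SnapshotRecorder(target, regions=[(0, 4)])
recorder.record_until(end_pc=3)
stop_reason = recorder.reverse_continue(breakpoints={1})
```

In the toy run, reverse-continuing with a breakpoint at pc 1 undoes the last two stores and stops with the registers and memory as they were just before the second instruction executed.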

(Note: this functionality would be impractical to use outside of testing, which is fine because making it useful outside of testing is not a goal.)

Currently I think the best way to implement this would be a gdbserver proxy. The test would provide a gdbserver socket to LLDB, and the test gdbserver would pass most packets through to an underlying lldb-server/debugserver and forward the replies. When reverse-execution is desired the test would implement step 2 by sending the right packets to the underlying lldb-server/debugserver and accumulating the snapshots. Then when LLDB sends bc/bs the test gdbserver intercepts those packets and performs step 3.
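Roughly, the intercept logic of such a proxy could look like the sketch below. The `$payload#checksum` framing (checksum is the sum of the payload bytes modulo 256, as two hex digits) is the standard gdb remote protocol framing; everything else is an assumption for illustration: `frame`, `unframe`, `ReverseProxy`, the `forward` callable and the fake engine are hypothetical names, the `S05` stop reply is just a placeholder, and acks, escaping and run-length encoding are ignored.

```python
def frame(payload: str) -> bytes:
    """Wrap a payload as $payload#xx, where xx is the modulo-256 checksum."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}".encode("ascii")


def unframe(packet: bytes) -> str:
    """Strip framing and verify the checksum (no escaping or RLE handled)."""
    if len(packet) < 4 or packet[:1] != b"$" or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    expected = int(packet[-2:].decode("ascii"), 16)
    if sum(payload) % 256 != expected:
        raise ValueError("bad checksum")
    return payload.decode("ascii")


class ReverseProxy:
    """Forwards everything except bc/bs, which go to the test's own engine."""

    def __init__(self, forward, engine):
        self.forward = forward    # callable: payload -> reply payload
        self.engine = engine      # object providing reverse_continue/reverse_step

    def handle(self, packet: bytes) -> bytes:
        payload = unframe(packet)
        if payload == "bc":
            reply = self.engine.reverse_continue()
        elif payload == "bs":
            reply = self.engine.reverse_step()
        else:
            reply = self.forward(payload)
        return frame(reply)


class FakeEngine:
    """Placeholder engine; a real one would restore snapshots first."""

    def reverse_continue(self):
        return "S05"

    def reverse_step(self):
        return "S05"


forwarded = []


def fake_forward(payload):
    # Stands in for a round trip to the underlying lldb-server/debugserver.
    forwarded.append(payload)
    return "OK"


proxy = ReverseProxy(fake_forward, FakeEngine())
reply_to_bc = proxy.handle(frame("bc"))
reply_to_other = proxy.handle(frame("QStartNoAckMode"))
```

The point of the split is that only the two reverse packets ever reach test-specific code; anything else takes the same path it would without the proxy.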

This approach seems to have several advantages:

  • Doesn’t require any changes outside of test code
  • Should be largely architecture/platform independent as long as lldb-server/debugserver exist
  • Wrapping the approach in a Python library should allow individual tests to be simple
  • Since most packets would be passed through transparently, it shouldn’t be too much work to implement
  • Can reuse existing LLDB Python test code for processing packets
  • The gdbserver protocol is pretty stable so the test framework should add minimal maintenance burden

I’d appreciate any feedback on all this!

FWIW I work for Google but mostly not on debugging. Also FWIW this work will support other reverse-execution gdbservers such as UndoDB, perhaps with tweaks.


Maybe one day we can remove this item from LLDB’s “Open Projects” page :)

If I understand correctly, this test code can use a very slow method and perhaps limit what it records to only what’s needed to show that the lldb commands work.

For instance, there might be some side effects it doesn’t track because we only need to prove that the right snapshot ID (or however these commands work) is requested.

We then trust that when used with rr, that snapshot will be accurate and stored and swapped in some efficient way. lldb makes the request, and it’s rr’s job to fulfill that correctly.

Given that these packets would not be enough to reverse debug properly, I think a proxy is better than adding half working versions of them to lldb-server itself.

The “GDB Remote Protocol Extensions” page in the LLDB docs might be useful for this; the existing lldb-server already understands those extensions.

For memory, perhaps you could decide to only check some known buffer. Then pass the location of that to the proxy server for it to save and restore. Or use the current stack frame and get the size by asking lldb for the function symbol’s extent.

In lldb/packages/Python/lldbsuite/test/gdbclientutils.py there is a MockGDBServerResponder class; one example of its use is lldb/test/API/commands/register/register/TestRegistersUnavailable.py.

Usually it just returns strings, but I don’t think anything would stop you making it into a proxy.

> If I understand correctly, this test code can use a very slow method and perhaps limit what it records to only what’s needed to show that the lldb commands work.

Yes, exactly.

> We then trust that when used with rr, that snapshot will be accurate and stored and swapped in some efficient way. lldb makes the request, and it’s rr’s job to fulfill that correctly.

Yes, rr is implemented completely differently.

> For memory, perhaps you could decide to only check some known buffer. Then pass the location of that to the proxy server for it to save and restore. Or use the current stack frame and get the size by asking lldb for the function symbol’s extent.

For most tests it will be necessary to save/restore the stack. I assume it’s not hard to identify the stack VMA and save/restore the whole thing. Tests might also want to save/restore one or more global variables.

Thanks for the info!

I agree with David that the added packet handling should be in a proxy server that’s provided by the test case, not code added to either lldb-server or debugserver. I don’t want people finding out that our debugserver/lldb-server seem to support reverse debugging, only to be disappointed (and think we are lame) when they find out that it doesn’t in fact work.

The test should be able to manipulate the normal lldb API and the proxy server at the same time, as they’re both in Python. So you could connect to the proxy, use the API normally to find the stack, tell the proxy where it is, then commence testing.

On Linux there is a memory region marked [stack], so you could find that with SBProcess (see the LLDB API docs). There is probably a better, cross-platform way to do that, but it’s a decent start.

(And note that we no longer disable ASLR for the vast majority of tests, but it is still an option if you want to do that.)
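For what it’s worth, looking up the [stack] region through SBProcess might look something like the sketch below. It is untested against a real target; `find_stack_region` and its `new_region_info` parameter are hypothetical (the parameter only exists so the function can be exercised without LLDB), and it assumes the region name is reported exactly as "[stack]", which is Linux-specific.

```python
def find_stack_region(process, new_region_info=None):
    """Return (base, end) of the region named "[stack]", or None if absent.

    `process` is expected to be an lldb.SBProcess.
    """
    if new_region_info is None:
        import lldb  # only importable inside LLDB's Python environment
        new_region_info = lldb.SBMemoryRegionInfo

    regions = process.GetMemoryRegions()
    for index in range(regions.GetSize()):
        info = new_region_info()
        # GetMemoryRegionAtIndex fills in `info` and returns True on success.
        if not regions.GetMemoryRegionAtIndex(index, info):
            continue
        if info.GetName() == "[stack]":
            return info.GetRegionBase(), info.GetRegionEnd()
    return None
```

The test would then hand the returned address range to the proxy as one of the regions to snapshot.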