SBProcess::GetSelectedThread() inside the breakpoint callback context

Hello,

I’m trying to write an LLDB plugin for my own use, and I’m having trouble understanding how I’m supposed to use SBProcess::GetSelectedThread(). When I use the plugin from the command line, everything works as expected: GetSelectedThread() returns the currently selected thread. But when LLDB runs my command from, e.g., a breakpoint callback declared with the -C option, GetSelectedThread() seems to always return the first thread instead of the selected thread. At the same time, LLDB’s built-in commands work fine (they use the selected thread instead of the first thread).

I have prepared a simple proof of concept that shows what I mean in more detail. This behaves the same way on Linux (lldb 13.0.0) and macOS (lldb-1316.0.9.41). It doesn’t matter whether I write a native plugin in C++ or use the Python wrappers.

First I’ve prepared a small threadtool app that creates another thread and runs some function inside this new thread:

#include <thread>
#include <cstdio>
int main() {
    std::thread t([] () {
        fopen("", "");
    });
    t.join();
    return 0;
}

The point of this small threadtool app is to create a process that spawns a new thread, so that I can later force LLDB to break in a non-primary thread by setting a breakpoint on “fopen”. I’ve compiled this tool using:

$ g++ threadtool.cpp -o threadtool -pthread

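Then, my Python plugin defines a simple command that just prints the current thread ID. The code is (‘plugin.py’):

import sys
import lldb
sys.path.append("/home/antek/dev/haxe/lldbscripts/bin")
import main
import argparse

# Four-argument command form: the first argument is the lldb.SBDebugger
# (named ctx here). No execution context is passed in.
def cmd_db(ctx, cmd, result, state):
    tgt = ctx.GetSelectedTarget()
    proc = tgt.GetProcess()
    # This is the call that returns the first thread inside a breakpoint callback.
    thread = proc.GetSelectedThread()
    print("Return of GetSelectedThread() is: {}".format(thread.GetThreadID()))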

Then I’m using this command line to run LLDB:

$ lldb -O "command script import plugin.py" -O "command script add -f stubs.cmd_db db" threadtool -o 'break set -n fopen -C "register read rip" -C "db" -C "thread info"' -o 'r' -o 'thread list' -o 'register read rip' -o 'db' -o 'q'

To make it clear, this command will:

  1. Load ‘plugin.py’ into LLDB,
  2. Define a new “db” command that will run the code from the Python script,
  3. Load the “threadtool” application into LLDB,
  4. Set a new breakpoint on the function name “fopen”, and declare on-trigger callbacks:
    a) First, run the “register read rip” command,
    b) Then, run the “db” command (defined in plugin.py),
    c) Then, run the “thread info” command.
  5. Then issue the “r” command, which runs the “threadtool” app.
  6. LLDB will break on the fopen() function inside the second thread, and will run all callbacks defined in step 4.
  7. Then, it will run “thread list”
  8. It will run “register read rip”
  9. It will run “db”
  10. And last, it will quit.

This is the log of an example session:

 1  (lldb) command script import src/stubs.py
 2  (lldb) command script add -f stubs.cmd_db db
 3  (lldb) target create "threadtool"
 4  Current executable set to '[...]/threadtool' (x86_64).
 5  (lldb) break set -n fopen -C "register read rip" -C "db" -C "thread info"
 6  Breakpoint 1: no locations (pending).
 7  WARNING:  Unable to resolve breakpoint to any actual locations.
 8  (lldb) r
 9  1 location added to breakpoint 1
10  (lldb)  register read rip
11       rip = 0x00007ffff7bc26e0  libc.so.6`_IO_new_fopen at iofopen.c:85:1
12  Return of GetSelectedThread() is: 14753
13  (lldb)  db
14  (lldb)  thread info
15  thread #2: tid = 14788, 0x00007ffff7bc26e0 libc.so.6`_IO_new_fopen(filename="", mode="") at iofopen.c:85:1, name = 'threadtool', stop reason = breakpoint 1.1
16  Process 14753 stopped
17  * thread #2, name = 'threadtool', stop reason = breakpoint 1.1
18      frame #0: 0x00007ffff7bc26e0 libc.so.6`_IO_new_fopen(filename="", mode="") at iofopen.c:85:1
19  Process 14753 launched: '[...]/threadtool' (x86_64)
20  (lldb) thread list
21  Process 14753 stopped
22    thread #1: tid = 14753, 0x00007ffff7bd4019 libc.so.6`__GI___futex_abstimed_wait_cancelable64 at futex-internal.c:57:12, name = 'threadtool'
23  * thread #2: tid = 14788, 0x00007ffff7bc26e0 libc.so.6`_IO_new_fopen(filename="", mode="") at iofopen.c:85:1, name = 'threadtool', stop reason = breakpoint 1.1
24  (lldb) register read rip
25       rip = 0x00007ffff7bc26e0  libc.so.6`_IO_new_fopen at iofopen.c:85:1
26  (lldb) db
27  Return of GetSelectedThread() is: 14788
28  (lldb) q

As you can see, when my db Python command is called from inside the breakpoint’s callback, GetSelectedThread() returns TID 14753 (line 12). But when the db command is called from the normal REPL context, it returns the proper value of 14788 (line 27).

I’m calling the second TID “proper” because the built-in LLDB commands, as can be seen in lines 11 and 15, already use TID 14788. So the built-in commands behave the same way whether they’re called from the breakpoint callback context or from the REPL context. But for some reason GetSelectedThread() from the public API returns the wrong thread when called from the breakpoint context.

Is this expected and am I using the API in a wrong way, or is it a bug?

If I’m using the API in a wrong way, what would be the preferred way of getting the proper thread ID in this situation? One idea would be to simply enumerate all threads and pick the first thread whose stop reason is different from lldb::StopReason::eStopReasonNone (it seems to work after a quick test). Would that be a good approach?
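
For reference, a rough sketch of that enumeration idea in Python. The helper name and the ignored_reasons parameter are made up for illustration; the only API calls it relies on are SBProcess.GetNumThreads(), SBProcess.GetThreadAtIndex() and SBThread.GetStopReason(). It returns a list rather than a single thread, since nothing guarantees that only one thread reports a stop reason.

```python
def threads_with_stop_reason(process, ignored_reasons):
    """Return every thread of `process` whose stop reason is not ignored.

    `process` is expected to behave like an lldb.SBProcess and
    `ignored_reasons` is a collection of lldb.StopReason values to skip.
    Both the function and its parameters are hypothetical helpers.
    """
    stopped = []
    for index in range(process.GetNumThreads()):
        thread = process.GetThreadAtIndex(index)
        if thread.GetStopReason() not in ignored_reasons:
            stopped.append(thread)
    return stopped
```

Inside lldb this would be called as threads_with_stop_reason(proc, {lldb.eStopReasonNone, lldb.eStopReasonInvalid}), and the first element of the result is the thread my quick test picked.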

The initial form of the lldb command interface for python is good enough for commands run on the command line, since command-line execution sets all the “currently selected” entities before running the command. But it isn’t sufficient for more complex situations, e.g. for commands run in breakpoint callbacks.

It is not uncommon - particularly when you have a breakpoint on code run in a function that’s active on many threads - to have hit the breakpoint on multiple threads by the time control is returned to the debugger. lldb needs to operate on all the threads that hit breakpoints at this stop, so it needs to call your command for each thread, and each time, your command needs to know which thread it is currently being asked to operate on. We could have done this by setting the “selected” entities and then running the command, repeating for all the stopped threads, but monkeying with global state like that is not a great design.

So instead (mirroring what happens with built-in commands) we added a second form of the command callback:

command_function(debugger, command, exe_ctx, result, internal_dict)

that takes the additional SBExecutionContext argument exe_ctx. That will be filled with the target/process/thread/frame (if available) that the command should use. So for instance, when running a breakpoint callback, lldb will put the thread that hit the breakpoint in exe_ctx and pass that to your command. If you want your command to behave correctly in breakpoint callbacks you should use the second form and pull the process/thread/frame you want to work on from the passed-in exe_ctx. And of course, when your command is run directly from the command line, the exe_ctx will hold the currently selected target/process/thread/frame, so it is still correct to pull from there.
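
A minimal sketch of the second form, adapted from the cmd_db command in the original post. The validity check and the wording of the messages are assumptions for illustration; the calls used are SBExecutionContext.GetThread(), SBThread.IsValid(), SBThread.GetThreadID(), SBCommandReturnObject.SetError() and SBCommandReturnObject.AppendMessage().

```python
def cmd_db(debugger, command, exe_ctx, result, internal_dict):
    """Print the ID of the thread lldb asked this command to operate on."""
    # exe_ctx is an lldb.SBExecutionContext filled in by lldb. In a breakpoint
    # callback it holds the thread that hit the breakpoint; on the command
    # line it holds the currently selected thread.
    thread = exe_ctx.GetThread()
    if not thread.IsValid():
        # For example, the command was run before the process was launched.
        result.SetError("no thread in the execution context")
        return
    result.AppendMessage(
        "Thread from exe_ctx is: {}".format(thread.GetThreadID()))
```

Registration should not need to change (command script add -f <module>.cmd_db db); as far as I know lldb chooses between the two forms based on how many arguments the function accepts.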

Jim

Really nice, thanks for the answer. Indeed it seems to work.

I’ve compared this with how the built-in commands work, and I see exe_ctx used everywhere. Is there an easy way of obtaining this context in the C++ API when writing commands using SBCommandPluginInterface?

The SBExecutionContext represents some execution state of a debuggee. For instance, if you have a target that’s not yet running, its execution context will only have a target; the other fields will be invalid. If you have a running process, it will have a target and a process, but no thread or stack frame, etc. So you are always making an SBExecutionContext that represents some SBTarget/SBProcess/SBThread/SBFrame you are interested in. Do that by passing that SB entity to the SBExecutionContext’s constructor. So to get the SBExecutionContext for an interesting thread:

exe_ctx = lldb.SBExecutionContext(my_interesting_thread)

That will fill in the target, process and thread.
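
As a hedged sketch of what one might then do with such a context: the helper below (a hypothetical name, not part of lldb) builds the context from a thread and hands it to the SBCommandInterpreter.HandleCommand overload that accepts an SBExecutionContext, so a command line runs against that thread rather than the selected one. The import sits inside the function only so the file can also be loaded outside lldb’s embedded Python.

```python
def run_for_thread(debugger, thread, command_line):
    """Run command_line against a specific thread and return its text output.

    `debugger` is an lldb.SBDebugger and `thread` an lldb.SBThread.
    The function itself is a hypothetical helper for illustration.
    """
    import lldb  # only importable inside lldb's embedded Python

    # The constructor fills in the target, process and thread from `thread`.
    exe_ctx = lldb.SBExecutionContext(thread)
    result = lldb.SBCommandReturnObject()
    interpreter = debugger.GetCommandInterpreter()
    interpreter.HandleCommand(command_line, exe_ctx, result)
    if not result.Succeeded():
        return result.GetError()
    return result.GetOutput()
```

For example, run_for_thread(debugger, my_interesting_thread, "thread info") would report on my_interesting_thread even if another thread is currently selected.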

Jim

I’ll rephrase my second question. I’m sorry if I’m missing something.

As you’ve written in post #2, the Python API for adding new commands has been extended with a new exe_ctx parameter, an SBExecutionContext that holds the proper SBThread. This SBExecutionContext is filled in by LLDB itself and passed to the plugin. This works and fully resolves my problem when I write plugins in Python, because I have access to the proper SBThread when my command runs from a breakpoint callback.

But when I write plugins using the C++ API, I don’t see a way of acquiring a similar SBExecutionContext that LLDB passes to my plugin already filled with the proper SBThread. When using the SBCommandPluginInterface framework, the data I’m getting from LLDB are:

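bool DoExecute(lldb::SBDebugger debugger, char **commands, lldb::SBCommandReturnObject& result) override {
    // No execution context is passed in, so the only option is the selected thread.
    auto thread = debugger.GetSelectedTarget().GetProcess().GetSelectedThread();
    // tid_t is a 64-bit unsigned integer; cast so it matches %llu on every platform.
    result.Printf("Thread ID: %llu\n", static_cast<unsigned long long>(thread.GetThreadID()));
    return true;
}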

Are there any plans to extend this signature to support passing the SBExecutionContext from LLDB like in the matching Python API?

If not, is there any other way, in the C++ API, of acquiring the proper SBThread that is valid in the context of the breakpoint callback, without dirty hacks based on private symbols (or other dirty hacks, like calling “thread info” and extracting the current thread with a regexp)?

The C++ interface for defining commands doesn’t get used very much, and it looks like we didn’t add a new SBCommandPluginInterface variant that takes an execution context when the same work was done for Python. That shouldn’t be too hard, but will have to be done. If you have the time and inclination to work up a patch, please do. Otherwise file a bug with the GitHub issues for lldb, and somebody else will do that when they get a chance.

Jim

Thank you, this was a very big help. I’ll see whether I’m able to create a patch. If I can’t, I’ll create a GH issue. Cheers!

We try to keep our SB APIs binary stable, so you’ll need to keep the old SBCommandPluginInterface class intact.

Probably the easiest thing to do is make a new SBCommandPluginInterfaceWithContext with a “DoExecute” that takes the SBExecutionContext. Add parallel APIs to add commands using that class. Then for ease of implementation switch all the code internal to lldb over to using the WithContext version. To handle the old form, make a subclass of SBCommandPluginInterfaceWithContext that holds an SBCommandPluginInterface object, and a DoExecute that dispatches to the old interface object’s DoExecute, discarding the ExecutionContext. Then in the SBCommandInterpreter::AddCommand that takes the old form, instantiate one of the wrapper class objects and register that instead. That should minimize the changes.
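
The shape of that wrapper, as a language-neutral sketch written in Python with stand-in classes. None of these class or method names exist in lldb; they only mirror the roles described above (old interface, new interface with context, adapter, and the two registration entry points). The real change would be C++ inside the SB API.

```python
class OldStyleCommand:
    """Stand-in for the existing interface: no execution context."""

    def do_execute(self, debugger, command, result):
        raise NotImplementedError


class CommandWithContext:
    """Stand-in for the proposed interface that also receives exe_ctx."""

    def do_execute(self, debugger, command, exe_ctx, result):
        raise NotImplementedError


class OldStyleAdapter(CommandWithContext):
    """Wraps an old-style command so it can be stored as a new-style one."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def do_execute(self, debugger, command, exe_ctx, result):
        # The old interface has no slot for exe_ctx, so it is discarded.
        return self._wrapped.do_execute(debugger, command, result)


class Interpreter:
    """Stand-in registry that only stores new-style commands internally."""

    def __init__(self):
        self._commands = {}

    def add_command_with_context(self, name, impl):
        self._commands[name] = impl

    def add_command(self, name, old_impl):
        # Old entry point kept for compatibility: wrap, then register.
        self.add_command_with_context(name, OldStyleAdapter(old_impl))

    def run(self, name, debugger, command, exe_ctx, result):
        return self._commands[name].do_execute(debugger, command, exe_ctx, result)
```

With this layout the dispatch path only ever deals with the context-taking form, and existing plugins keep working through the adapter without being recompiled against a changed class.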

Good luck!

Jim

Do you think this is a proper starting point? It’s actually what you’ve written in post #8, but I’m not sure I got the details right.

In particular, I’m wondering about the following:

  • Whether I’m properly adapting ExecutionContext to SBExecutionContext in SBCommandInterpreter.cpp:71 (I’ve taken this approach from ScriptInterpreterPython.cpp:2908),
  • Whether this change should bump the API minor version, since it actually adds an API,
  • Now that I look at it, there are some formatting issues and a missing LLDB_INSTRUMENT_VA; I’m not sure how important that is, but I’ll fix it if you think I should proceed,
  • Whether I should design some unit tests for this change.

https://github.com/antekone/llvm-project/commit/0488b5fdb08141c8c3f626606d814d1dfcb3fb00

If you think I can proceed with creating a review at reviews.llvm.org, then I’ll polish this PR a bit more and will continue with this on Phabricator.