[RFC] Exposing the Target API lock through the SB API

JDevlieghere · March 14, 2025, 11:01pm

As I was writing the description for PR #131404 I realized this change warrants a bigger discussion which is why I’m putting together this mini RFC.

Background & Problem Statement

TestDAP_breakpointEvents.py has been failing nondeterministically and we tracked it down to a race condition in lldb-dap: Issue #131242. The problem is that lldb-dap is doing multiple SB API calls, which individually are protected by a mutex, but lack atomicity as a whole. The result is that other threads (such as the event thread) can find LLDB in an incoherent state. The problem described above is not unique to lldb-dap. Anyone using the LLDB SB API risks running into this problem.
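To make the failure mode concrete, here is a minimal Python sketch of the pattern described above. It does not use LLDB at all: `FakeTarget` and its methods are hypothetical stand-ins, and the two `threading.Event` objects exist only to force the unlucky interleaving deterministically. Each call is individually locked, yet a second thread can still observe the half-finished state between two calls.

```python
import threading


class FakeTarget:
    """Hypothetical stand-in for a target; not an LLDB class."""

    def __init__(self):
        self._api_lock = threading.RLock()
        self._breakpoints = []
        self._locations = {}

    def add_breakpoint(self, name):
        with self._api_lock:  # each call is protected on its own
            self._breakpoints.append(name)

    def set_location_count(self, name, count):
        with self._api_lock:
            self._locations[name] = count

    def snapshot(self):
        with self._api_lock:
            return {n: self._locations.get(n) for n in self._breakpoints}


def run_interleaving():
    target = FakeTarget()
    first_call_done = threading.Event()
    observer_done = threading.Event()
    observed = {}

    def client():
        target.add_breakpoint("main")
        first_call_done.set()
        observer_done.wait()  # force the observer to run between the two calls
        target.set_location_count("main", 1)

    def observer():
        first_call_done.wait()
        observed.update(target.snapshot())
        observer_done.set()

    threads = [threading.Thread(target=client), threading.Thread(target=observer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed, target.snapshot()
```

In this model the observer sees a breakpoint that has no location count yet, even though no individual call was ever unprotected.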

Solutions

The most obvious solution to this problem is to add locking at the DAP level. More generally, that means saying that this is the responsibility of the SB API’s clients. In a multithreaded environment, every SB API call, or group of SB API calls, needs to be protected by a mutex. The upside of this approach is that it keeps the SB API small and makes this the user’s responsibility.
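As a rough illustration of this first option, the sketch below shows a client that owns its own mutex and takes it around every group of calls. The `Client` class is hypothetical and the two dictionary/list updates merely stand in for two separate SB API calls. The important assumption is that every thread in the client, including any event thread, goes through the same lock.

```python
import threading


class Client:
    """Hypothetical SB API client that owns its own mutex."""

    def __init__(self):
        self._lock = threading.Lock()  # client-owned, unknown to the library
        self._breakpoints = []
        self._locations = {}

    def create_breakpoint(self, name, count):
        with self._lock:
            self._breakpoints.append(name)  # stands in for SB call 1
            self._locations[name] = count   # stands in for SB call 2

    def snapshot(self):
        # Readers must take the same lock, or they can see partial updates.
        with self._lock:
            return {n: self._locations.get(n) for n in self._breakpoints}
```

The cost of this design is that it only works if no code path forgets to take the client lock.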

An alternative approach is to reuse the Target API lock and expose it through the SB API. The problem we’re faced with here is similar to the one the SB API is faced with when calling lldb_private methods. When I said earlier that the individual calls are protected by a mutex, I was talking about the target API lock. Various SB APIs take the Target API lock to provide thread safety. The upside of this approach is that a bunch of SB API calls are already properly protected, and you only need to lock the API lock when you want to execute multiple API calls atomically.
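A minimal Python model of this second option follows. It is not the LLDB implementation: `FakeTarget` and `get_api_lock` are hypothetical names, and the sketch assumes the lock is reentrant, since the individual calls re-acquire it while the client already holds it. Single calls stay safe on their own, and the client only takes the lock explicitly when it needs several calls to appear atomic.

```python
import threading


class FakeTarget:
    """Hypothetical stand-in for a target whose calls share one API lock."""

    def __init__(self):
        self._api_lock = threading.RLock()  # reentrant, so nesting is allowed
        self._breakpoints = []
        self._locations = {}

    def get_api_lock(self):
        # Hypothetical accessor that exposes the internal lock to clients.
        return self._api_lock

    def add_breakpoint(self, name):
        with self._api_lock:
            self._breakpoints.append(name)

    def set_location_count(self, name, count):
        with self._api_lock:
            self._locations[name] = count

    def snapshot(self):
        with self._api_lock:
            return {n: self._locations.get(n) for n in self._breakpoints}


def create_breakpoint_atomically(target, name, count):
    # Holding the exposed lock makes the two calls one atomic unit.
    with target.get_api_lock():
        target.add_breakpoint(name)
        target.set_location_count(name, count)
```

Note that a thread calling only `snapshot()` needs no extra locking here, which is the main difference from a client-owned mutex.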

Goal of this RFC

I implemented the second approach in llvm/llvm-project Pull Request #131404, [lldb] Expose the Target API lock through the SB API. My goal for this RFC is to (1) make sure that everyone is on board with this direction and (2) have a paper trail showing that this is the direction that clients of the SB API can rely on. As we all know, the SB API is stable so whatever we do, we’ll have to support it going forward.

Thanks,
Jonas

avogelsgesang · March 14, 2025, 11:10pm

Thanks for this writeup!

Exposing the API lock sounds good to me.

I wonder if we could somehow also make this usable in Python? I guess the most idiomatic approach for Python would be to implement the __enter__ and __exit__ methods, such that we can write something like

   with my_target.lock():
       # Do multiple actions as an atomic block, while holding the target lock
       pass
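For reference, the context manager protocol this relies on can be sketched in a few lines. `TargetLockGuard` is a hypothetical name for whatever object such a `lock()` method might return, and a plain `threading.Lock` stands in for the target's mutex; none of this is existing LLDB API.

```python
import threading


class TargetLockGuard:
    """Hypothetical guard object wrapping a mutex for use in a `with` block."""

    def __init__(self, mutex):
        self._mutex = mutex

    def __enter__(self):
        self._mutex.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release, even if the body raised.
        self._mutex.release()
        return False  # do not swallow exceptions


def demo():
    mutex = threading.Lock()
    with TargetLockGuard(mutex):
        held_inside = mutex.locked()
    return held_inside, mutex.locked()
```

The benefit over explicit lock and unlock calls is that the lock is released on every exit path, including exceptions.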

labath · March 17, 2025, 8:16am

I have some trepidation about exposing something like this through the SB API, but this approach seems reasonable to me.

If this is going to be the way to handle SB concurrency, then I think it should be available to python as well.

ashgti · March 17, 2025, 4:45pm

This may be my lack of understanding on some parts of the lldb internals, but how thread safe is SBDebugger? Does that also need a lock accessible from the SB API?

I am thinking about the use case of the ‘cancel’ DAP request. To support that in my prototype I had moved the lldb-dap input reader into its own thread to allow the DAP session to receive events into a queue and then execute items one at a time from the queue. To support cancelling an active request, when I see a ‘cancel’ request for the currently running command I run SBDebugger.RequestInterrupt() to make a best effort attempt at interrupting the current action. This would be coming from a different thread than the one processing requests.
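The shape of that prototype can be sketched roughly as follows. This is an illustrative Python model, not the lldb-dap code: `RequestProcessor` and its methods are hypothetical, and a `threading.Event` stands in for the call to SBDebugger.RequestInterrupt(). The reader thread calls `receive`, the request thread calls `process_one`, and a cancel for the currently running request sets the interrupt flag from the reader thread.

```python
import queue
import threading


class RequestProcessor:
    """Hypothetical model of a queue-based request loop with cancellation."""

    def __init__(self):
        self._pending = queue.Queue()
        self._state_lock = threading.Lock()
        self._current_id = None
        self._interrupt = threading.Event()  # stand-in for RequestInterrupt()
        self.results = []

    def receive(self, request_id, command, arguments=None):
        """Runs on the reader thread."""
        if command == "cancel":
            with self._state_lock:
                if self._current_id == arguments:
                    self._interrupt.set()  # best effort interrupt
            return
        self._pending.put((request_id, command))

    def process_one(self, work):
        """Runs on the request thread; `work` receives the interrupt flag."""
        request_id, _command = self._pending.get()
        with self._state_lock:
            self._current_id = request_id
            self._interrupt.clear()
        outcome = work(self._interrupt)
        with self._state_lock:
            self._current_id = None
        self.results.append((request_id, outcome))


def demo_cancel():
    processor = RequestProcessor()
    started = threading.Event()

    def slow_work(interrupt):
        started.set()
        # Give up early if interrupted, otherwise finish after the timeout.
        return "cancelled" if interrupt.wait(timeout=5) else "completed"

    processor.receive(1, "evaluate")
    worker = threading.Thread(target=processor.process_one, args=(slow_work,))
    worker.start()
    started.wait()
    processor.receive(2, "cancel", arguments=1)  # arrives on another thread
    worker.join()
    return processor.results
```

The point of the sketch is that the interrupt is raised from a thread other than the one executing the request, which is why the thread safety of the interrupt call matters.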

In lldb-dap we do access and use the SBDebugger instance from multiple threads today (the request processing thread, an event monitoring thread and a progress event monitoring thread). I don’t think we have any locks on the lldb_dap::DAP today.

JDevlieghere · March 17, 2025, 5:15pm

Most of the operations at the debugger level are thread safe, either because the underlying data structures they operate on are (e.g. OptionValue), because there’s a dedicated lock in lldb_private::Debugger (e.g. the callbacks), or because they acquire the target’s API lock (e.g. SBDebugger::HandleCommand).

I think the answer is no, because there isn’t an equivalent to the Target API lock at the debugger level. That said, it does raise a good question: if we want to synchronize multiple SB API calls at the debugger level, we would still need to use our own lock, or we could resort to using the dummy target’s API lock, which admittedly isn’t all that intuitive. If we’d like to go with the latter option, that is something we could expose at the debugger level.

This operation is thread safe and guarded by the IO handler stack mutex.

wallace · March 19, 2025, 7:49pm

+1 to the SB API solution
