Debugging interactions between kernel-space and user-space processes can be a difficult task. For many kernel engineers and low-level programmers, an interactive co-debugging workflow would greatly improve the debugging experience.
Thanks to the new Scripted Process infrastructure, we can solve this issue and make interactive co-debugging between a kernel process and multiple user-space processes a reality.
Before diving into the approach we’re considering, here is a quick reminder of what a Scripted Process is.
Scripted Process Refresher
As you may know, LLDB has many scriptable extension points where users can provide a Python class or function to enhance their debugging session. Some examples are Data Formatters, Summary Providers and Stack Frame Recognizers.
The OS Threads Plugin is another good example of a scripting affordance in LLDB: it lets the user create threads, using a Python script file, on a stopped process. Scripted Processes extend that capability even further.
Scripted Processes allow the user to synthesize a process in LLDB from a script file. To achieve that, they rely on two components:
The script file
This is a user-provided Python script that implements a class derived from the base Scripted Process class.
That class should conform to the Scripted Process Interface and implement the various methods required to instantiate the process in LLDB: it should provide a way to simulate a launch and/or an attach, read memory, fetch thread information and get the list of loaded images.
Similarly, a Scripted Process can create Scripted Thread objects, which describe each thread (TID, name, state, stop reason) and provide its register context. Scripted Threads work similarly to the OS Threads Plugin.
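To make this concrete, here is a rough sketch of what such a class could look like. The method names mirror the shape of the scripted process interface, but everything below is mocked (no `lldb` module, plain `bytes` instead of `SBData`, a dictionary standing in for a core file), so read it as an illustration rather than the real API:

```python
# Illustrative sketch of a Scripted Process class. In LLDB the real base
# class takes SBTarget/SBError arguments and returns SB objects; here the
# types are simplified so the sketch runs without LLDB installed.

class MyScriptedProcess:
    def __init__(self, memory_image: dict):
        # memory_image: {address: bytes}, standing in for whatever static
        # view backs the process (core file, crashlog, halted process).
        self.memory = memory_image
        self.threads = {1: {"tid": 1, "name": "main", "state": "stopped",
                            "stop_reason": "breakpoint"}}

    def get_process_id(self) -> int:
        return 42

    def read_memory_at_address(self, addr: int, size: int) -> bytes:
        # The real interface returns an lldb.SBData; plain bytes keep the
        # sketch dependency-free.
        return self.memory.get(addr, b"")[:size]

    def get_threads_info(self) -> dict:
        return self.threads

    def get_loaded_images(self) -> list:
        return [{"path": "/usr/lib/dyld", "load_addr": 0x100000000}]

proc = MyScriptedProcess({0x1000: b"\xde\xad\xbe\xef"})
print(proc.read_memory_at_address(0x1000, 2).hex())  # -> "dead"
```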
The Scripted Process process plugin
The ScriptedProcess process plugin is part of LLDB itself. It calls into the script file to reconstruct the process, allowing the user to interact with it like any other process in LLDB: view sources, unwind stack frames, inspect variables, and so on.
Scripted processes can already be loaded into LLDB as standalone processes. For now, the feature is limited to a static view of the process, e.g. in the context of a core file, a crashlog, or a process that is halted and cannot be resumed. However, we think we can greatly improve the debugging experience with scripted processes by tying them to another, real process that will drive their execution.
Proposed Approach
Our goal here is to have the kernel process provide a view of the user-space processes running on the system, while debugging the kernel, and to be able to interact with them as if they were real processes (i.e. step a thread, set a breakpoint, continue, etc).
+---------------------------------------+
| Scripted Processes |
| +------------------------+ |
| +--->| User-space Process 1 | |
| | +------------------------+ |
| | |
+------------------+ +------------+ | | +------------------------+ |
| Kernel Process |<------>| Debugger |<--+-----+--->| User-space Process 2 | |
+------------------+ +------------+ | | +------------------------+ |
| | |
| | +------------------------+ |
| +--->| User-space Process 3 | |
| +------------------------+ |
| |
+---------------------------------------+
As mentioned earlier, in order to have interactive debugging capabilities in scripted processes, their execution will be handled by a driving process (the kernel process in our previous example).
But first, before even creating the user-space scripted processes, we need a way to fetch the list of processes running in the kernel, to be able to choose which one to attach to. For that, we will introduce a new Scripted Platform that will be in charge of listing and creating the user-space Scripted Processes. Similarly to Scripted Processes, this platform relies on user-defined Python callbacks to list and create the scripted processes, since these tasks can be performed in different ways depending on the nature of the underlying system.
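As a sketch of that listing callback, with a hypothetical `FakeKernelReader` standing in for whatever code actually walks the kernel's task list in the driving process memory (all names below are illustrative, not a real LLDB API):

```python
# Hedged sketch of a Scripted Platform's process-listing callback.

class ScriptedPlatformSketch:
    def __init__(self, reader):
        self.reader = reader  # reads the driving (kernel) process memory

    def list_processes(self) -> dict:
        # Returns {pid: metadata}; LLDB would use this to let the user pick
        # which user-space process to attach a scripted process to.
        return {pid: {"name": name, "pid": pid}
                for pid, name in self.reader.walk_task_list()}

class FakeKernelReader:
    # Hypothetical stand-in for parsing the kernel's process table out of
    # the driving process memory.
    def walk_task_list(self):
        return [(1, "launchd"), (115, "logd"), (2077, "Safari")]

platform = ScriptedPlatformSketch(FakeKernelReader())
print(sorted(platform.list_processes()))  # -> [1, 115, 2077]
```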
To create the user-space scripted processes, the user will need to interrupt the driving process execution. From there, the user will be able to request the process list from the platform and have LLDB read the driving process memory to fetch the metadata required to synthesize the user-space scripted processes. Once all the scripted processes are created, we can try to set a breakpoint on one of them and continue the execution.
This already causes some problems, because the user-space process might be using virtual memory while the kernel uses physical memory, which implies that they live in different address spaces. To be able to write the trap instruction for a user-space breakpoint site into kernel memory, we need an address translation layer (ATL) between the user-space and kernel-space processes. Because we cannot anticipate the heuristic used in the ATL, it needs to be a pluggable, user-defined component, so we also need to expose a new Python API for this purpose.
+------------+
+---------------->| Debugger |<------------------------------+
| +------------+ v
| +---------------------------------------+
| | Scripted Processes |
| | +------------------------+ |
| | +--->| User-space Process 1 | |
| | | +------------------------+ |
v +---------------+ | | |
+------------------+ | Address | | | +------------------------+ |
| Kernel Process +----->| Translation |<-----+-----+--->| User-space Process 2 | |
+------------------+ | Layer (ATL) | | | +------------------------+ |
+---------------+ | | |
| | +------------------------+ |
| +--->| User-space Process 3 | |
| +------------------------+ |
| |
+---------------------------------------+
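The translation step can be sketched as follows. The page-map dictionary is a hypothetical stand-in: a real ATL would implement whatever translation scheme the target kernel actually uses, which is exactly why it has to be user-defined:

```python
# Illustrative address translation layer (ATL): maps a user-space virtual
# address to the kernel-visible address backing it.

PAGE_SIZE = 0x1000

class AddressTranslationLayer:
    def __init__(self, page_map: dict):
        # page_map: user-space virtual page -> kernel-visible page
        self.page_map = page_map

    def user_to_kernel(self, vaddr: int) -> int:
        page = vaddr & ~(PAGE_SIZE - 1)
        offset = vaddr & (PAGE_SIZE - 1)
        if page not in self.page_map:
            raise KeyError(f"unmapped user address {vaddr:#x}")
        return self.page_map[page] | offset

atl = AddressTranslationLayer({0x70000000: 0x100000000})
# Writing a breakpoint trap at user address 0x70000042 really means
# writing at kernel address 0x100000042.
print(hex(atl.user_to_kernel(0x70000042)))  # -> 0x100000042
```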
If we take a closer look, every scripted process wrapping a user-space process will be doing the same thing:
- The scripted process will read memory from the driving process
- The ATL will translate the address and perform the memory read
- With that memory, the scripted process will construct the expected object for that specific call and return it to LLDB
The user-space scripted processes essentially act as passthrough processes, which implies that they could all share the same Python implementation. But how can the ATL distinguish which process maps to which address space? We can take advantage of the scripted process instantiation in the Scripted Platform to assign each process a unique identifier. Then, by embedding the ATL in the Scripted Platform, we should be able to match the scripted process ID to a specific address space when performing memory reads.
That means, however, that every action performed on a Scripted Process needs to pass through the Scripted Platform.
That’s not an issue: we can create a new C++ process plugin derived from the Scripted Process plugin. This new passthrough class should hold a reference to the Scripted Platform and to its platform identifier (passed in the launch info dictionary). Then, for every call to that Passthrough Scripted Process, the process plugin will pass the call down to the Scripted Platform along with the process identifier. This implies that the Scripted Platform should also hold a reference to / conform to the Scripted Process and Scripted Thread Interfaces.
+-----------------------------+
| Passthrough |
+--->| Scripted Process Plugin |
| +--------------+--------------+
| |
| |
| v
+------------+ | +-----------------------------+ +---------------------------------------+
| Debugger |<---+ | | | Scripted Processes |
+------------+ | | Scripted Platform | | +------------------------+ |
^ | | + | | +--->| User-space Process 1 | |
| | | Address Traslation Layer | | | +------------------------+ |
| | | | | | |
| | | --------------------------- | | | +------------------------+ |
| +--->| List/Create Processes | <--+-----+--->| User-space Process 2 | |
| | --------------------------- | | | +------------------------+ |
v | Scripted Process Interface | | | |
+------------------+ | --------------------------- | | | +------------------------+ |
| Kernel Process | | Scripted Thread Interface | | +--->| User-space Process 3 | |
+------------------+ | --------------------------- | | +------------------------+ |
| | | |
+-----------------------------+ +---------------------------------------+
This design has the advantage that the Python implementation of all the scripted processes can be the same: each one reads memory from the driving process through the ATL and returns the expected object to the Scripted Platform.
Once all of that is implemented, we should be able to set a breakpoint and continue the user-space process. But what happens when the trap is hit? The exception is handled in LLDB’s debug agent and reported to LLDB as a public stop event.
However, in our previous example, because the execution is only driven by the kernel process, the stop event will be sent to the kernel process instead of the user-space scripted process.
The main issue here is that only one process can generate events: the ProcessGDBRemote talking to the kernel. However, those events could be for any one of the user-space scripted processes, or for the kernel itself. So we need something that will inspect the stop events and dispatch them to the right process.
To address that, we will implement a stop event Multiplexer that sits between the driving process, the scripted processes and the debugger listeners. When initiating co-debugging, the driving process will forward all its events to the multiplexer. The multiplexer will be able to call into a user-provided Python function that determines which process the stop event should be forwarded to. The Scripted Platform is already conveniently located between the driving process and the scripted processes, so it will also implement the multiplexer.
+-----------------------------+
| Passthrough |
+--->| Scripted Process Plugin |
| +--------------+--------------+
| |
| |
+------------+ | v
| Debugger |<---+ +-----------------------------+ +---------------------------------------+
+------------+ | | Scripted Platform | | Scripted Processes |
| | + | | +------------------------+ |
| | Address Traslation Layer | | +--->| User-space Process 1 | |
+--->| + | | | +------------------------+ |
| Multiplexer | | | |
| | | | +------------------------+ |
+--------------->| --------------------------- | <--+-----+--->| User-space Process 2 | |
| | List/Create Processes | | | +------------------------+ |
| | --------------------------- | | | |
+--------+---------+ | Scripted Process Interface | | | +------------------------+ |
| Kernel Process | | --------------------------- | | +--->| User-space Process 3 | |
+------------------+ | Scripted Thread Interface | | +------------------------+ |
| --------------------------- | | |
+-----------------------------+ +---------------------------------------+
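A minimal sketch of the dispatch logic the multiplexer would implement is shown below. The events are plain dictionaries standing in for LLDB’s SBEvent, and `pid_from_event` is a placeholder for the user-provided filter callback, which in practice might inspect the faulting address or the current task pointer in kernel memory:

```python
# Illustrative stop-event multiplexer: routes each stop event either to a
# registered scripted process or back to the kernel process.

class StopEventMultiplexer:
    def __init__(self, filter_callback):
        self.filter_callback = filter_callback  # event -> pid (or None)
        self.listeners = {}                     # target -> events delivered

    def register(self, pid: int):
        self.listeners[pid] = []

    def handle_stop_event(self, event: dict):
        pid = self.filter_callback(event)
        # Unrecognized events fall back to the driving (kernel) process.
        target = pid if pid in self.listeners else "kernel"
        self.listeners.setdefault(target, []).append(event)
        return target

def pid_from_event(event):
    # Hypothetical user-provided heuristic.
    return event.get("pid")

mux = StopEventMultiplexer(pid_from_event)
mux.register(2077)
print(mux.handle_stop_event({"reason": "breakpoint", "pid": 2077}))  # -> 2077
print(mux.handle_stop_event({"reason": "exception", "pid": None}))   # -> kernel
```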
Now we have the driving process on one side sending events to the multiplexer, which dispatches them to the user-space scripted processes on the other side. But what should we do with the driving process stop events that are also forwarded to the multiplexer? Because we don’t want to alter the driving process plugin implementation to make it listen for and handle the events that it has just sent to the multiplexer, we will also wrap the driving process in a passthrough scripted process.
That keeps us from having to use another side channel (like the SB API) to fetch the driving process metadata, and, like the user-space scripted processes, the wrapped driving process will listen for stop events from the multiplexer and handle them by itself.
+-----------------------------+
| Passthrough |
+--->| Scripted Process Plugin |
| +--------------+--------------+ +---------------------------------------+
| | | Scripted Processes |
| | | +------------------+ |
+------------+ | v | +--->| Kernel Process | |
| Debugger |<---+ +-----------------------------+ | | +------------------+ |
+------------+ | | Scripted Platform | | | |
| | + | | | +------------------------+ |
| | Address Traslation Layer | | +--->| User-space Process 1 | |
+--->| + | | | +------------------------+ |
| Multiplexer | | | |
| | | | +------------------------+ |
+--------------->| --------------------------- | <--+-----+--->| User-space Process 2 | |
| | List/Create Processes | | | +------------------------+ |
| | --------------------------- | | | |
+--------+---------+ | Scripted Process Interface | | | +------------------------+ |
| Kernel Process | | --------------------------- | | +--->| User-space Process 3 | |
+------------------+ | Scripted Thread Interface | | +------------------------+ |
| --------------------------- | | |
+-----------------------------+ +---------------------------------------+
Finally, to avoid confusing the user, we want to introduce the notion of “hidden targets”: some targets can be hidden in LLDB when listing all targets. Of course, the user should be able to list them all with a flag (-a|--all) or with a boolean setting (target.always_show_hidden). This will be used to hide the real driving process target, both to avoid any confusion between it and the wrapped one and to prevent the user from tampering with it.
This could even be used to only debug user-space scripted processes interactively, as if they were real standalone processes:
+-----------------------------+
| Passthrough |
+--->| Scripted Process Plugin |
| +--------------+--------------+ +---------------------------------------+
| | | Scripted Processes |
| | | +- - - - - - - - - + |
+------------+ | v | +---> Kernel Process |
| Debugger |<---+ +-----------------------------+ | | +- - - - - - - - - + |
+------------+ | | Scripted Platform | | | |
| | + | | | +------------------------+ |
| | Address Traslation Layer | | +--->| User-space Process 1 | |
+--->| + | | | +------------------------+ |
| Multiplexer | | | |
| | | | +------------------------+ |
+--------------->| --------------------------- | <--+-----+--->| User-space Process 2 | |
| | List/Create Processes | | | +------------------------+ |
| | --------------------------- | | | |
+- - - - + - - - - + | Scripted Process Interface | | | +------------------------+ |
Kernel Process | --------------------------- | | +--->| User-space Process 3 | |
+- - - - - - - - - + | Scripted Thread Interface | | +------------------------+ |
| --------------------------- | | |
+-----------------------------+ +---------------------------------------+
Implementation Breakdown
As presented previously, many new concepts and improvements still need to be added to LLDB to achieve interactive co-debugging. The proposed approach already breaks the work down into various parts; here is the list of tasks that need to be implemented:
- We need to ensure that a Scripted Process can act as a pass-through for another real process. It needs to behave like the real process (be able to set breakpoints, continue, step…) and broadcast stop events to the debugger listeners. That will be used to wrap the driving process.
- Implement the Scripted Platform to list and create scripted processes.
- Implement the multiplexer in the Scripted Platform, listening for the driving process stop events and broadcasting them to the passthrough processes and the debugger’s event listener.
- We need to introduce some filtering logic in the multiplexer for the stop events. This will be used to distinguish kernel process events from user-space process events.
- Introduce the address translation layer in the multiplexer. This is necessary when reading/writing memory, to translate addresses from user-space virtual memory to kernel memory (and vice versa?)
- Add support for multiple scripted process debugging with address translation and breakpoint support. This should get us close to a real life scenario where a user would co-debug a kernel process while interacting with user-space processes at the same time.
- We need to make sure that the multiplexer is able to coordinate multiple process events and keep everything in sync.
Request for comments
Before moving forward with this design, I would like to get the community’s input.
What do you think about this approach? Any feedback would be greatly appreciated.
Thanks!