[RFC] Adding MCP support to LLDB

Background & Motivation

In an attempt to better understand how developers use AI as part of their development, I’ve spent some time after hours playing around with Copilot in Visual Studio Code. My journey started with a lot of skepticism but with the right workflow I came to see its appeal and opportunities.

One such opportunity is using established, deterministic tools to augment the capabilities of the underlying models, as opposed to doing the opposite, where models are used to augment existing deterministic tools.

For example, I’m personally skeptical of using AI inside the debugger. I don’t see a way to reconcile that with our core principle that “the debugger never lies”. However, I can see value in having the debugger serve as an input, where it can help the model figure out a bug. The key difference is the expectations of our users when they’re using the debugger versus when they’re using an AI chat bot in their IDE.

Proposal

To support the latter, I’m proposing to add support for MCP to LLDB. The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to LLMs. MCP servers expose tools that can be invoked by language models. These tools (i.e. LLDB in this case) are controlled by the model, meaning that the language model can invoke the tool automatically based on its contextual understanding and the user’s prompts. The protocol supports other operations besides invoking tools, but that’s out of scope for this RFC.

The protocol uses JSON-RPC messages to communicate between clients (i.e. the model) and the server (i.e. lldb) over stdio [1], similar to the Debug Adapter Protocol (DAP).

Design & Implementation

I don’t want to go into too much detail about the protocol itself (for that I recommend the specification and the user guide), but I do want to highlight how these tools are used, as it impacts the design.

Both Claude Desktop and Visual Studio Code let you specify your own MCP server. You do this by specifying a binary and optional command line arguments. In its simplest form, that would be something like lldb-mcp which acts as a wrapper around LLDB (again, similar to lldb-dap). This is the approach I took for my initial prototype.

Because of its standalone nature, the model, either autonomously or through instructions from the user, is in charge of creating your debug session. An example of that would be:

Launch and debug /path/to/binary. Set a breakpoint on line 10 and then step through the loop and print the value of sum after every iteration.

A strictly more powerful approach, however, is to allow the MCP server to tap into an existing debug session. For example, I can start debugging my binary in VS Code and run to the breakpoint in the IDE, and then ask the model to do only the stepping part.

Step through the loop and print the value of sum after every iteration.

Once the model is done, I can resume debugging normally. This approach allows me to tap into the model where it makes sense, without giving up complete control. To support this use case, my proposal is to add the MCP server to LLDB itself, rather than implementing it on top.

Prototype

I have a working prototype that implements what I’ve described above. Here’s a demo of it in action:

Here’s the corresponding draft PR: [lldb] Add MCP support to LLDB (PoC) by JDevlieghere · Pull Request #143628 · llvm/llvm-project · GitHub

From within LLDB, you launch the MCP server with the new mcp command:

(lldb) mcp start tcp://localhost:1234

This starts an MCP server that advertises one tool: lldb_command. It takes an lldb command, executes it and responds with the result.
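To make this concrete, here’s a small sketch of what a tool invocation could look like on the wire. The method and field names (`tools/call`, `name`, `arguments`) follow the MCP specification; the argument key `command` is an assumption about the prototype’s schema, not taken from the actual implementation.

```python
import json

# Hypothetical JSON-RPC request invoking the lldb_command tool.
# "tools/call" is the MCP method for tool invocation; the "command"
# argument key is an assumption for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lldb_command",
        "arguments": {"command": "breakpoint set --file main.c --line 10"},
    },
}

# Over the stdio transport, each message is a single line of JSON.
wire = json.dumps(request) + "\n"
decoded = json.loads(wire)
print(decoded["method"], decoded["params"]["name"])
```

The response would carry the command’s textual output back to the model, which can then decide on its next step.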

In Claude Desktop or Visual Studio Code, you use netcat to talk over stdio and send the data over the socket to the server running under lldb.

{
  "mcpServers": {
    "tool": {
      "command": "/usr/bin/nc",
      "args": ["localhost", "1234"]
    }
  }
}

In the future, I imagine we’ll want to have a dedicated lldb-mcp binary. For example, it could do auto-discovery of active connections or launch new ones to support the “standalone mode” I described earlier where the model controls the debug session from start to end.

Next Steps

First and foremost I’d like to get agreement on (1) supporting MCP in LLDB and (2) the approach of doing it in-process. I’ve made a case for why I think it’s important to do it this way, but I’d love to get input from the community. I’m by no means a domain expert and it’s likely there are folks here that have more experience in this realm.

After that, I think there are two pieces of work that need to be addressed after polishing my proof-of-concept:

  1. A lot of my code is based on code in lldb-dap which should be shared and reused rather than duplicated. I would propose to create a new library in LLDB (something like Protocol) that has the base protocol for MCP and DAP, and shared support classes for things like the transport.
  2. Create an lldb-mcp binary and have some kind of discovery mechanism to find active LLDB MCP server instances. Maybe something like having a known location where sockets are created.
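As a sketch of what such a discovery mechanism could look like, assuming each LLDB instance drops a socket (or a small metadata file) in a well-known directory that lldb-mcp scans; the directory name and file layout here are purely hypothetical:

```python
import os
import tempfile

def discover_servers(base_dir):
    """Return candidate server sockets found in a well-known directory.

    Purely hypothetical: assumes each LLDB instance creates one entry
    (e.g. a Unix domain socket named after its PID) under base_dir.
    """
    if not os.path.isdir(base_dir):
        return []
    return sorted(os.path.join(base_dir, n) for n in os.listdir(base_dir))

# Simulate two running LLDB instances having registered themselves.
d = tempfile.mkdtemp(prefix="lldb-mcp-")
for pid in (1234, 5678):
    open(os.path.join(d, f"lldb-{pid}.sock"), "w").close()
print(len(discover_servers(d)))  # 2
```

Stale entries (from crashed instances) would need cleanup, e.g. by probing each socket before offering it to the client.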

  1. Streamable HTTP is another transport mode.

9 Likes

My first impression is that I would not want to switch back and forth between the lldb console and the AI’s chat UI.

1 Like

I’d be curious what kind of commands or operations we could come up with for advertising to the MCP server.

Having commands built from the existing lldb commands would certainly be interesting.

I believe there are two json-rpc implementations in the llvm-project today. One in clangd and one in mlir. The mlir one is used in multiple tools within the mlir project.

When I did some of the recent refactors on lldb-dap’s Transport helper, I took some notes from both of those implementations. lldb-dap is not exactly the json-rpc spec, it’s missing a few of the features from json-rpc and is slightly simplified, but it’s pretty close if you squint.

I’m not sure exactly how or where any shared code would live if we did want to propose a base implementation for the llvm-project (maybe in the llvm/ folder?), but I do wonder if this would be worth proposing, since we’re all likely to run into similar problems.

If we have another server kind, I wonder if there should be any consideration for unifying the various servers in lldb for improved caching and sharing of state, where possible. I think this is the idea behind the lldb-rpc-server RFC and maybe once that is further along this would be ‘free’ (or built-in to the lldb-rpc-server layer).

I don’t know if this would be specifically related to that, but I think it is probably worth considering if this tooling will spawn multiple servers, as loading symbols from multiple libraries can use a lot of memory/cpu, especially if it’s repeatedly parsing the same libraries (e.g. libc or libSystem or UIKit.framework).

1 Like

My first attempt had a separate tool for every lldb command and subcommand and used the help text to document each tool (command). That proved too expensive because all that information was sent along with every request and would exceed the available tokens. I then limited it to top-level commands. At that point I was curious if a single tool that takes any command would work as well, and based on my testing, it absolutely does. Given that you can do pretty much everything from the command line, that’s hugely powerful and I’m not sure we even need more tools.
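For illustration, a tools/list result for the single-tool design might look like the following. The field names (`tools`, `inputSchema`) follow the MCP specification, but the description and schema here are made up, not taken from the prototype:

```python
# Illustrative tools/list result advertising the single catch-all tool.
# The description and schema are hypothetical, for illustration only.
tools_result = {
    "tools": [
        {
            "name": "lldb_command",
            "description": "Run an LLDB command and return its output.",
            "inputSchema": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        }
    ]
}

# A single tool keeps the per-request overhead small: only this one
# definition is sent to the model, instead of one per lldb (sub)command.
print(len(tools_result["tools"]))  # 1
```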

That’s very interesting, especially that there are already two implementations.

Yup, hoisting the implementation into LLVM would be the natural solution and something to look into.

I think there are two kinds of “servers”. On the one hand you have the lldb-rpc-server and lldb-dap (in server mode) whose purpose is to stay alive to cache and share state. On the other hand you have the MCP and DAP “servers” that clients talk to over a certain protocol.

Maybe this is what you’re referring to, but for MCP, I intentionally chose to implement the “server” in liblldb (i.e. lldb_private). This means that you can start and stop an MCP server that’s running as part of lldb-rpc-server and lldb-dap (in server mode) and benefit from its caching. In that sense the two are orthogonal.


If we were to decide to do the same thing for DAP (i.e. move it into lldb_private, which we’ve briefly discussed) we could totally do the same thing there and use DAP to talk to an lldb instance that’s running in lldb-rpc-server (and share the caching that’s done in Xcode).

Interestingly, that is something the DAP homepage hints at:

Since it is unrealistic to assume that existing debuggers or runtimes adopt this protocol any time soon, we rather assume that an intermediary component - a so called Debug Adapter - adapts an existing debugger or runtime to the Debug Adapter Protocol.

At least that’s my interpretation of that sentence, that they envision a future where the debugger understands the Debug (non Adapter) Protocol directly.

1 Like

to elaborate, I would want to use an llm as a way to avoid commands that I don’t have memorized, commands that are contextual and more complex, and commands that use scripting. In these cases, compared to the loop stepping example, I would be driving the logic of the debugging process. The llm wouldn’t be doing the debugging, it would be handling the heavy lifting involved in lldb commands. A couple examples:

  • “break at all return statements in the current function”
  • “update the current breakpoint to ignore the current caller”

as these are substitutes for lldb commands, I’d want to run these from the lldb console.

I think that’s a great use case. I was curious to see if it would work with my current prototype, so I created a small test function with 5 early returns, set a breakpoint on my function on the command line and asked Claude:

I’m debugging returns.cpp. I’m stopped in the debugger (lldb) in a function and I want you to set a breakpoint on every line that has a return.

It used the list command to see the source of the file, then used breakpoint set to set the breakpoints, and finally even ran breakpoint list to confirm that we had breakpoints on every line with a return. It worked on the first attempt; I didn’t play around with the prompt.

3 Likes

interesting that it solved it the way someone using an ide would: by setting individual breakpoints on each matching line.

It could see the help for break set -p, maybe it doesn’t have the concept of regular expressions?

I really like the direction you are proposing. There’s another related project GitHub - pgosar/ChatGDB: Harness the power of ChatGPT inside the GDB or LLDB debugger! which adds a prompt within LLDB, but what you are proposing is much more powerful.
Something I’ve seen many people struggle with in LLDB is when they come from other debuggers or languages and try to use LLDB for the first time: it takes them some time to learn the correct commands and sometimes they never figure it out. Not only that, scripting around lldb is often too much of an effort for one-off use cases. With the MCP support, you might be able to solve both of these use cases, besides a lot of other ones.

I love this!

3 Likes

@JDevlieghere , you should check this out for additional info on debuggers + llms: https://arxiv.org/pdf/2403.16354
In short, that paper describes how the user can ask questions about the source code by leveraging the debugger. And that kind of support would naturally fit in your proposal.

3 Likes

I cleaned up my prototype, added tests and moved the PR out from draft to ready-for-review: [lldb] Add Model Context Protocol (MCP) support to LLDB by JDevlieghere · Pull Request #143628 · llvm/llvm-project · GitHub. I’ve added the folks that chimed in here as reviewers but as always everyone’s encouraged to provide review feedback.

3 Likes