Support for structured error messages in lldb-vscode

tl;dr can LLDB errors have more structured information?


Background: if you look at the DAP protocol for lldb-vscode, you’ll find that there’s a specific format that DAP implementations are supposed to use for error messages, the “Message” interface (despite the generic name, it seems to be only for error details).

Right now, we don’t use this; we just set the “message” field. For example, a response might look like this:

{
 "seq":0,
 "type":"response",
 "request_seq":2,
 "success":false,
 "command":"foo",
 "message":"Some long and descriptive error message"
}

Using the specification, a more conforming error response would be:

{
 "seq":0,
 "type":"response",
 "request_seq":2,
 "success":false,
 "command":"foo",
 "message":"some_foo_error_code",
 "body": {
  "error": {
   "id":123,  // 123 is an enum value unique within LLDB
   "format":"Some long and descriptive error message.",
   "showUser":true,
   "url":"http://www.example.com/documentation", // Button link
   "urlLabel": "Open docs" // Button text
  }
 }
}

To the user, the only visual difference is that the error popup can have an additional button for more information. The DAP client (e.g. VSCode) could customize its behavior based on the error name/id it receives, or take some other action if it gets an error with showUser set to false. Some DAP clients may also be strict and show nothing unless the error is in this structured form. (FWIW, the regular VSCode binary shows both types of errors.)
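As a rough illustration of the structured form, here is a minimal sketch that assembles such a response by hand. `DapErrorMessage`, `Quote`, and `MakeErrorResponse` are hypothetical names made up for this example, not lldb-vscode functions, and a real implementation would presumably use a JSON library rather than string concatenation.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical mirror of the DAP "Message" interface; only the fields used
// in the example response are modeled.
struct DapErrorMessage {
  int id = 0;
  std::string format;
  bool show_user = false;
  std::optional<std::string> url;       // Button link
  std::optional<std::string> url_label; // Button text
};

// Minimal JSON string quoting (quotes, backslashes, and newlines only).
std::string Quote(const std::string &s) {
  std::string out = "\"";
  for (char c : s) {
    if (c == '"' || c == '\\') {
      out += '\\';
      out += c;
    } else if (c == '\n') {
      out += "\\n";
    } else {
      out += c;
    }
  }
  out += '"';
  return out;
}

// Builds a failed response whose "body.error" carries the structured details.
std::string MakeErrorResponse(int request_seq, const std::string &command,
                              const std::string &short_message,
                              const DapErrorMessage &err) {
  std::ostringstream os;
  os << "{\"seq\":0,\"type\":\"response\",\"request_seq\":" << request_seq
     << ",\"success\":false,\"command\":" << Quote(command)
     << ",\"message\":" << Quote(short_message)
     << ",\"body\":{\"error\":{\"id\":" << err.id
     << ",\"format\":" << Quote(err.format)
     << ",\"showUser\":" << (err.show_user ? "true" : "false");
  // Optional fields are emitted only when set.
  if (err.url)
    os << ",\"url\":" << Quote(*err.url);
  if (err.url_label)
    os << ",\"urlLabel\":" << Quote(*err.url_label);
  os << "}}}";
  return os.str();
}
```

Keeping `url` and `urlLabel` optional means errors without a documentation link still produce a well-formed body, just without the extra button.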


Anyway, bringing this back to a general LLDB issue: how can we provide extra context for errors in a way that makes it all the way to lldb-vscode? LLDB’s Status error class is just a tuple of (m_type, m_code, m_string). For example, we might bubble up (m_type = eErrorTypePOSIX, m_code = EPERM (1), m_string = "Operation not permitted") from somewhere. But EPERM can imply different things depending on where it comes from. This was my motivation in D144904: if we see EPERM when failing to attach, we can take that as a hint that changing ptrace_scope might help, and provide a better error message that way. But we can’t assume that EPERM always means that, so the check has to happen at the place where we know this issue can surface as EPERM, not somewhere higher up in LLDB.
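To make the "check at the error site" point concrete, here is a simplified sketch. `SimpleStatus` is a stand-in for the tuple described above, not the real Status class, and `MakeAttachError` is a hypothetical helper that does not reproduce what D144904 actually does. The errno and the ptrace_scope value are passed in as parameters so the sketch does not depend on any OS call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>

// Simplified stand-in for LLDB's Status: just the (type, code, string) tuple.
enum class ErrorType { Generic, POSIX };

struct SimpleStatus {
  ErrorType type;
  int code;
  std::string message;
};

// Hypothetical attach error site. `saved_errno` is the errno left by the
// failed attach call and `ptrace_scope` is the Yama setting read by the
// caller (both are assumptions of this sketch).
SimpleStatus MakeAttachError(int saved_errno, int ptrace_scope) {
  SimpleStatus status{ErrorType::POSIX, saved_errno,
                      std::strerror(saved_errno)};
  // Only here do we know that EPERM came from an attach attempt, so only
  // here is it reasonable to hint at ptrace_scope.
  if (saved_errno == EPERM && ptrace_scope != 0) {
    status.message +=
        "; ptrace_scope is " + std::to_string(ptrace_scope) +
        ", which can prevent attaching. See "
        "https://www.kernel.org/doc/Documentation/security/Yama.txt";
  }
  return status;
}
```

An EPERM produced anywhere else would go through a different error site and never pick up the hint, which is the behavior argued for above.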

In the patch above, the error message for ptrace_scope includes a link to https://www.kernel.org/doc/Documentation/security/Yama.txt. This is fine when presented to the user in command line LLDB, but for lldb-vscode, the link should be surfaced as a button the user can click instead. So we need some way of indicating what this error is. I’m not sure what the best approach would be, what all the requirements are, or if there are already historical ideas in this design space. Can we expand the Status struct to contain additional metadata (e.g. m_vscode_error_id = 1234 or m_extra_error_info = {{"id", "1234"}, {"name", "foo_error_code"}}) that can survive LLDB Status ↔ LLVM Error conversions and serialization across the GDB remote protocol? Or is the current format permanent, so that we need to rely on the error string being stable and use regex matching to infer properties of the error? Is there some other solution I should consider?
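For discussion purposes, the metadata idea might look something like the sketch below. `StatusWithInfo`, its `extra` field, and `ErrorIdFor` are all hypothetical, and the sketch says nothing about how the metadata would survive Status ↔ Error conversions or the GDB remote protocol.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical extension of the (type, code, string) tuple with free-form
// metadata. The field name `extra` and the keys used are assumptions.
struct StatusWithInfo {
  int code = 0;
  std::string message;
  std::map<std::string, std::string> extra;

  // Returns the metadata value for `key`, if the error site provided one.
  std::optional<std::string> GetExtra(const std::string &key) const {
    auto it = extra.find(key);
    if (it == extra.end())
      return std::nullopt;
    return it->second;
  }
};

// What a frontend such as lldb-vscode might do: prefer explicit metadata
// over matching on the error string, and fall back to a generic id of 0.
// std::stoi throws on malformed input; a real version would handle that.
int ErrorIdFor(const StatusWithInfo &status) {
  if (auto id = status.GetExtra("id"))
    return std::stoi(*id);
  return 0;
}
```

With something like this, the frontend never needs to parse the message text, so the wording of the message can change without breaking clients.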


We could certainly extend the Status class to hold a StructuredData with some more bits of information. We don’t have lots of Status objects in flight at any time, so making them bigger shouldn’t be a problem.

But I think that’s not sufficient for what you are really asking for. The current model for Status objects is that they are produced at the error site and then passed along mostly unmodified as the error is handled.

However, it’s quite often the case that the place where the error is produced doesn’t know why it is performing its task; its view is more narrowly constrained. For instance, none of the code that deals directly with processes knows whether it was initiated by a command line command, an SB API call, or some other internal process. The error site can’t provide the information you want to see.

So we would also have to change the error paths to annotate their failed Status objects with more context on the way out. That’s not impossible, but it is a lot more design work than just “Add some structured data to Status”.
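A minimal sketch of what "annotating on the way out" could mean is below. `AnnotatedStatus`, `Annotate`, and the three attach functions are hypothetical and exist only for this illustration; none of them is an existing LLDB API.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct AnnotatedStatus {
  int code = 0;
  std::string message;
  // Context notes added by callers, innermost first.
  std::vector<std::pair<std::string, std::string>> context;

  bool Fail() const { return code != 0; }

  // Adds context only to failed statuses, so success paths stay untouched.
  AnnotatedStatus &Annotate(std::string key, std::string value) {
    if (Fail())
      context.emplace_back(std::move(key), std::move(value));
    return *this;
  }
};

// Low-level error site: it knows what failed, but not who asked or why.
AnnotatedStatus AttachToProcess(int pid) {
  AnnotatedStatus status;
  if (pid <= 0) {
    status.code = 1;
    status.message = "invalid pid";
  }
  return status;
}

// Higher layers know how the request was initiated and record it as the
// error passes through them.
AnnotatedStatus AttachFromCommand(int pid) {
  AnnotatedStatus status = AttachToProcess(pid);
  status.Annotate("initiator", "command-line");
  return status;
}

AnnotatedStatus AttachFromSBAPI(int pid) {
  AnnotatedStatus status = AttachToProcess(pid);
  status.Annotate("initiator", "sb-api");
  return status;
}
```

The design cost mentioned above shows up here: every layer that has useful context needs an explicit `Annotate` call on its error path, rather than just forwarding the Status it received.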