RFC: SBValue Metadata Provider

GraphQL is a mature query language specification that is aimed at roughly the same domain as what I see in this RFC. I am not saying you should copy it verbatim, but IMO it would be worth looking at the feature set they had evolved after no doubt many iterations: https://graphql.org/learn/queries/

The similarities I see:

  • You want LLDB to serve as an aggregator/arbitration layer between debug metadata providers and clients, without having to understand the data being exchanged.
  • You propose to use a JSON-like data type (SBStructuredData) for information exchange.
  • You want providers to publish some sort of data schema.

The GraphQL spec has all that.

In addition:

  • GraphQL queries can be parameterized.
  • GraphQL also supports updates.
  • It allows querying multiple objects (i.e. providers in this RFC) in a single transaction.
  • Clients can request only the parts of the results they are interested in.

The providers come from some python module that gets imported into lldb (e.g. with command script import). Traditionally, we don’t auto-detect/auto-load the python object types that could be bound to the various lldb script adaptors when the module is loaded. Rather, the providers get loaded by some explicit gesture on the user’s part. The point is that you might have a .py file that has a bunch of different providers, but you only want to use one or two. To do that you need a way to explicitly activate a specific provider. So you would, for instance, do:

(lldb) command script import lotta-providers.py
(lldb) type metadata add --kind sanity-checker --python-class lotta-providers.FirstSanityChecker <TYPES>

If you want to have a python module that auto-imports all its providers (or summaries or synthetic children, etc.) then you implement the __lldb_init_module function, which gets passed a debugger, and then do

debugger.HandleCommand(f"type metadata add --kind sanity-checker --python-class {__name__}.FirstSanityChecker <TYPES>")

Or something. This isn’t particularly troublesome, and I think it’s good to separate the module loading from the provider loading, since these do have some cost.

If you’re asking the summary provider “should I use you” you might as well just have “GetMetadata” check some condition and return early. The real advantage of an “Am I valid” check is that you could ask once and cache the result. But at present type summaries are imported into the debugger and inherited by all targets. So if you have an ASAN and a non-ASAN target in lldb at the same time, you can’t currently say “Use this provider for target ASAN but not for target non-ASAN”. If we find that keeping a bunch of unused providers around causes performance problems, it would be worth making the provider trees hang off the process and then caching this “Is Useful” check. But I’m not sure I would do that until we find that there is a performance problem.
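For what it's worth, the "ask once and cache" idea could look roughly like the sketch below. This is plain Python with no lldb calls; the class name, the process_key argument, and the check callable are all hypothetical stand-ins for whatever per-process state lldb would actually hang the provider trees off.

```python
class CachedUsefulness:
    """Cache a provider's 'Is Useful' answer once per process."""

    def __init__(self, check):
        # check: callable(process_key) -> bool, assumed to be expensive
        # (for example, a symbol lookup in the target).
        self._check = check
        self._cache = {}

    def is_useful(self, process_key):
        # Run the expensive check only the first time a process is seen.
        if process_key not in self._cache:
            self._cache[process_key] = self._check(process_key)
        return self._cache[process_key]

    def forget(self, process_key):
        # Drop the cached answer, e.g. when the process goes away.
        self._cache.pop(process_key, None)
```

The cache is keyed by process rather than stored on the provider itself precisely because of the ASAN and non-ASAN targets case: the same provider can be useful for one process and useless for another.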

I don’t think we’re going to have providers that produce enough, and sufficiently diverse, output that people will really need to filter it, either by particular provider or by picking only some fields in the output. Moreover, if we get the kinds right, then for the most part the kinds will have fairly different purposes, so joined queries don’t seem particularly relevant. So for a first implementation I think it makes more sense to just have a “get me providers for this kind” API, and let the filtering be done on the client side.

The multiplicity in this scenario is more likely to be “there are lots of variables, and each can provide metadata”. But since this API is on SBValue, you are already doing the by-value filtering before calling the API, which removes another complexity from this interface.

In the end, if it turns out that we can add some value by filtering in lldb, we can always add a “GetFilteredMetadata” implementation that takes a request dictionary and returns a filtered result. At that point adopting a known language for expressing filtering makes total sense.
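Just to illustrate the shape such a “GetFilteredMetadata” could take, here is a sketch over plain Python dicts standing in for SBStructuredData. The request format (a dictionary whose truthy values select fields, and whose nested dictionaries select sub-fields) is purely my assumption; nothing in the RFC defines it, and a real design might well borrow GraphQL's selection syntax instead.

```python
def filter_metadata(metadata, request):
    """Return only the parts of 'metadata' that 'request' asks for.

    metadata: dict produced by a provider (stand-in for SBStructuredData).
    request:  dict mapping field names to either a truthy value (keep the
              whole field) or a nested dict (keep only those sub-fields).
    """
    result = {}
    for key, wanted in request.items():
        if key not in metadata:
            continue  # Requested fields the provider did not emit are skipped.
        value = metadata[key]
        if isinstance(wanted, dict) and isinstance(value, dict):
            result[key] = filter_metadata(value, wanted)
        elif wanted:
            result[key] = value
    return result


# Hypothetical provider output, for illustration only.
example = {
    "status": "failed",
    "details": {"reason": "dangling pointer", "address": "0x1000"},
    "provider": "FirstSanityChecker",
}
summary = filter_metadata(example, {"status": True, "details": {"reason": True}})
```

Since this is a few lines of dictionary walking, a client can do the same thing on its side today, which is the argument for not building it into lldb in the first implementation.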

It also means if you decide you want to also use lotta-providers.FirstSanityChecker for some other type not in the originally registered type filter, you can add it this way.

Yep. That’s exactly what I meant. I was imagining the ASan summary provider checking for the presence of an ASan runtime symbol in the selected target. If the symbol isn’t present (the process isn’t using ASan) then the provider should return some kind of sentinel value that means its result is not included in the results returned by GetMetadata().
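A rough sketch of the sentinel idea, again in plain Python rather than against the real lldb API. The set of symbol names stands in for a symbol lookup on the selected target, the symbol name "__asan_init" is only my assumption of a suitable marker, and the shape of the returned dictionary is made up for illustration.

```python
# Unique sentinel meaning "this provider does not apply here".
NOT_APPLICABLE = object()

# Assumed marker symbol; a real provider would pick whatever symbol
# reliably indicates that the ASan runtime is loaded.
ASAN_MARKER_SYMBOL = "__asan_init"


def asan_provider(target_symbols, value_name):
    # target_symbols: set of symbol names, a stand-in for a target lookup.
    if ASAN_MARKER_SYMBOL not in target_symbols:
        return NOT_APPLICABLE
    return {"kind": "sanity-checker", "provider": "asan", "value": value_name}


def get_metadata(providers, target_symbols, value_name):
    # Collect results from every provider, dropping sentinel answers so
    # clients never see providers that opted out.
    results = []
    for provider in providers:
        result = provider(target_symbols, value_name)
        if result is not NOT_APPLICABLE:
            results.append(result)
    return results
```

Using a dedicated sentinel object rather than None or an empty dictionary keeps "I do not apply to this process" distinct from "I apply, and have nothing to report".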

Another idea.

It might be useful to have a mode where LLDB automatically fetches the “sanity-checker” metadata and shows it if the sanity check fails when running the expr or frame variable commands. This would provide a convenient way to see sanity-checker failures while debugging on the command line.

We’d probably want it off by default until we have shown that there aren’t perf or UX problems.