[RFC] Parsed commands in Python

The current Python command interface creates what lldb calls “raw commands”. lldb does no parsing of the command line; it just passes your function the input string with the command name stripped off.

Most of lldb’s commands are NOT raw commands, however. They are “parsed commands”. The command specifies the options, option value types and help, and the argument types and number; lldb then takes care of parsing and running completions, and provides a callback to set the option values. That means native commands are more user-friendly than a Python raw command can be.
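For contrast, here is a rough sketch of what a raw Python command has to do today. It assumes the usual raw-command signature `(debugger, command, exe_ctx, result, internal_dict)`; the `frob` command, its `--count` option and the `FakeResult` stand-in are made up so the snippet runs outside lldb.

```python
import argparse
import shlex


class FakeResult:
    """Hypothetical stand-in for lldb.SBCommandReturnObject (runs outside lldb)."""

    def __init__(self):
        self.messages = []

    def AppendMessage(self, text):
        self.messages.append(text)


def raw_frob(debugger, command, exe_ctx, result, internal_dict):
    """A raw command: lldb hands over the unparsed string and we do the rest."""
    parser = argparse.ArgumentParser(prog="frob", add_help=False)
    parser.add_argument("-c", "--count", type=int, default=1)
    parser.add_argument("names", nargs="*")
    # lldb knows nothing about --count here, so it cannot complete,
    # validate or document the option for the user.
    opts = parser.parse_args(shlex.split(command))
    result.AppendMessage(f"count={opts.count} names={opts.names}")


result = FakeResult()
raw_frob(None, "--count 3 foo bar", None, result, {})
print(result.messages[0])
```

All of the option handling lives in the command body, which is exactly the part a parsed command would hand back to lldb.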

I put up a PR to illustrate this approach here:

The way a parsed command works is in three phases, plus an optional fourth for custom completions:

  1. Define the options and arguments:

This is usually done in the constructor of your command object, where you specify for each option its name, help, type, and group, and say how many arguments you accept and which kinds.

  2. Set the option value from input. This is done by calling a SetOptionValue callback in the Options class that the command subclass provides.

  3. Run the command. You are passed the command arguments after all the options are parsed, and consult your Options object to get the option values.

  4. If you have custom completions, you specify those with HandleCompletion.

Design:

I propose to handle this as the first step by registering a custom Python class that implements both the lldb_private::Option object and the command execution.

  1. To define the command, I use Python arrays of SBStructuredData::Dictionary to carry the definitions. So you implement two APIs:
    get_options_definition(options_dict_array, unused)
    get_argument_definition(arg_dict_array, unused)

The options and argument dictionaries have to have these keys (we don’t require that they only have these keys):

Options:

        short_option: one character, must be unique, not required
        long_option: no spaces, must be unique, required
        usage: a usage string for this option, will print in the command help
        required: if true, this option must be provided or the command will error out
        groups: which "option groups" this option belongs to
        value_type: one of the lldb.eArgType enum values.  Some of the common arg
                    types also have default completers, which will be applied automatically.
        completion_type: currently these are values from the lldb.CompletionType enum;
                    I haven't done custom completions yet.
        enum_values: an array of pairs: ["element_name", "element_help"].  If provided,
                    the option's value must be one of the enum elements.  The value will be
                    the element_name for the chosen enum element as a string.

Arguments:

        argument_type: one of the lldb.eArgType enum values
        repeat: a stringified version of lldb_private::ArgumentRepetitionType
            Note: at present lldb does not do much with the ArgumentRepetitionType
            so I just copied over what we do now.  This area needs better design on
            the lldb side before I can do much useful in the bindings.
        group: this is the same group specification as the options, but lldb handles
            this much less well than the options version...
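
To make the key lists concrete, here is roughly what a set of definitions could look like as plain Python dicts. This is only a sketch: the integer constants stand in for real lldb.eArgType values so it runs outside lldb, the `"plain"` repeat string is a guess at the spelling, and `check_options` is a hypothetical helper that applies only the rules stated in the option key list.

```python
# Stand-ins for lldb.eArgType values; inside lldb you would use the real enum.
ARG_TYPE_BOOLEAN = 1
ARG_TYPE_FILENAME = 2

options = [
    {
        "short_option": "v",
        "long_option": "verbose",
        "usage": "Print extra detail.",
        "required": False,
        "value_type": ARG_TYPE_BOOLEAN,
    },
    {
        "long_option": "output-file",  # short_option is not required
        "usage": "Where to write the report.",
        "required": True,
        "value_type": ARG_TYPE_FILENAME,
    },
    {
        "short_option": "f",
        "long_option": "format",
        "usage": "Output format.",
        "required": False,
        # Pairs of [element_name, element_help].
        "enum_values": [["text", "Plain text"], ["json", "JSON"]],
    },
]

arguments = [
    # The exact spelling of the repeat string is an assumption.
    {"argument_type": ARG_TYPE_FILENAME, "repeat": "plain"},
]


def check_options(defs):
    """Return a list of problems found in a list of option definitions."""
    problems, seen_long, seen_short = [], set(), set()
    for d in defs:
        long_name = d.get("long_option")
        if not long_name:
            problems.append("long_option is required")
            continue
        if " " in long_name:
            problems.append(f"{long_name!r}: long_option must not contain spaces")
        if long_name in seen_long:
            problems.append(f"{long_name!r}: duplicate long_option")
        seen_long.add(long_name)
        short = d.get("short_option")
        if short is not None:
            if len(short) != 1:
                problems.append(f"{long_name!r}: short_option must be one character")
            if short in seen_short:
                problems.append(f"{long_name!r}: duplicate short_option {short!r}")
            seen_short.add(short)
    return problems


print(check_options(options))
```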

  2. Parsing the options is handled in two steps, as in lldb itself. First, your Options object gets called with:
        option_parsing_starting: reset your option values to their defaults here

Then for each option specified on the command-line, we call:

         bool set_option_value(exe_ctx, long_name, string_value)

passing you the current SBExecutionContext, the long name of the option (since short names are optional) and the new value (as a string). Return True if the option was set correctly and False otherwise. It might be better to return an SBError here so we can do error reporting better…
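
As a sketch of the two-step protocol, an Options object might look something like this. The `FrobOptions` class and its `count`/`verbose` options are hypothetical, and `drive` is only a stand-in for what lldb would do after parsing the command line; `option_parsing_starting` taking no arguments is an assumption.

```python
class FrobOptions:
    """Hypothetical Options object following the two-step protocol."""

    def option_parsing_starting(self):
        # Called before each parse: forget values left over from the last run.
        self.count = 1
        self.verbose = False

    def set_option_value(self, exe_ctx, long_name, string_value):
        # Values always arrive as strings; conversion is up to us.
        if long_name == "count":
            try:
                self.count = int(string_value)
            except ValueError:
                return False
            return True
        if long_name == "verbose":
            self.verbose = True
            return True
        return False  # unknown option


def drive(options, parsed_pairs, exe_ctx=None):
    """Stand-in for lldb: reset, then set each option, stopping at the first failure."""
    options.option_parsing_starting()
    return all(options.set_option_value(exe_ctx, name, value)
               for name, value in parsed_pairs)


opts = FrobOptions()
ok = drive(opts, [("count", "3"), ("verbose", "")])
print(ok, opts.count, opts.verbose)
```

With only a bool coming back, a bad value such as `--count many` can be rejected but not explained, which is the argument for returning an SBError instead.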

  3. Then to execute the command, implement a call function like:
    __call__(self, debugger, args_array, exe_ctx, result)

The args_array will have each argument as an element of the array, in the order they were specified on the command line. Other than that, this works the same way as the standard Python command execution function.
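
A minimal sketch of the execution step, under some loud assumptions: a plain Python list stands in for args_array, `FakeResult` stands in for the SBCommandReturnObject, and the `EchoCommand` class with its `upper` option is invented for illustration.

```python
class FakeResult:
    """Hypothetical stand-in for lldb.SBCommandReturnObject."""

    def __init__(self):
        self.messages = []

    def AppendMessage(self, text):
        self.messages.append(text)


class EchoCommand:
    """Hypothetical parsed command; option values are assumed to be set already."""

    def __init__(self):
        self.upper = False  # would normally be set through set_option_value

    def __call__(self, debugger, args_array, exe_ctx, result):
        # The options have been consumed by now: args_array holds only
        # the arguments, in command-line order.
        for index, arg in enumerate(args_array):
            text = arg.upper() if self.upper else arg
            result.AppendMessage(f"{index}: {text}")


cmd = EchoCommand()
cmd.upper = True
result = FakeResult()
cmd(None, ["alpha", "beta"], None, result)
print(result.messages)
```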

Python Implementation:

This is pretty bare-bones, and not terribly “Pythonic” to use, but that’s easier handled on the Python side. For that, I made up an LLDBOVParser class and a ParsedCommand class that uses it. The ParsedCommand makes an instance of the LLDBOVParser to handle steps 1 & 2; it routes the option definition and setting to its LLDBOVParser, and then implements the __call__ interface itself.

The option definitions for LLDBOVParser extend the basic API; the full dictionary adds two elements:

        default: the initial value for this option (if it has a value)
        varname: the name of the property that gives you access to the value for
                 this option.  Defaults to the long option if not provided.

From those it can handle option_parsing_starting automatically, and will record the string values of options. Again, even though lldb has option value types, internally it doesn’t automatically convert or validate values. I added some convenience methods to LLDBOVParser to do this for you, but those should be considered placeholders until lldb automates more of this on its end. If your class provides a “translate_value” method that takes a lldb::eArgType and a value, the LLDBOVParser will use that to convert your value. I haven’t plumbed error handling through in the first patch, but we should pass SBErrors here so that we can do better error handling.
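
To illustrate how `default`, `varname` and a translate_value hook could fit together, here is a small stand-alone sketch. It is not the actual LLDBOVParser code: `MiniOVParser`, the `translate` function and the `ARG_TYPE_COUNT` stand-in for an lldb.eArgType value are all hypothetical.

```python
ARG_TYPE_COUNT = 1  # stand-in for an lldb.eArgType value


class MiniOVParser:
    """Illustrative only; not the actual LLDBOVParser implementation."""

    def __init__(self, definitions, translate_value=None):
        self.definitions = {d["long_option"]: d for d in definitions}
        self.translate_value = translate_value
        self.option_parsing_starting()

    def _varname(self, definition):
        # varname falls back to the long option name when not provided.
        return definition.get("varname", definition["long_option"])

    def option_parsing_starting(self):
        # Resetting to defaults needs no per-command code.
        for definition in self.definitions.values():
            setattr(self, self._varname(definition), definition.get("default"))

    def set_option_value(self, exe_ctx, long_name, string_value):
        definition = self.definitions.get(long_name)
        if definition is None:
            return False
        value = string_value
        if self.translate_value is not None:
            value = self.translate_value(definition.get("value_type"), string_value)
        setattr(self, self._varname(definition), value)
        return True


def translate(arg_type, value):
    """Hypothetical conversion hook: turn count strings into ints."""
    return int(value) if arg_type == ARG_TYPE_COUNT else value


parser = MiniOVParser(
    [
        {"long_option": "count", "value_type": ARG_TYPE_COUNT, "default": 1},
        {"long_option": "log-file", "varname": "log_file", "default": None},
    ],
    translate,
)
parser.set_option_value(None, "count", "5")
parser.set_option_value(None, "log-file", "/tmp/frob.log")
print(parser.count, parser.log_file)
```

Giving `log-file` an explicit varname matters here because the long option name is not a valid Python attribute name.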

  4. Completions are not fully fleshed out, because we really should have more support for auto-generating completions in the ParsedCommand. I don’t want everyone to implement completions for common objects by hand; we should use the CommonCompleters automatically. I did add code in the LLDBOVParser that routes the eArgTypes to the completers that make sense, and for the first version we should do something as a placeholder for arguments. Perhaps as a first go, if there’s only one argument type which has a common completer, we can automatically wire that up.
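
One way to read the "only one argument type with a common completer" idea, sketched with string stand-ins for the lldb.eArgType and lldb.CompletionType values. The table contents and the `default_argument_completer` helper are assumptions for illustration, not what the patch does.

```python
# String stand-ins for lldb.eArgType and lldb.CompletionType values.
ARG_SOURCE_FILE, ARG_SYMBOL, ARG_COUNT = "source-file", "symbol", "count"
COMPLETE_SOURCE_FILE, COMPLETE_SYMBOL = "source-file-completer", "symbol-completer"

# Hypothetical routing table from argument type to common completer.
COMMON_COMPLETERS = {
    ARG_SOURCE_FILE: COMPLETE_SOURCE_FILE,
    ARG_SYMBOL: COMPLETE_SYMBOL,
}


def default_argument_completer(argument_defs):
    """Pick a completer only when exactly one common completer applies."""
    candidates = {
        COMMON_COMPLETERS[d["argument_type"]]
        for d in argument_defs
        if d["argument_type"] in COMMON_COMPLETERS
    }
    # Zero or several candidates: leave completion unset rather than guess.
    return candidates.pop() if len(candidates) == 1 else None


print(default_argument_completer(
    [{"argument_type": ARG_SOURCE_FILE}, {"argument_type": ARG_COUNT}]
))
```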

If anyone else is looking for what the user side looks like, see test_commands.py.

Should we just expand the name here? It’s not like the SBAPI names are succinct, and it’s not very obvious (no reverse pun intended) what OV is.

Otherwise I like the idea. The last debugger I worked on did this in pure Python and came out very similar, including the use of __call__ for the main work. So the user side of it seems friendly enough.

With this in place, would there be any major difference between what someone could define in Python and C++? Discounting things that just don’t exist in the SBAPI that is.

About the name, I don’t think users ever need to type this, it’s all internal to the class your command inherits from. So the name can be as verbose as we want it to be…

The plan is that once I’m done you won’t be able to tell the difference between C++-based parsed commands and Python-implemented ones.

I haven’t finished working out how completion should work yet. In lldb proper we don’t handle completions automatically very well, even though in a lot of cases we have the information to do so. There are a lot of custom “HandleCompletion” functions that just wire up completion for an argument that’s of type eArgTypeSourceFile, etc.

You can see in the parsed_cmd.py file that I do some work to auto-route argument types to the built-in completer for them, but that’s not really the right way to do it; we should handle that in the parser w/o requiring user intervention. So with this patch we’re not yet done with the plan, but finishing it will be a lot more change, and this patch is already big enough…

I also need to add a custom completion callback. You shouldn’t need this for common cases, but sometimes it comes in really handy, for instance if you do:

(lldb) break set -s foo.dylib -n f

lldb is smart enough to only look in foo.dylib to complete f.
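
A custom completion callback of that kind might look roughly like the sketch below. Everything in it is hypothetical: the `SYMBOLS` table stands in for lldb's real symbol lookup, and the callback shape (a prefix plus the option values parsed so far) is just one possible design.

```python
# Hypothetical symbol table: module name -> function names.
SYMBOLS = {
    "foo.dylib": ["find", "flush", "parse"],
    "bar.dylib": ["fetch", "format"],
}


def complete_function_name(prefix, option_values):
    """Complete a function name, narrowed by an already-parsed shlib option."""
    module = option_values.get("shlib")
    if module is None:
        modules = list(SYMBOLS)  # no -s given: search every module
    else:
        modules = [module] if module in SYMBOLS else []
    return sorted(
        name
        for m in modules
        for name in SYMBOLS[m]
        if name.startswith(prefix)
    )


print(complete_function_name("f", {"shlib": "foo.dylib"}))
print(complete_function_name("f", {}))
```

The point of the design is that the completer sees the options already on the command line, so an earlier option can narrow the candidates for a later one.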