Native Windows debugging support

I’ve started experimenting with adding support to LLDB for debugging native Windows executables on Windows, that is, Windows host and Windows target. I’ve done a few small cleanup tasks here and there and picked off some low-hanging fruit, and I’d like to move on to something more meaty.

I took a look at what it would take to get “platform process list” to work. The first thing I noticed is that all of the process info objects contain the notion of a UID and GID, a concept which doesn’t really exist on Windows. An analogous concept exists, but it’s represented completely differently.

My question is: how best to abstract out this functionality? I’m still not totally clear on where I’m allowed to use platform-specific types and APIs and where the code needs to be platform-agnostic. My first thought is to remove UID and GID from the ProcessInfo class and replace them with an instance of a “ProcessUserId” class, then provide a PosixProcessUserId and a WindowsProcessUserId which abstract away the differences.

Assuming this approach is logical, where is the best place for this code to go? Host or Target?

Anything else I should be aware of?

The way to think about this distinction is to ask yourself whether you would need this feature to do remote debugging from one OS to a different OS. In principle, lldb should be able to cross-debug from any OS to any other.

In this case, the way users and groups are represented is a feature of a process controlled by the debugger. So the knowledge of how that works can't live in Host; it has to live in Target. Note also that while Host code is only required to build on the platform that you are running lldb on, features of a process have to build on all architectures, again to support cross-debugging.

Jim

And really, this is more a feature of the Platform than the Target, since you might want it to do things like properly display process info when doing "platform process list".

It does look like the UserID is exposed through the SB APIs. We can't really remove SB APIs, at least not until we do a grand review and declare SB API version 2, since we don't know who is using them (though in this case I am pretty sure Xcode IS using this API...). It is probably okay to have this return an error on Windows, and then add a better API using your new class.

BTW, ProcessUserID isn't a great name, since it makes it sound like it doesn't also include the GroupID. Isn't the joke "There are only two hard things in CS, cache invalidation, naming things and off by one errors?"

Jim

Can we just make a UserID class and GroupID class which contains both an integer and a string (and anything else that is required) and use that? What does windows have as far as UID and GID goes? Will strings suffice? Does it need more? As you can tell we are very unix centric right now, but we do want to abstract. So I would go the route of making a UserID and a GroupID class, everyone uses these classes and these classes need to be able to store everything that all of our classes require. My simple guess would be:

class UserID {
   std::string m_name;
   lldb::user_id_t m_identifier;
};

Same kind of thing for GroupID.

Greg

So using the UID and GID example, these concepts are not relevant for any Windows target. However, knowledge of them is actually embedded into the command options. For example, you can write “platform process list -U ”. Running “platform process list -U” would therefore not make sense on a Windows host with no target, or on a non-Windows host remote-debugging a Windows target.

Are there any examples of commands which accept different command options depending on the target platform? If not, are there any objections to me adding this kind of functionality?

We’ll probably end up needing to support something like this:
https://en.wikipedia.org/wiki/Security_Identifier

Most of the access control on Windows ends up revolving around those.

I’m not sure why I didn’t get the original email from Greg here (?) I’m only seeing this copied in Todd’s response.

In any case, Windows’ concept of a user id and group id is a Security Identifier, as Todd mentioned (usually called a SID). Generally though I don’t think it’s necessary to pass around the SID, because the target can figure out the SID given a username.

Still, it feels a little awkward having a single class store everything that might be needed for any platform. Normally I’d expect to use polymorphic types in this scenario. A NativeProcessLinux, for example, could freely cast a UserId to a PosixUserId, and a NativeWindowsProcess could cast a UserId to a WindowsUserId.
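
As a rough illustration of the polymorphic approach described above, here is a minimal sketch. Every name in it (UserId, PosixUserId, WindowsUserId, Describe, AsPosix) is hypothetical and not an existing LLDB type. It uses dynamic_cast for brevity; a real patch would more likely follow LLVM-style casting conventions.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical base class: the only thing platform-agnostic code sees.
class UserId {
public:
  virtual ~UserId() = default;
  // Human-readable form, e.g. for "platform process list" output.
  virtual std::string Describe() const = 0;
};

// Hypothetical POSIX flavor: a numeric uid plus a resolved name.
class PosixUserId : public UserId {
public:
  PosixUserId(uint32_t uid, std::string name)
      : m_uid(uid), m_name(std::move(name)) {}
  uint32_t GetUid() const { return m_uid; }
  std::string Describe() const override {
    return m_name + " (" + std::to_string(m_uid) + ")";
  }

private:
  uint32_t m_uid;
  std::string m_name;
};

// Hypothetical Windows flavor: only an account name is carried around,
// on the assumption that the target can look up the SID when needed.
class WindowsUserId : public UserId {
public:
  explicit WindowsUserId(std::string account)
      : m_account(std::move(account)) {}
  const std::string &GetAccountName() const { return m_account; }
  std::string Describe() const override { return m_account; }

private:
  std::string m_account;
};

// Platform-specific code recovers the concrete type with a checked
// downcast; the result is nullptr if the id belongs to another platform.
const PosixUserId *AsPosix(const UserId &id) {
  return dynamic_cast<const PosixUserId *>(&id);
}
```

Generic code would only ever call Describe(), while something like a Linux-specific process plugin could call AsPosix() and read the numeric uid.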

There’s also the issue of command options, as I mentioned in the response to Jim. Basically, “platform process list” doesn’t even need a -U option on Windows. Or maybe even a stronger statement, it shouldn’t have a -U option. It’s easy to come up with scenarios where platforms differ significantly enough that the same set of options don’t even make sense, or where a certain platform provides sets of functionality not available on other platforms. For those cases, it would be nice for the debugger commands to be tailored to that specific platform.

So maybe my first task should be to work on the command options system a little bit to enable this type of abstraction. Thoughts?

In any case, Windows' concept of a user id and group id is a Security Identifier, as Todd mentioned (usually called a SID). Generally though I don't think it's necessary to pass around the SID, because the target can figure out the SID given a username.

So sounds like a username could be used for both windows and unix then for now?

Still, it feels a little awkward having a single class store everything that might be needed for any platform. Normally I'd expect to use polymorphic types in this scenario. A NativeProcessLinux, for example, could freely cast a UserId to a PosixUserId, and a NativeWindowsProcess could cast a UserId to a WindowsUserId.

The other option is to remove any --uid options and change them to "--user <string>" and the UserID classes for each platform that are polymorphic know how to translate a user entered string into a valid user name for that platform. So unix could accept:

--user 123

or

--user gclayton

But windows would only accept valid strings that represent a Security Identifier or a string based user name?

There's also the issue of command options, as I mentioned in the response to Jim. Basically, "platform process list" doesn't even need a -U option on Windows. Or maybe even a stronger statement, it shouldn't have a -U option. It's easy to come up with scenarios where platforms differ significantly enough that the same set of options don't even make sense, or where a certain platform provides sets of functionality not available on other platforms. For those cases, it would be nice for the debugger commands to be tailored to that specific platform.

So maybe my first task should be to work on the command options system a little bit to enable this type of abstraction. Thoughts?

If we abstract the UserID correctly we should be able to make sure they can construct themselves from a string + platform object correctly and avoid having to do any fancy options that appear/disappear based on which platform.

Comments?

Greg
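
To make the "--user <string>" idea above concrete, here is a minimal sketch of platform-dependent parsing. PlatformKind, ParsedUser and ParseUserOption are hypothetical names invented for illustration. The only Windows-specific assumption is that a SID in string form starts with "S-1-"; anything else is treated as an account name.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Hypothetical stand-in for "which platform is currently selected".
enum class PlatformKind { Posix, Windows };

// Hypothetical result of interpreting a --user argument.
struct ParsedUser {
  enum class Form { NumericId, Name, Sid };
  Form form = Form::Name;
  std::string text;        // the argument as the user typed it
  uint32_t numeric_id = 0; // meaningful only when form == NumericId
};

static bool IsAllDigits(const std::string &s) {
  if (s.empty())
    return false;
  for (unsigned char c : s)
    if (!std::isdigit(c))
      return false;
  return true;
}

// Returns false if the argument cannot be interpreted for the platform.
bool ParseUserOption(PlatformKind platform, const std::string &arg,
                     ParsedUser &out) {
  if (arg.empty())
    return false;
  out = ParsedUser();
  out.text = arg;

  if (platform == PlatformKind::Posix) {
    if (!IsAllDigits(arg)) {
      out.form = ParsedUser::Form::Name; // e.g. "--user gclayton"
      return true;
    }
    // At most 10 digits, so std::stoull cannot overflow or throw here.
    if (arg.size() > 10)
      return false;
    unsigned long long value = std::stoull(arg);
    if (value > UINT32_MAX)
      return false;
    out.form = ParsedUser::Form::NumericId; // e.g. "--user 123"
    out.numeric_id = static_cast<uint32_t>(value);
    return true;
  }

  // Windows: either a SID string or an account name.
  out.form = (arg.compare(0, 4, "S-1-") == 0) ? ParsedUser::Form::Sid
                                              : ParsedUser::Form::Name;
  return true;
}
```

With this shape the option table stays the same for every platform; only the interpretation of the string changes with the selected platform.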

One thing to keep in mind is that the same lldb session could be simultaneously debugging a MacOS app and a Windows app. So if you have command options that differ based on the platform (for example), then those will switch as the user switches from one target to another. If you want to tailor options to platforms, it can't be static; it will have to be determined based on the currently selected platform. As an aside, for sanity's sake we should really try not to overlap option short names for different things on the different platforms.

Then the way command objects work is that their options are baked into the object handed to the command interpreter when they are added to the interpreter. There's one command object for a given command name, registered at startup. I'm not sure how excited I am about swapping these things in and out as the currently selected Platform/Target/Process changes. For instance, you could very well be running a command with TargetA currently selected in the main lldb interpreter, and at the same time the process in TargetB could hit a breakpoint and want to run some commands for its breakpoint action... It would be a pain to juggle that sort of thing... It would be much simpler to keep all the options in one command object, and have the context in which the command is running determine the validity of the command's options.

OTOH, it would not be too hard to add "option validators" that would ask an execution context to vote yea or nay on an option, and then the generic interpreter command argument parsing could consult this and reject currently invalid options. The commands do get invoked with an execution context so all the right information would be there. Or we might go the static route and mark the option table with what platform/target/process they support, though we'd have to be careful not to make the option tables annoying to read and maintain. Also we would have to do something like this because the help text is currently auto-generated from the command objects, and you'd need to mark currently invalid options in the help text output.

Jim
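
The "option validator" idea above could look roughly like the following sketch. All types here (ExecutionContext, OptionDefinition, CheckOption) are simplified, hypothetical stand-ins rather than LLDB's real classes, and the option table is illustrative only.

```cpp
#include <functional>
#include <string>
#include <vector>

enum class PlatformKind { Posix, Windows };

// Hypothetical stand-in for the execution context a command is invoked with.
struct ExecutionContext {
  PlatformKind platform;
};

// One row of a hypothetical option table. The validator is the only
// addition: it votes yea or nay for the given context.
struct OptionDefinition {
  std::string long_name;
  char short_name;
  std::string help;
  std::function<bool(const ExecutionContext &)> validator; // empty: always valid
};

bool IsOptionValid(const OptionDefinition &opt, const ExecutionContext &ctx) {
  return !opt.validator || opt.validator(ctx);
}

bool PosixOnly(const ExecutionContext &ctx) {
  return ctx.platform == PlatformKind::Posix;
}

// Illustrative table, loosely modeled on "platform process list".
std::vector<OptionDefinition> MakeProcessListOptions() {
  return {
      {"name", 'n', "Match processes by name.", nullptr},
      {"uid", 'u', "Match processes by user ID.", PosixOnly},
      {"gid", 'g', "Match processes by group ID.", PosixOnly},
  };
}

// Generic argument parsing would call this for each option it sees.
// Returns false and fills in error if the option is unknown or rejected.
bool CheckOption(const std::vector<OptionDefinition> &table, char short_name,
                 const ExecutionContext &ctx, std::string &error) {
  for (const OptionDefinition &opt : table) {
    if (opt.short_name != short_name)
      continue;
    if (IsOptionValid(opt, ctx))
      return true;
    error = "option --" + opt.long_name +
            " is not valid for the current platform";
    return false;
  }
  error = std::string("unknown option -") + short_name;
  return false;
}
```

Because the table itself never changes, a single command object can serve every target, and the validity check happens per invocation with whatever context the command was given.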

platform process list has a number of distinct options though. There's
uid, gid, euid, and egid. All of that can be squashed into one string on
Windows, and as such only needs one command line option (perhaps -U, for
user)

But taking a step back, let's think about other things that are unrelated
to processes. Windows has something called SEH, or Structured Exception
Handling. It's kind of like C++ exceptions, except that the OS itself can
raise exceptions for anything from division by zero to access violations
and other faults. So down the line, a useful feature for a Windows debugger
might be to allow it to trap various SEH exceptions based on the exception
code. This would probably necessitate an entirely new command.

There are lots of other examples too. A few of them:

* Windows has "jobs" and other platforms have "cgroups". They're similar
but the semantics differ in a number of significant ways. Controlling
these would require a slightly different command set.

* Windows supports the concept of a symbol server, a way to automatically
download and cache symbols. No analogue on other platforms from what I
understand.

* Windows has no concept of signals, so any signal debugging functionality
would need to be disabled for a Windows target.

There are probably a lot of other examples. But I guess the point is that I
wonder whether trying to smash everything into the same interface is the
best solution. We might need a deeper generalization.

What about attaching a (possibly different) CommandInterpreter to each
Platform object? They could differ arbitrarily. So if TargetA hits a
breakpoint followed by TargetB, we would just go through each Target's
attached interpreter, and everything should "just work".

I have a bad feeling about this; it seems like it adds some tricky bits of complexity. For instance, the interpreters are not independent. A breakpoint action that did:

target select TargetB
some commands

will now have to swap command interpreters in mid-flight. You'd also have to teach the interpreters to share a Python interpreter (since you want them to share variables, etc., for cool-o cross-target debugging experiments).

We'd also have to add a layer of heuristics on the "settings set" command, since they are currently held by the interpreter, but often you would be surprised if a setting didn't apply to some new target because it didn't share the same interpreter.

I also think it would be weird to have the help text dependent on the currently selected target. That just seems like it would make things overly magical. I would like a place to go to see all the options, so I have some hope of finding that vaguely remembered one-time-it-was-useful option without having to manually create all the platforms/targets I ever made to see which one had it. I think it would be much better for the help to show all options and just indicate which ones were currently available.

Jim

Ok, I’ll think about this approach some more. It’s funny that I’m of the opposite mind when it comes to the dependent help text, as I would only ever be interested in seeing help for the selected target. Would it be ok to add an option to help (for example “help -t”) that hides options not relevant to the current target?

That seems fine to me. If we had some tool that did "extract all the help text for all the targets and organize them into something with a table of contents and an index" I probably wouldn't feel so strongly about this. But since the online help is currently the only source for this information I don't want to make it state-dependent...

Jim
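
The two help behaviors discussed above (show every option but mark the unavailable ones, or hide them on request) can be sketched as a single rendering function. HelpEntry and RenderHelp are hypothetical names, and the "available" flag is assumed to have been computed already by asking the currently selected platform or target.

```cpp
#include <string>
#include <vector>

// Hypothetical, already-evaluated view of one option for the help output.
struct HelpEntry {
  std::string option;      // e.g. "--uid"
  std::string description; // auto-generated help text
  bool available;          // valid for the currently selected target?
};

// only_current_target == false: list everything and flag what is unavailable.
// only_current_target == true:  hide unavailable options (the "help -t" idea).
std::string RenderHelp(const std::vector<HelpEntry> &entries,
                       bool only_current_target) {
  std::string out;
  for (const HelpEntry &e : entries) {
    if (!e.available && only_current_target)
      continue;
    out += "  " + e.option + "  " + e.description;
    if (!e.available)
      out += " [not available for the current target]";
    out += "\n";
  }
  return out;
}
```

Keeping the unfiltered listing as the default preserves one place where every option can be found, while the filtered form is opt-in.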