Host vs. HostInfo

I’ve had some questions (both privately and in responses to other messages on-list) about Host and HostInfo. So I’ll explain here in hopes of answering for everyone.

First, the rationale: Host was getting too big and was starting to turn into something like “well, I need to call a platform-specific API, I’ll make it a static method in Host.cpp”. As the platform matrix grows, so does the complexity of managing this file. Second, there was no attempt in Host.cpp to group logically similar methods together in classes. There were filesystem methods, process-spawning methods, thread-manipulation methods, methods to query the value of various OS magic numbers, etc. As I started thinking about what would need to happen to support various things on Windows, I imagined this file exploding in complexity. Even with HostWindows.cpp, you can see from looking at Host.cpp that it’s not always possible or easy to separate platform-specific logic into the platform-specific Host files.

So this refactor attempts to address all of these issues.

So far I’ve focused on (and mostly completed) moving code from Host into two different classes: FileSystem and HostInfo.

HostInfo - answers queries about the operating system that LLDB is running on. Think of this class as being “const”. It doesn’t modify your OS. If you want to know how much memory is available, or the page size, or the path to lldb.exe, you ask HostInfo. Instead of #include “Host.h” and writing Host::method(), you #include “HostInfo.h” and write HostInfo::method(). When adding new methods, put your method in the least-derived class possible where it makes sense and will compile on all corresponding platforms.

The advantages of this approach are:
a) No matter what host OS you’re on, you always have all the functionality of that host OS available to you through static binding (e.g. no casting to a derived type)
b) Almost zero pre-processor complexity
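
To make the layering concrete, here is a minimal sketch of the idea, not the actual LLDB source. The class names HostInfoBase, HostInfoPosix and HostInfoWindows, the namespace, and the two query methods are assumptions chosen for illustration. The point is that a query lands in the least-derived class where it compiles everywhere, and a single alias picks the most-derived class for the current host, so callers get everything through static binding.

```cpp
#include <cstddef>

#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

namespace hostinfo_sketch { // hypothetical namespace, for illustration only

// Queries that compile on every host live in the base class.
class HostInfoBase {
public:
  static std::size_t GetPointerSize() { return sizeof(void *); }
};

#if defined(_WIN32)
// Queries that need the Win32 API live in the Windows class.
class HostInfoWindows : public HostInfoBase {
public:
  static std::size_t GetPageSize() {
    SYSTEM_INFO info;
    ::GetSystemInfo(&info);
    return static_cast<std::size_t>(info.dwPageSize);
  }
};
// The one place where the host is selected.
using HostInfo = HostInfoWindows;
#else
// Queries shared by all POSIX hosts live in the POSIX class.
class HostInfoPosix : public HostInfoBase {
public:
  static std::size_t GetPageSize() {
    long size = ::sysconf(_SC_PAGESIZE);
    return size > 0 ? static_cast<std::size_t>(size) : 0;
  }
};
// The one place where the host is selected.
using HostInfo = HostInfoPosix;
#endif

} // namespace hostinfo_sketch
```

A caller then writes hostinfo_sketch::HostInfo::GetPageSize() or hostinfo_sketch::HostInfo::GetPointerSize() without any casts, and the only preprocessor check is the one that selects the alias.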

FileSystem - Has methods like MakeDirectory, RemoveDirectory, GetPermissions, etc. Where before you would #include “Host.h” and write Host::MakeDirectory(), now you #include “FileSystem.h” and write FileSystem::MakeDirectory(…).
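
As a rough illustration of what such a class could look like in a POSIX implementation file, here is a sketch. The signatures, the return types and the namespace are assumptions; only the three method names come from the description above. A Windows implementation would expose the same static methods on top of Win32 calls.

```cpp
#include <cerrno>
#include <cstdint>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

namespace filesystem_sketch { // hypothetical namespace, for illustration only

class FileSystem {
public:
  // Creates one directory with the given permission bits.
  // Also succeeds if the directory already exists.
  static bool MakeDirectory(const std::string &path, std::uint32_t mode) {
    if (::mkdir(path.c_str(), static_cast<mode_t>(mode)) == 0)
      return true;
    if (errno != EEXIST)
      return false;
    struct stat st;
    return ::stat(path.c_str(), &st) == 0 && S_ISDIR(st.st_mode);
  }

  // Removes an empty directory.
  static bool RemoveDirectory(const std::string &path) {
    return ::rmdir(path.c_str()) == 0;
  }

  // Reads the user/group/other permission bits of a path.
  static bool GetPermissions(const std::string &path, std::uint32_t &mode) {
    struct stat st;
    if (::stat(path.c_str(), &st) != 0)
      return false;
    mode = static_cast<std::uint32_t>(st.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO));
    return true;
  }
};

} // namespace filesystem_sketch
```

Call sites read the same way as before the move, e.g. filesystem_sketch::FileSystem::MakeDirectory(path, 0700), just with a narrower header to include.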

Remaining work to be done:

  1. Nuke DynamicLibrary and use LLVM’s
  2. Make a HostProcess instantiatable, non-static class, which represents a process which is running on the Host OS. Move code from Host.cpp over there.
  3. Make a HostThread instantiatable, non-static class, which represents a thread inside of a process on the Host OS. Move code from Host.cpp over there.
  4. Make a HostProcessLauncher class, of which derived implementations would be WindowsProcessLauncher, PosixSpawnProcessLauncher, XpcProcessLauncher, etc. Move code from Host.cpp over there.
  5. Update Process plugins to use the appropriate HostProcessLauncher classes
  6. Delete Host.cpp, as there will be no code left in it anymore.
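
Item 4 could be sketched roughly as follows. This is only a guess at the shape of the interface: the LaunchInfo struct, the LaunchProcess signature, the WaitForExit helper and the namespace are all hypothetical, and only the class names HostProcessLauncher and PosixSpawnProcessLauncher come from the list above. The POSIX launcher uses posix_spawnp, so this block only builds on POSIX hosts.

```cpp
#include <spawn.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <vector>

extern char **environ;

namespace launcher_sketch { // hypothetical namespace, for illustration only

// Hypothetical description of what to launch.
struct LaunchInfo {
  std::vector<std::string> arguments; // arguments[0] is the program name
};

// Abstract launcher; each platform or launch strategy derives from it.
class HostProcessLauncher {
public:
  virtual ~HostProcessLauncher() {}
  // Returns the new process id, or -1 on failure.
  virtual long LaunchProcess(const LaunchInfo &info) = 0;
};

class PosixSpawnProcessLauncher : public HostProcessLauncher {
public:
  long LaunchProcess(const LaunchInfo &info) override {
    if (info.arguments.empty())
      return -1;
    // Build a null-terminated argv array that points into the strings.
    std::vector<char *> argv;
    for (const std::string &arg : info.arguments)
      argv.push_back(const_cast<char *>(arg.c_str()));
    argv.push_back(nullptr);

    pid_t pid = -1;
    int err = ::posix_spawnp(&pid, argv[0], nullptr, nullptr, argv.data(),
                             environ);
    return err == 0 ? static_cast<long>(pid) : -1;
  }
};

// Helper for the usage example: waits for the child and returns its exit
// code, or -1 if it did not exit normally.
inline int WaitForExit(long pid) {
  int status = 0;
  if (::waitpid(static_cast<pid_t>(pid), &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

} // namespace launcher_sketch
```

A Process plugin (item 5) would then hold a HostProcessLauncher reference and call LaunchProcess() on it, without knowing whether the object behind it is a posix_spawn, XPC or Windows launcher.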

As a final cleanup to these changes, can we get rid of the preprocessor macro in HostInfo::GetLLDBPath() and change over to using std::call_once, or switch to using a static HostInfo::Initialize()?

I actually did switch over to using a HostInfo::Initialize() as per our discussion the other day; I just didn’t think to use it for GetLLDBPath(). It’s a good suggestion though, so I’ll add it to my list of things to do. Feel free to remind me if enough time passes and I still haven’t gotten to it.
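
For reference, the Initialize() variant might look something like the sketch below. It is not the real HostInfo code: the State struct, ComputeLLDBPath(), GetComputeCount(), the placeholder path and the namespace are all made up for illustration, and the sketch assumes Initialize() is called once from single-threaded startup code before any query. The std::call_once alternative would instead compute the path lazily on first use.

```cpp
#include <string>

namespace init_sketch { // hypothetical namespace, for illustration only

class HostInfo {
public:
  // Computes and caches the expensive values. Safe to call more than once,
  // but assumed to run before any other thread queries HostInfo.
  static void Initialize() {
    State &state = GetState();
    if (state.initialized)
      return;
    state.lldb_path = ComputeLLDBPath();
    ++state.compute_count;
    state.initialized = true;
  }

  // Returns the cached value; empty if Initialize() was never called.
  static const std::string &GetLLDBPath() { return GetState().lldb_path; }

  // Only here so the sketch can show the path is computed a single time.
  static int GetComputeCount() { return GetState().compute_count; }

private:
  struct State {
    bool initialized = false;
    std::string lldb_path;
    int compute_count = 0;
  };

  static State &GetState() {
    static State state;
    return state;
  }

  // Placeholder for the platform-specific lookup.
  static std::string ComputeLLDBPath() { return "/placeholder/path/to/lldb"; }
};

} // namespace init_sketch
```

With this shape, GetLLDBPath() itself contains no platform checks and no locking; all of that is confined to whatever ComputeLLDBPath() does on each host.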

Hey Zachary,

On this part:

  3. Make a HostThread instantiatable, non-static class, which represents a thread inside of a process on the Host OS. Move code from Host.cpp over there.

We have a lower-level NativeThreadProtocol and NativeProcessProtocol concept that we’re developing. They are low-level pieces that can be instantiated for supported platforms (currently Linux x86_64, soon others, including eventually Apple-specific when they go to llgs). Right now lldb-gdbserver (llgs) is the user of these. If we were going to keep local Linux debugging with ProcessLinux/ProcessMonitor, those would be rewritten in terms of NativeProcessProtocol/NativeThreadProtocol/NativeRegisterContext, but as it stands we’ll be deprecating those when we have local Linux debugging through llgs running on all the Linux variants that the other approach currently supports.

Just wanted to mention this since it seems like there may be some cross-talk over that area you mentioned in #3.

Note that NativeThreadProtocol is a lower-level concept than Thread, and likewise for NativeProcessProtocol and Process. Thread and Process intertwine with some heavier, higher-level lldb concepts like thread plans, public/private state, etc. The Native* classes are very low level and don’t by themselves provide all the services that LLDB needs; LLDB will build on top of them.

Right now the Native* classes only get used by llgs.

Greg might have more thoughts on this.

-Todd

I imagine it should all work together well. HostThread and HostProcess are basically substitutes for a tid and a pid, respectively, with convenience methods attached to them. No matter whether we’re doing local or remote debugging through llgs or anything else, someone somewhere is going to have to directly manipulate a thread or a process on their OS (even if it’s llgs), and that’s what you would use this for. Note that in the case of Windows, I’m not planning on tackling remote debugging any time soon and will start with local since it’s simpler to get something up.

  Note that in the case of Windows, I’m not planning on tackling remote debugging any time soon and will start with local since it’s simpler to get something up.

Cool. The reason I brought that up is I had initially considered pushing UnixSignals down closer to a POSIX-specific level, but there are a number of places that are OS/arch-agnostic (like ProcessGDBRemote) that need to deal with these. So keeping the UnixSignals concept at the Process level seems to make more sense.

The only question there is really about naming, but the resolution to that in a non-Unix-centric way really requires figuring out whether Windows opts in to a gdb-remote-style llgs-based remote debugging support. Since that’s not on the radar, I’ll just ignore that for now.

Yeah, the signals stuff is tricky because if you’re on Windows remote debugging a Linux machine, you should be able to send a signal to the process, even though signals aren’t a concept on Windows. So I think UnixSignals handles that nicely. It also lives in the Target layer, which is platform-agnostic by definition, so it shouldn’t be a problem. On the other hand, I don’t anticipate that the HostProcess class implemented in Host/windows/HostProcess.cpp will have a method called Signal(), for example, because that represents a process running locally on a Windows OS somewhere, where the concept doesn’t exist.

This is another example of where the preprocessor check should be done at a different level. The person interfacing with the Host layer should know what Host they’re interfacing with, otherwise they’ll do things that don’t make sense. So at least in my ideal world, the Host layer implements exactly the set of functionality it supports, and in cases where things differ, it’s up to the person writing Host::method() to make sure they’re writing something that makes sense in the context.
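
A small sketch of that idea, under heavy assumptions: the class names HostProcessPosix and HostProcessWindows, the Current() and GetProcessId() helpers, the RequestInterrupt() caller and the namespace are invented here, and only HostProcess and Signal() come from the discussion. The POSIX class offers Signal(); the Windows class simply does not have it, and the preprocessor check moves to the caller.

```cpp
#include <cstdint>

#if defined(_WIN32)
#include <windows.h>
#else
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>
#endif

namespace process_sketch { // hypothetical namespace, for illustration only

#if defined(_WIN32)
// A pid with convenience methods. There is deliberately no Signal() here,
// because the concept does not exist on this host.
class HostProcessWindows {
public:
  explicit HostProcessWindows(std::uint64_t pid) : m_pid(pid) {}
  static HostProcessWindows Current() {
    return HostProcessWindows(::GetCurrentProcessId());
  }
  std::uint64_t GetProcessId() const { return m_pid; }

private:
  std::uint64_t m_pid;
};
using HostProcess = HostProcessWindows;
#else
// A pid with convenience methods, including the POSIX-only Signal().
class HostProcessPosix {
public:
  explicit HostProcessPosix(std::uint64_t pid) : m_pid(pid) {}
  static HostProcessPosix Current() {
    return HostProcessPosix(static_cast<std::uint64_t>(::getpid()));
  }
  std::uint64_t GetProcessId() const { return m_pid; }

  // Sends signo to the process. Returns true on success.
  bool Signal(int signo) const {
    return ::kill(static_cast<pid_t>(m_pid), signo) == 0;
  }

private:
  std::uint64_t m_pid;
};
using HostProcess = HostProcessPosix;
#endif

// The caller, not the Host layer, decides what makes sense on this host.
inline bool RequestInterrupt(const HostProcess &process) {
#if defined(_WIN32)
  (void)process;
  return false; // a real caller would pick a Windows mechanism here
#else
  return process.Signal(SIGINT);
#endif
}

} // namespace process_sketch
```

Code that calls Signal() unconditionally fails to compile on Windows instead of hitting a stub that silently does nothing at run time.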