lldb -- architecture level question -- linux v. darwin

Dear Steve and lldb-dev,

(starting a new thread to give a better, more appropriate, thread title)

I’ve investigated a little more, and I am coming to understand that there is currently a big difference in how lldb launches target processes on darwin versus linux (see the comparative ps output in the details below). Perhaps it is done more directly on linux in order to avoid having to port debugserver?

Any insight into plans for lldb on linux would be appreciated. I do see there are significant architectural differences at the moment, in that lldb on darwin goes through the debugserver process. I wonder whether that is intended as a temporary quick fix or done with an eye to simplification. For example, I don’t know, but perhaps it is much simpler to ptrace directly on linux than to use a separate process? I’m pretty clueless here and just starting to explore lldb on linux, so feel free to point me toward background threads or information I might wish to review.

Thanks everyone!

cheers,
Jason

details:

On darwin, lldb starts a debugserver child process, and debugserver in turn starts a child process to contain the debug target a.out.

UID  PID   PPID
bash
 \_ 2086 42103 22537 0 0:00.07 ttyp0 0:00.16 …/lldb a.out
    \_ 2086 42104 42103 0 0:00.01 ?? 0:00.02 /Users/jaten/pkg/llvm+lldb+ldc/llvm-2.8/tools/lldb/build/Debug/LLDB.framework/Versions/A/Resources/debugserver localhost:11358 --native-regs --setsid
       \_ 2086 42105 42104 0 0:00.00 ?? 0:00.00 /Users/jaten/pkg/llvm+lldb+ldc/llvm-2.8/tools/lldb/build/Debug/test-lldb/a.out

On linux, at the moment, what I see in contrast is that lldb is the direct parent of the debug target (a.out here).

jaten 4532 0.0 0.3 25752 8160 pts/3 Ss 03:59 0:01 | _ /bin/bash --noediting -i
jaten 22645 0.6 0.8 148260 18256 pts/3 Sl+ 11:57 0:00 | | _ ./lldb test-lldb/a.out
jaten 22650 0.0 0.0 11728 960 pts/11 Ts+ 11:58 0:00 | | _ /home/jaten/pkg/latest-svn-llvm/llvm-build-r127600-with-lldb/Release+Debug+Profile+Asserts/bin/test-lldb/a.out

See my other reply to the previous thread on this subject.

To quickly summarize why we use debugserver on macosx:
- If we are always debugging remotely even when debugging locally, adding remote debugging for MacOSX comes for free and we don't need to test local vs remote debugging.
- Having a separate process be the parent of the inferior process helps to isolate the "lldb" binary from bad things that can happen during debugging.

Thank you Greg for the detailed reply. I read it with great interest.

The main thing that got me interested in LLDB in the first place was the potential to add the capability to JIT up code from the command line and inject that new code into a running process (after going through the Kaleidoscope tutorial and reading the post from Reid Kleckner quoted below).

To this end, I’m actually most interested in exploring the very un-remote direction, where the debugger and the debuggee are as close as possible, even in the same process space (as would probably be necessary). The aim is to get minimal re-compile times to support JIT-based everything: an incremental style of development with JIT support, plus simultaneous access to debugger-level ability to step through the just-JITed code. The goal would be something akin to the rapid LISP development environment with hot-swappable functions, but LLVM JIT-based rather than interpreted.

But perhaps my thoughts on adding this kind of feature to LLDB are not realistic. If LLDB always needs its target to be a separate process (how baked in is that assumption, by the way? I suspect that is really the key question), then I may not be able to implement the scheme described above, inspired by Reid Kleckner’s post below.

http://llvm.org/docs/DebuggingJITedCode.html

Written by Reid Kleckner

Background

Without special runtime support, debugging dynamically generated code with GDB (as well as most debuggers) can be quite painful. Debuggers generally read debug information from the object file of the code, but for JITed code, there is no such file to look for.

Depending on the architecture, this can impact the debugging experience in different ways. For example, on most 32-bit x86 architectures, you can simply compile with -fno-omit-frame-pointer for GCC and -disable-fp-elim for LLVM. When GDB creates a backtrace, it can properly unwind the stack, but the stack frames owned by JITed code have ??'s instead of the appropriate symbol name. However, on Linux x86_64 in particular, GDB relies on the DWARF call frame address (CFA) debug information to unwind the stack, so even if you compile your program to leave the frame pointer untouched, GDB will usually be unable to unwind the stack past any JITed code stack frames.

In order to communicate the necessary debug info to GDB, an interface for registering JITed code with debuggers has been designed and implemented for GDB and LLVM. At a high level, whenever LLVM generates new machine code, it also generates an object file in memory containing the debug information. LLVM then adds the object file to the global list of object files and calls a special function (__jit_debug_register_code) marked noinline that GDB knows about. When GDB attaches to a process, it puts a breakpoint in this function and loads all of the object files in the global list. When LLVM calls the registration function, GDB catches the breakpoint signal, loads the new object file from LLVM’s memory, and resumes the execution. In this way, GDB can get the necessary debug information.

At the time of this writing, LLVM only supports architectures that use ELF object files and it only generates symbols and DWARF CFA information. However, it would be easy to add more information to the object file, so we don’t need to coordinate with GDB to get better debug information.

An "in process" debugger in lldb would have to be a separate "Process" plugin from the one that we would normally use on Mac OS X or Linux to get around some bootstrapping problems, like for instance you can't ptrace yourself so the standard method of taking control of a process wouldn't work. Another example is that a breakpoint hit would be a real SIGTRAP signal in your program, so you'd have to handle the SIGTRAP in a signal handler and then generate a debugger event for it, rather than just getting the SIGTRAP through ptrace or a mach exception or whatever. And if I sat longer I bet I could come up with a bunch of other tricky bits you'd have to handle. So it's clear that if you wanted to do an "in process" debugger in LLDB you would have to do it as a separate type of process plugin. That said, I can't think of any reasons off the top of my head why this wouldn't work...

Another problem is that, at present, lldb assumes that when the program is being examined in the debugger, it is stopped altogether. You obviously can't do that if you're in process. We plan to implement a "no stop" mode where you would stop the thread that hit a breakpoint, say, but not the other threads. But that's a ways off yet. You could probably fudge this by having your "in process" process plugin know that stopping & restarting actually means suspending & resuming some subset of the program's threads, however.

In sum, I don't think that "in process" debugging as you are suggesting is impossible, but it would not fall out directly from the low level bits that do "other process" debugging. It would require some separate effort.

OTOH, I'm not quite clear why having the debugger in-process is essential to your stated goals? What do you gain by having the process control parts of the development environment be in the same process as the newly injected code? The article you quoted is about injecting debug information along with newly injected code. Having the debugger in process is not going to remove that requirement, and conversely once it is there, the debugger should be able to find it whether in process or not.

Jim

Jim, thank you - that makes a lot of sense. I hadn’t thought through the signal implications. And re-reading Reid’s post, he does make it clear that the JIT-code injection is in fact part of an interprocess communication.

The question then becomes: does the DNB.h protocol support the JIT-code injection, and if not, could that be made a part of it?

1 - Modifying DNB.h to make sure it does all we need it to for all platforms
2 - Grab the code from within ProcessLinux and put it in a new DNB.cpp implementation for linux
3 - Make the ProcessLinux code use the DNB.h interface
4 - Reuse DNB.cpp/.h for linux in a new or modified version of “debugserver”

Greg’s outline sounds reasonable to me (added thoughts?), but I’d like to know if it advances the JIT-code injection objective.

Thanks guys!

Jason

Yup, it works similar to the way debuggers find out about dynamically
loaded libraries. There's a particular loader stub that gets called
after every library load or unload. Debuggers put a breakpoint on it
to stop the inferior process and re-read the list of loaded libraries
with remote memory examination routines, so there's your (hacky) IPC.

Reid

Thank you Reid, that makes a lot of sense now. Is there an API to the JIT functionality? If so,
could you point out where in the code base it lives, and if there are examples of clients (how to
use it?)

Thanks!
Jason

I'm not sure what side of the interface you need an API for. You can
find the LLVM source in
llvm/lib/ExecutionEngine/JIT/JITDebugRegisterer.(cpp|h). If your use
case is to do your own codegen and then throw some debug info over to
the debugger, you'll have to be able to generate it (ie DWARF) and
wrap it in the native object file for your platform in memory. LLVM
has some code to produce ELF, so I used that.* At the time, I wasn't
worried about other platforms.

* Actually, that code for ELF generation is old and this interface is
the last user of it so far as I know.

Reid

If your use
case is to do your own codegen and then throw some debug info over to
the debugger, you’ll have to be able to generate it (ie DWARF) and
wrap it in the native object file for your platform in memory. LLVM
has some code to produce ELF, so I used that.* At the time, I wasn’t
worried about other platforms.

  • Actually, that code for ELF generation is old and this interface is
    the last user of it so far as I know.

Great! Yes, that is exactly my use case: Suppose I am stopped in the middle of code, in LLDB,
with a variable that has a debugger symbol for a collection that needs to be sorted differently than it is at the moment, in order
to determine if the code is producing the correct collection.

So I’d like to be able to define a new sorting function from the command line (like the Clang Expression machinery lets one do, I think), but
in this case in a non-C++ language that I have a compiler for. Then JIT-compile that new “how to sort” function, have it talk to already-defined
data and code, and have it run from within the debugger. After the sort, I’d like to be able (from LLDB) to re-inspect the
output of the collection.

Does this make sense? Are there obstacles that I’m not envisioning? Any critique or commentary welcome.

Since I’m on Linux, ELF works perfectly, and I already have ELF codegen that produces ahead-of-time compiled ELFs today. I just need to figure out how to convert that to use the JIT engine instead, I presume. I assume I would then need to produce a dynamically loaded .so-format library in memory. Does that sound right?

Lastly, could the JITDebugRegisterer.(cpp|h) functionality be made a part of the DNB.h interface?

Thanks,
Jason

To make this be an "all lldb" solution, some parts of the way that lldb does expression evaluation would have to be enhanced. Right now, for instance, we compile, JIT and evaluate expressions, and then clean up after ourselves completely. We don't insert code and, for instance, put it in a nice bundle with symbols et al and tell the linker about it so anybody else could reuse it. We just inject a raw blob of code, set the pc somewhere and go. So even if we didn't clean up the injected code once the execution is complete, there isn't any way to reuse the functionality you've compiled up. Actually this is a bit of a lie, there are a couple of fairly special purpose mechanisms to do this (ClangFunction, and ClangUtilityFunction) but you have to be inside lldb to use them...

We also don't generate debug information, 'cause in this context it wouldn't make sense to. And there isn't a way to say "evaluate this expression in language X" if X is not C/C++/ObjC…

We certainly want to come up with a clean way to reuse functions defined in "expressions" from lldb, though that's not at the front of our queue. Making them debuggable would also be important: once you allow this sort of thing it's bound to grow in complexity to the point where you'll need to be able to debug it. And we would be happy to support other languages, though at present we're tied to Clang as our expression front end…

Another way to do this would be to build a little code injection rig and insert it into the debuggee - for instance by having a dylib that does this job and calling "dlopen" from the debugger to insert it. Then when you wanted to insert some new code, you would put it in a file, and call some "MyRigInsertCode" type routine in your dylib. Then that would make a little bundle, stuff it into the target, tell the loader about it (which would tell the debugger about it) and then those new functions would be available in further expressions you write from the debugger.
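A session following that sketch might look something like this (MyRigInsertCode, the dylib path, and the inserted function name are all hypothetical, taken from the description above rather than any real API; 2 is the value of RTLD_NOW on both Darwin and Linux):

(lldb) expr (void *) dlopen("/path/to/myrig.dylib", 2)
(lldb) expr (int) MyRigInsertCode("/tmp/new_code.src")
(lldb) expr newlyInsertedFunction(someExistingValue)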

Jim

To make this be an “all lldb” solution, some parts of the way that lldb does expression evaluation would have to be enhanced. Right now, for instance, we compile, JIT and evaluate expressions, and then clean up after ourselves completely. We don’t insert code and, for instance, put it in a nice bundle with symbols et al and tell the linker about it so anybody else could reuse it. We just inject a raw blob of code, set the pc somewhere and go. So even if we didn’t clean up the injected code once the execution is complete, there isn’t any way to reuse the functionality you’ve compiled up. Actually this is a bit of a lie, there are a couple of fairly special purpose mechanisms to do this (ClangFunction, and ClangUtilityFunction) but you have to be inside lldb to use them…

Sorry, I’m confused here; please clarify if you can. In chapters 3 and 4 of the Kaleidoscope LLVM tutorial, we define and JIT-compile functions on the fly that are then reusable many, many times in subsequently defined functions. I am assuming that I would have to provide a parser, generate ASTs, and call the LLVM functions as Kaleidoscope does in chapter 4, but how is the code preserved in that case? Is this a different mechanism from, or the same mechanism as, the one the Expression classes use, as described in http://lldb.llvm.org/architecture.html :

Once expressions have been compiled into an AST, we can then traverse this AST and either generate a DWARF expression that contains simple opcodes that can be quickly re-evaluated each time an expression needs to be evaluated, or JIT’ed up into code that can be run on the process being debugged.

We also don’t generate debug information, 'cause in this context it wouldn’t make sense to. And there isn’t a way to say “evaluate this expression in language X” if X is not C/C++/ObjC…

I agree 100% that debug info would be totally necessary. In fact, it would probably become necessary to extend DWARF to preserve the full text of the source, so that the source line numbers can be referred to. (Or is there some preexisting mechanism to do this?)

The AOT compiler that I want to use is for the D programming language (http://www.digitalmars.com/d/), and already uses LLVM to generate ELF code and DWARF information on Linux (https://bitbucket.org/prokhin_alexey/ldc2/wiki/Home is the home of the compiler, which is not bug free but very complete).

We certainly want to come up with a clean way to reuse functions defined in “expressions” from lldb, though that’s not at the front of our queue. Making them debuggable would also be important: once you allow this sort of thing it’s bound to grow in complexity to the point where you’ll need to be able to debug it. And we would be happy to support other languages, though at present we’re tied to Clang as our expression front end…

Right. I would have to provide the expression parsing and so forth, but that is already done. Perhaps a minimal hook from LLDB (what’s the right interface? I haven’t read through the Expression code yet) that tells the inferior process to CompileString("sort(preexistingCollection, function(a,b) { a < b })"), effectively doing the compilation in the inferior. Or the JIT could be done on the LLDB side by customizing it. It just isn’t clear to me which side of the debugger/debuggee interface the JIT most effectively runs on. Or does it matter?

Another way to do this would be to build a little code injection rig and insert it into the debuggee - for instance by having a dylib that does this job and calling “dlopen” from the debugger to insert it. Then when you wanted to insert some new code, you would put it in a file, and call some “MyRigInsertCode” type routine in your dylib. Then that would make a little bundle, stuff it into the target, tell the loader about it (which would tell the debugger about it) and then those new functions would be available in further expressions you write from the debugger.

Yes, that’s it! I think we’re thinking along exactly the same lines. Since I ideally want to include LLVM-based JIT-compiler functionality in some sort of library for D (in order to be able to compile at runtime), this could be the right path: two birds with one stone.

Comments welcome.

Thanks guys.

Jason

Sorry if I was unclear. We also JIT-compile and insert functions that are theoretically reusable many, many times; the trick is how to get back to them again - especially from the parser. Right now, functions that get defined in the course of an expression don't get put back into the function lists we use to provide name lookup for the parser, and we don't tell the linker about them, so the debugger doesn't find out about them through the normal course of things. So you could make a function and use it in one pass through the Expression classes, but if you wanted to reuse the results of that Expression in another, we'd have to do more work to have the parser find it the second time round.

Also, as a practical matter, the Command-Line interfaces to the expression parser are all one-shot deals, the code is JIT'ted and inserted, run and then removed. Since the most common use of the expressions is stuff like:

(lldb) expr printSomething (pointer_to_print)

or whatever, you don't want to keep the code for that sitting around in the target… So there would have to be a flag to the command saying "this expression makes stuff I want you to preserve."
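For instance, with a purely hypothetical --preserve flag (invented here to illustrate the idea, not an existing lldb option), such a session might look like:

(lldb) expr --preserve -- int isShorter(const char *a, const char *b) { return strlen(a) < strlen(b); }
(lldb) expr isShorter("hi", "hello")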

That's all I meant.

Jim