post-mortem debugging with lldb

Hello,

I am trying to get KDE's crash reporter ("DrKonqi") to work with lldb. This is part of KDE's runtime, and is designed to catch runtime exceptions (crashes). When one occurs, a debugger is launched, instructed to attach to the failing process, and then asked to provide a backtrace.
Much like Apple's crash reporter, but more elaborate.

On systems with the GNU toolchain, the debugger used is gdb, launched as

gdb -nw -n -batch -x %tempfile -p %pid %execpath

Where %pid is the crashed process ID, %execpath the full path to the binary, and %tempfile a batch file containing the commands

set width 200
thread
thread apply all bt

I currently have the following for using lldb:

lldb -s %tempfile -p %pid < /dev/null
%tempfile:
set set term-width 200
thread info
bt all
quit

The redirection from /dev/null is necessary because otherwise lldb will not respect the quit command (when read from a batch file; that must be a bug?)
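For reference, the invocation above could be driven from a script roughly as follows. This is only a sketch in Python rather than DrKonqi's actual code: the helper names (write_command_file, build_lldb_argv, collect_backtrace) are made up for illustration, and it assumes an lldb binary on PATH. The stdin=subprocess.DEVNULL argument plays the role of the "< /dev/null" redirection.

```python
import os
import subprocess
import tempfile

# Commands copied from the %tempfile listing above.
LLDB_COMMANDS = ["set set term-width 200", "thread info", "bt all", "quit"]


def write_command_file(commands=LLDB_COMMANDS):
    """Write the lldb commands to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".lldb")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(commands) + "\n")
    return path


def build_lldb_argv(command_file, pid):
    """Return the argv equivalent of: lldb -s %tempfile -p %pid"""
    return ["lldb", "-s", command_file, "-p", str(pid)]


def collect_backtrace(pid):
    """Attach lldb to a live process and return its combined output."""
    command_file = write_command_file()
    try:
        result = subprocess.run(
            build_lldb_argv(command_file, pid),
            stdin=subprocess.DEVNULL,  # same role as "< /dev/null"
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        return result.stdout
    finally:
        os.remove(command_file)
```

Calling collect_backtrace(pid) with the crashed process ID would then return whatever lldb printed, including any attach errors, since stderr is merged into stdout.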

In itself this works, except that I do not always get useful information. A full discussion can be found at https://git.reviewboard.kde.org/r/121286/ , and it shows an example where one can only "see back to when" the crash handler was invoked.
Sadly I cannot compare because I haven't (yet) been able to get gdb (from MacPorts) to be permitted to function on my 10.9 system, but the same system booted into 10.6.8 will work just fine with Apple's gdb.

Any useful suggestions on how best to do post-mortem debugging with lldb will be greatly appreciated!

René

I currently have the following for using lldb:

lldb -s %tempfile -p %pid < /dev/null
%tempfile:
set set term-width 200
thread info
bt all
quit

The redirection from /dev/null is necessary because otherwise lldb will not respect the quit command (when read from a batch file; that must be a bug?)

It's worth noting that Jim added a real batch mode to lldb a month ago:

r219654 | jingham | 2014-10-13 18:20:07 -0700 (Mon, 13 Oct 2014) | 12 lines

This adds a "batch mode" to lldb kinda like the gdb batch mode. It will quit the debugger
after all the commands have been executed except if one of the commands was an execution control
command that stopped because of a signal or exception.

This is in the svn repository but hasn't made it into an Apple release of lldb yet. There have been other batch-mode fixes since then -- I would not be surprised if your code works correctly with the current TOT without redirecting from /dev/null.
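With a build that includes that change, the invocation might reduce to something like the sketch below. It assumes the option is spelled --batch in the lldb being used, and build_batch_argv is a hypothetical helper name. The command file would then no longer need a trailing "quit", and the /dev/null redirection may become unnecessary, though that is untested here.

```python
def build_batch_argv(command_file, pid):
    """Return argv for a batch-mode run: lldb --batch -s %tempfile -p %pid.

    Assumes an lldb new enough to have batch mode; older releases
    will reject the --batch option.
    """
    return ["lldb", "--batch", "-s", command_file, "-p", str(pid)]
```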

In itself this works, except that I do not always get useful information. A full discussion can be found at https://git.reviewboard.kde.org/r/121286/ , and it shows an example where one can only "see back to when" the crash handler was invoked.

The short backtrace in that discussion is a tricky one -- _sigtramp followed by objc_msgSend. Both of these can be difficult for the unwinder to backtrace out of: _sigtramp because the register context is saved out-of-band by the kernel and we rely on accurate eh_frame instructions to find it, and objc_msgSend because it is hand-written assembly with some hand-written eh_frame instructions that are accurate at many -- but not all -- points in the function.

I have seen some edge cases on Mavericks (Mac OS X 10.10) where _sigtramp unwinding is not completely accurate. It's on my todo list to figure out what's going on there.

If your process was at a location in objc_msgSend that did not have accurate eh_frame unwind descriptions, that would also account for this.

I think it will be difficult to hit this backtrace again; it is likely to be rare. Your process of attaching and collecting information looks reasonable to me.

I gather you're scraping the output of lldb for information about the crash. This can be a problem as the debugger output changes over time ... if I were writing a tool like this, I would probably write it in Python using the SB API that lldb supports. lldb is actually a debugger *library* and the lldb command line program is one client of that library (Xcode is another). You can write a Python script (or C++ program) that uses the library to attach to the process, iterate over the threads, print the backtrace information you want, etc.

It's probably more work than you want to do right now but for long-term maintainability, it would be the way to go.

J

What does TOT stand for? Top Of the Tops? :)

Sorry, head, top of tree, tip of tree. The trunk branch in svn.

I have seen some edge cases on Mavericks (Mac OS X 10.10) where _sigtramp unwinding is not completely accurate. It's on my todo list to figure out what's going on there.

Mavericks (10.9) or Yosemite (10.10)?

Heh. I meant Yosemite 10.10. I noticed some odd behavior when looking at a backtrace with _sigtramp the other day on a 10.10 system. Basic backtracing looked correct but I thought one of the register save locations looked wrong.

It might be feasible though to replace lldb with a dedicated and very simple "klldb" frontend. Is there a tutorial/example somewhere online that could serve as a starting point?

examples/python/process_events.py is the most common example we show to people. There's a C++ example of starting up a debug session, adding a breakpoint, hitting the breakpoint and doing a backtrace in test/api/multiple-debuggers/multi-process-driver.cpp.

When I looked at this yesterday I didn't see an obvious way to attach to a process without having a Target -- and couldn't create a Target without an executable file. I need to go look at the API again but it wasn't super clear how to get started on your workflow where you're attaching to a PID.
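One possible starting point for the attach-by-PID workflow, as an untested sketch: create a target (from the executable path if it is known, otherwise an empty one) and attach through SBTarget.AttachToProcessWithID. Whether an empty target is accepted may depend on the lldb version, so treat that part as an assumption. The names format_frame and dump_backtraces are made up for illustration, and the lldb module is imported lazily so the formatting helper can be used without it.

```python
def format_frame(index, pc, module, function, source=None, line=0):
    """Format one stack frame as a single line of text."""
    location = " at {}:{}".format(source, line) if source and line else ""
    return "#{} 0x{:x} {}`{}{}".format(
        index, pc, module or "?", function or "?", location)


def dump_backtraces(pid, exec_path=""):
    """Attach to pid, return backtrace lines for every thread, then detach."""
    import lldb  # needs lldb's Python module on sys.path

    debugger = lldb.SBDebugger.Create()
    debugger.SetAsync(False)  # block until the attach has completed
    try:
        # Assumption: an empty path yields a usable, empty target.
        target = debugger.CreateTarget(exec_path)
        if not target.IsValid():
            raise RuntimeError("could not create a target")
        error = lldb.SBError()
        process = target.AttachToProcessWithID(
            debugger.GetListener(), pid, error)
        if not error.Success():
            raise RuntimeError("attach failed: {}".format(error.GetCString()))
        lines = []
        for thread in process:
            lines.append("thread #{} (tid {})".format(
                thread.GetIndexID(), thread.GetThreadID()))
            for frame in thread:
                entry = frame.GetLineEntry()
                lines.append(format_frame(
                    frame.GetFrameID(),
                    frame.GetPC(),
                    frame.GetModule().GetFileSpec().GetFilename(),
                    frame.GetFunctionName(),
                    entry.GetFileSpec().GetFilename(),
                    entry.GetLine()))
        process.Detach()  # leave the crashed process as we found it
        return lines
    finally:
        lldb.SBDebugger.Destroy(debugger)
```

Because the output format is produced by the script itself rather than scraped from the command-line driver, it would not change when lldb's own text output changes.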

A bit off-topic, but not irrelevant: if the lldb app is in fact just a frontend, is there no one who has had the idea to create a sort of llvm-gdb that couples the gdb user-interface to the lldb library? That would probably also help with the rather severe lack of (free) GUIs to lldb on platforms other than recent OS X versions.

Most IDEs/GUIs talk to gdb with the "gdb MI" interface. It's a simple key-value/array/dictionary markup language that you can use when driving the debugger - kind of like JSON, but it was created before JSON so it's different. There is an lldb-mi front end being developed to support these IDE/GUIs although I don't know the state of that driver program.
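To give a feel for the markup, here is a small sketch of a parser for an MI result record. It is deliberately incomplete: it ignores the optional numeric token before "^", handles only the common string escapes, and will raise on malformed input. The sample record in the usage note is illustrative, not captured from a real session, and all function names are made up.

```python
def _parse_cstring(text, pos):
    """Parse a double-quoted string starting at text[pos]."""
    pos += 1  # skip the opening quote
    chars = []
    while text[pos] != '"':
        if text[pos] == "\\":
            pos += 1
            chars.append({"n": "\n", "t": "\t"}.get(text[pos], text[pos]))
        else:
            chars.append(text[pos])
        pos += 1
    return "".join(chars), pos + 1


def _parse_value(text, pos):
    """Parse a string, a {tuple} or a [list] starting at text[pos]."""
    if text[pos] == '"':
        return _parse_cstring(text, pos)
    if text[pos] == "{":
        return _parse_tuple(text, pos)
    if text[pos] == "[":
        return _parse_list(text, pos)
    raise ValueError("unexpected character at offset {}".format(pos))


def _parse_result(text, pos):
    """Parse name=value and return (name, value, next position)."""
    eq = text.index("=", pos)
    value, pos = _parse_value(text, eq + 1)
    return text[pos - (pos - eq) - (eq - pos):eq] if False else None, value, pos


def _parse_named(text, pos):
    """Parse name=value, keeping the name that precedes the '='."""
    eq = text.index("=", pos)
    name = text[pos:eq]
    value, pos = _parse_value(text, eq + 1)
    return name, value, pos


def _parse_tuple(text, pos):
    """Parse {name=value,...} into a dict."""
    pos += 1  # skip "{"
    fields = {}
    while text[pos] != "}":
        name, value, pos = _parse_named(text, pos)
        fields[name] = value
        if text[pos] == ",":
            pos += 1
    return fields, pos + 1


def _parse_list(text, pos):
    """Parse [...] whose items are either values or name=value pairs."""
    pos += 1  # skip "["
    items = []
    while text[pos] != "]":
        if text[pos] in '"{[':
            value, pos = _parse_value(text, pos)
            items.append(value)
        else:
            name, value, pos = _parse_named(text, pos)
            items.append((name, value))
        if text[pos] == ",":
            pos += 1
    return items, pos + 1


def parse_result_record(line):
    """Parse '^class,name=value,...' into (class, dict of results)."""
    if not line.startswith("^"):
        raise ValueError("not a result record: {!r}".format(line))
    result_class, _, rest = line[1:].partition(",")
    fields = {}
    pos = 0
    while pos < len(rest):
        name, value, pos = _parse_named(rest, pos)
        fields[name] = value
        if pos < len(rest) and rest[pos] == ",":
            pos += 1
    return result_class, fields
```

For example, parse_result_record('^done,stack=[frame={level="0",func="main"}]') yields the class "done" and a dict whose "stack" entry is a list containing the pair ("frame", {"level": "0", "func": "main"}).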

Emulating the full gdb command line input/output would be a lot of work to make it behave 100% the same. And some things would be very difficult to make the same -- for instance, gdb has a hand-written C/C++ parser to do expression evaluation (p 5+3). lldb uses clang to do this - which means we behave more faithfully to the language spec but some things that aren't legal C/C++ are very useful to do in a debugger. e.g. "p main + 5".

I think adding an MI interface for lldb is a good idea -- but trying to emulate gdb's command line behavior would be a huge time sink for little benefit.

(For an IDE, even better than using the MI interface would be to use lldb's native SB APIs to drive it. It's a much cleaner and richer interface between the debugger library and the front end. Hopefully some day we'll see the big IDEs like Eclipse use lldb through the SB API, but until then, the MI is the easiest way to get them to work together.)

J

One aside -- parsing text from the debugger is never stable, not with gdb, not with lldb. A long time ago IDEs used to use the debugger and try to parse the output ("annotated mode" in gdb would be an early example of trying to make it more machine parseable for emacs). MI ("machine interface") was designed exactly for this reason -- so an IDE would have a stable protocol to use when communicating with the debugger.

Neither gdb nor lldb changes its output frequently -- but it does change over time. Any debugger scripting based on regexp-matching the output from these debuggers will break on occasion and will require maintenance.

J

Just to quickly butt in;

The lldb-mi driver can be found in tools/lldb-mi, and it works with Eclipse for debugging remote hosts. There are a few people contributing to it now, so hopefully more features are incoming.

Colin

Note that the only part of lldb that needs to be Apple code-signed to be convenient to use is the "debugserver" tool, which lives in LLDB.framework/Contents/Resources. So if you really need to distribute something with lldb libraries you have built yourself, you can just grab the debugserver from an official build and use that. lldb is expected to be able to work with a variety of debug servers, so that shouldn't cause any problems (short of bugs of course...)

Jim