Here's some data I gathered today. I disabled Apple accelerator
table parsing in ObjectFileELF (the patch is in the attachment) and
then compared startup times when debugging lldb with itself.
Specifically, I used a Release (optimized) lldb to open a Debug lldb,
set a breakpoint, hit it, and display local variables and a
backtrace. The precise command is:
$ time opt/bin/lldb -o "br set -n DoExecute" -o "pr la" -o "bt" -o "fr var" -b -- dbg/bin/lldb -- /bin/ls
I ran the command three times and chose the median result.
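For anyone who wants to repeat the measurement, here is a rough sketch of the "run three times, take the median" procedure. The helper names (median_wall_time, lldb_command) are made up for illustration, the opt/bin/lldb and dbg/bin/lldb paths are the ones from the command above, and it only records wall-clock time, whereas the shell's time also reports user and system time.

```python
import statistics
import subprocess
import sys
import time


def median_wall_time(argv, runs=3):
    """Run argv `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the debugger's output; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def lldb_command(lldb="opt/bin/lldb", debuggee="dbg/bin/lldb"):
    """Build the argv for the measurement command (paths are assumptions)."""
    return [lldb,
            "-o", "br set -n DoExecute",
            "-o", "pr la",
            "-o", "bt",
            "-o", "fr var",
            "-b", "--", debuggee, "--", "/bin/ls"]
```

Calling median_wall_time(lldb_command()) from the build directory would then give one number per configuration (accelerator tables enabled or disabled).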
Before I show the measurements I have to give one big disclaimer: The
debugger with accelerator tables disabled does not appear to be fully
functional. Specifically most (all?) of the objc tests in the test
suite hang, and also 50 additional tests fail (the failures seem to be
related to expression evaluation, mostly). However, I think the fact
that the remaining ~2000 tests passed means that the DWARF parsing
functionality is reasonably intact.
Now, the numbers:
apple.diff (1.84 KB)
We could save this information to disk. I would suggest our Apple Accelerator table format! hahaha. Though in LLDB, we keep the modules loaded for the life of the debugger if they don’t change, so this would only speed up the scenario where you debug, quit the IDE, and then debug the same executable again.
We might need to introduce the notion of “compile, edit, debug” flows, where we concatenate the accelerator tables, and “archive” builds that will be archived on build servers, where we generate more complete info and merge the tables.
So the results might be wrong, because you can't debug /bin/ls and many commands might simply not be happening (as soon as you get an error, the remaining commands stop running). /bin/ls is Apple signed, and SIP (System Integrity Protection) will stop you from actually debugging it. Did you disable SIP?
That's not the case. The nested debugger gets stopped in
CommandObjectTargetCreate::DoExecute before it even touches the
/bin/ls file. I could have passed anything there (/bin/ls probably
wasn't the best choice, though); it's just that this was the easiest
thing I came up with for stopping at a place with a non-trivial
backtrace. I've verified that in each test scenario the (outer)
debugger prints out a reasonable backtrace and local variable values.
I wonder if making indexing multi-threaded has solved speed issues?
The main idea is to touch as few pages as possible when doing searches. We
effectively have this scenario right now with Apple DWARF in .o file
debugging. So much time is spent paging in each accelerator table that we
have very long delays starting up large apps. This would be more localized,
but there would be a similar issue. Concatenation would be fine for now if
we make it work, but for long-term archival, the real solution is to merge.
I think we should get better performance from concatenated tables than
from the .o file search on darwin -- the proximity (on disk and
in-memory) should make the os read-ahead work much better than when
the individual tables are scattered throughout the disk.
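To make the concatenated-versus-merged distinction concrete, here is a toy model. It uses plain Python dicts rather than the real on-disk Apple accelerator table format, all function names are hypothetical, and the "probes" count is only a crude stand-in for the number of tables (and therefore pages) a lookup has to touch.

```python
from collections import defaultdict


def lookup_concatenated(tables, name):
    """Probe each per-object-file table in turn.

    Returns (hits, probes); probes grows with the number of tables.
    """
    hits = []
    probes = 0
    for table in tables:
        probes += 1
        hits.extend(table.get(name, []))
    return hits, probes


def merge_tables(tables):
    """Fold all per-object-file tables into one name -> entries table."""
    merged = defaultdict(list)
    for table in tables:
        for name, entries in table.items():
            merged[name].extend(entries)
    return dict(merged)


def lookup_merged(merged, name):
    """A merged table answers any name with a single probe."""
    return list(merged.get(name, [])), 1


# Three made-up per-.o tables mapping names to (made-up) DIE locations.
tables = [
    {"main": ["a.o:0x10"]},
    {"DoExecute": ["b.o:0x20"], "helper": ["b.o:0x40"]},
    {"DoExecute": ["c.o:0x30"]},
]
```

In this model both lookups return the same entries for "DoExecute", but the concatenated lookup probes all three tables while the merged one probes a single table. Concatenation keeps the tables adjacent, which is where the read-ahead argument comes in, but it does not reduce the number of probes.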
If we're talking about the scenario of archiving debug info on
build servers or wherever, then something like dsymutil definitely
makes sense. The scenario I am optimizing for though is the
edit-compile-debug cycle of binaries built on your local machine.
There it doesn't matter much whether the indexing is done in the
debugger or the compiler, it just needs to be done quickly.
We could save this information to disk. I would suggest our Apple Accelerator table format! hahaha.
Yes, that would certainly make sense.