Hiding trampoline functions from the backtrace, is it possible ?

When I am using bt to look at my backtrace, I get for a method call breakpoint in +[Hello printName:version:] a stacktrace like this (with my custom Objective-C runtime):

* frame #0: 0x00000000004179b3 test-debugger`+[Hello printName:version:](self=Hello, _cmd=<no value available>, _param=0x00007fffffffd8a8, name=<unavailable>, version=<unavailable>) at main.m:21:21
frame #1: 0x00000000004bb659 test-debugger`_mulle_objc_object_call_class_nofail(obj=0x000000000066a200, methodid=3009363030, parameter=0x00007fffffffd8a8, cls=0x000000000066a3e0) at mulle-objc-call.c:668:13
frame #2: 0x00000000004bbe60 test-debugger`_mulle_objc_object_call_class(obj=0x000000000066a200, methodid=3009363030, parameter=0x00007fffffffd8a8, cls=0x000000000066a3e0) at mulle-objc-call.c:939:18
frame #3: 0x00000000004bcb63 test-debugger`_mulle_objc_object_call_class_needcache(obj=0x000000000066a200, methodid=3009363030, parameter=0x00007fffffffd8a8, cls=0x000000000066a3e0) at mulle-objc-call.c:1320:13
frame #4: 0x00000000004bcf61 test-debugger`mulle_objc_object_call(obj=0x000000000066a200, methodid=3009363030, parameter=0x00007fffffffd8a8) at mulle-objc-call.c:1379:13
frame #5: 0x0000000000417a28 test-debugger`main(argc=1, argv=0x00007fffffffd9c8) at main.m:29:4

I have my Plugin/LanguageRuntime/ObjC/MulleObjC added to lldb and it is working fine for stepping through from “main” to “+[Hello printName:version:]” directly. Now I wonder whether there are provisions in lldb to extend this idea of trampoline hiding to stacktraces (preferably as an option), so that the stacktrace would look like this:

* frame #0: 0x00000000004179b3 test-debugger`+[Hello printName:version:](self=Hello, _cmd=<no value available>, _param=0x00007fffffffd8a8, name=<unavailable>, version=<unavailable>) at main.m:21:21
frame #5: 0x0000000000417a28 test-debugger`main(argc=1, argv=0x00007fffffffd9c8) at main.m:29:4

Ciao
Nat!

I think the best mechanism for this would be to ensure that the trampolines are marked up as DW_AT_artificial and/or DW_AT_trampoline by the compiler. I'm pretty sure LLDB then already knows how to hide artificial frames (somebody else can probably provide pointers for how that works).

-- adrian

At some point it would be good to add trampoline support at the Python level. You can produce scripted thread plans now - and the trampoline mechanism just returns a thread plan to step through the trampoline... It would be neat to be able to support other systems without having to build them into lldb. But that's off-topic. Glad you got that part working...

lldb can step through trampolines (and step back out again in some cases). But there isn't any support for suppressing the printing of frames.

I don't think it is a good idea for lldb to lie to the user and pretend that frames that do exist don't exist. But I think it's fine to have a mode where lldb suppresses printing some frames to reduce noise. As you showed in your example, the frame numbering would still indicate the presence of the frames, and presumably there would be a "bt --full" or something to show them all. But there isn't support for this at present.

One way to add this is to use the "Frame Recognizers" Kuba added to lldb recently. The use he had for them was to produce artificial variables for frames you don't have debug information for. But one of the other jobs I had envisioned for frame recognizers was to mark frames as uninteresting for printing. Then you could hook up "bt" to suppress frames that the recognizer marked this way. Since you can add recognizers in Python, this is a fairly attractive way to go, since people could adjust their printing to suppress frames not interesting to them. And of course, as Adrian suggested, recognizers could consult the debug info as well to suppress DW_AT_artificial and DW_AT_trampoline.

Xcode has a neat implementation of this stack compaction idea, where it keeps the first and last call into a library without debug information, and suppresses the ones in between. The first call is going to be the public API that your code called, so seeing it is helpful. But then the ones in between are generally internal implementation, and so not as interesting to users of the library. For that you'd have to have a "stack pattern" recognizer, not a frame-by-frame recognizer. So it wouldn't fit naturally into Kuba's work.
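As a rough illustration of what such a "stack pattern" pass might look like, here is a minimal Python sketch. It does not use the lldb API and is not how Xcode implements it; the Frame class, the module and function names, and compact_stack are all hypothetical. It assumes frame 0 is the innermost frame and treats each run of consecutive frames in the same module without debug information as one unit, keeping only the two ends of the run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    index: int            # original frame number, kept so gaps stay visible
    module: str           # hypothetical library or executable name
    function: str
    has_debug_info: bool


def compact_stack(frames: List[Frame]) -> List[Frame]:
    """Keep both ends of each run of consecutive frames that share a module
    and lack debug info; drop the frames in the middle of the run."""
    kept: List[Frame] = []
    i = 0
    while i < len(frames):
        frame = frames[i]
        if frame.has_debug_info:
            kept.append(frame)
            i += 1
            continue
        # Extend the run while the next frame is in the same module
        # and also has no debug info.
        j = i
        while (j + 1 < len(frames)
               and not frames[j + 1].has_debug_info
               and frames[j + 1].module == frame.module):
            j += 1
        kept.append(frames[i])
        if j > i:
            kept.append(frames[j])
        i = j + 1
    return kept


# Made-up stack: user code calls into libfoo, which calls back into user code.
stack = [
    Frame(0, "app", "on_event", True),
    Frame(1, "libfoo", "foo_internal_emit", False),
    Frame(2, "libfoo", "foo_internal_walk", False),
    Frame(3, "libfoo", "foo_internal_prepare", False),
    Frame(4, "libfoo", "foo_process", False),
    Frame(5, "app", "main", True),
]
compacted = compact_stack(stack)
```

Because the decision for frame 2 depends on its neighbours, this cannot be expressed as a predicate over a single frame, which is the point made above.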

It also isn't what you need, since you want to suppress all the recognized frames. So for your purposes, adding "should suppress" to the recognizers and using that info in backtraces should suffice.
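To make the per-frame variant concrete, here is a small Python sketch of a "should suppress" predicate applied while printing. It is only a model: it does not call into lldb, and DISPATCH_RE, should_suppress and format_backtrace are hypothetical names, with the pattern guessed from the function names in the backtrace at the top of the thread. The original frame numbers are preserved, so the gap between #0 and #5 still shows that frames were elided.

```python
import re
from typing import Callable, List

# Hypothetical pattern covering the dispatch functions in the example backtrace.
DISPATCH_RE = re.compile(r"^_?mulle_objc_object_call")


def should_suppress(function_name: str) -> bool:
    """Per-frame decision: is this frame uninteresting for printing?"""
    return DISPATCH_RE.match(function_name) is not None


def format_backtrace(functions: List[str],
                     suppress: Callable[[str], bool] = should_suppress,
                     show_all: bool = False) -> List[str]:
    """Return printable lines, skipping suppressed frames unless show_all."""
    lines = []
    for index, name in enumerate(functions):
        if not show_all and suppress(name):
            continue
        lines.append(f"frame #{index}: {name}")
    return lines


functions = [
    "+[Hello printName:version:]",
    "_mulle_objc_object_call_class_nofail",
    "_mulle_objc_object_call_class",
    "_mulle_objc_object_call_class_needcache",
    "mulle_objc_object_call",
    "main",
]
short_bt = format_backtrace(functions)
full_bt = format_backtrace(functions, show_all=True)
```

The show_all flag stands in for something like the "bt --full" idea mentioned earlier; the frames are never removed from the underlying list, only skipped when printing.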

Jim

> I think the best mechanism for this would be to ensure that the trampolines are marked up as DW_AT_artificial and/or DW_AT_trampoline by the compiler. I'm pretty sure LLDB then already knows how to hide artificial frames (somebody else can probably provide pointers for how that works).
>
> -- adrian

That's a good idea. I can just put __attribute__((artificial)) on my dispatch functions. That's the low-hanging-fruit code I like :)
And though it might not fully work yet with lldb, it may in the future.

Unfortunately clang complains that the "'artificial' attribute only applies to inline functions" (why?). Bummer.

Ciao
    Nat!

We've had many requests to elide some classes of entries in backtraces - like to mirror the Xcode display I mentioned previously. Most of these requests don't depend on the functions being marked artificial. So if we're going to do this, we'd want something more general than just "marked artificial" -> elided anyway.

Jim

Yes.

Having done a little further research... Artificial won't work for general cases anyway, since it's restricted to inline code (for some reason) on gcc and clang. I wonder why, since for a function the only real effect is to emit DW_AT_artificial (AFAIK). The restriction seems arbitrary and DWARF wouldn't mind. But the compilers do, so it seems out anyway.

DW_AT_trampoline isn't supported by llvm. As I read the description of DW_AT_trampoline, it's more like a hardcoded vector (a->b), so not useful for cases like objc_msgSend, where you don't know the destination a priori.

If I look at the DWARF spec, I don't see any other way to mark a function as "boring". I still think this would be a good thing, as this would be useful for other debuggers as well, which could instantly work. Also, a lot of code in the lldb Trampoline classes for step-in and step-out could probably just be removed.

Ciao
Nat!

I don't think that is right for "step-in". As you said above, in the classic example of a trampoline: objc_msgSend you can't statically know the destination. So the DWARF can't help resolve this; you would still need to do the work the lldb trampoline classes do at runtime.

step-out past trampolines could just "keep stepping past boring functions". There's no need to support this for ObjC - at least the Apple & NeXT versions - since the dispatch function is a tail call function. But we do do something like that for Swift. But that part is very little code compared to figuring out how to step in correctly.

Jim

From: lldb-dev [mailto:lldb-dev-bounces@lists.llvm.org] On Behalf Of Jim Ingham via lldb-dev
Sent: Tuesday, September 24, 2019 4:19 PM
To: Nat!
Cc: LLDB
Subject: Re: [lldb-dev] Hiding trampoline functions from the backtrace, is it possible ?

> Having done a little further research... Artificial won't work for general cases anyway, since it's restricted to inline code (for some reason) on gcc and clang. I wonder why, since for a function the only real effect is to emit DW_AT_artificial (AFAIK). The restriction seems arbitrary and DWARF wouldn't mind.. But the compilers do, so it seems out anyway.

Clang puts DW_AT_artificial on implicit member functions, which tend to be
inlined due to their simplicity; explicit functions marked as 'inline' in
the source would not be flagged as artificial. There's no direct link
between inline and artificial.

I grepped for "trampoline" in Clang source; it occurs only in comments,
and never with respect to functions spontaneously created by the compiler.
If they're there, they're called something else.
--paulr

In a pure world, debugger users would use "step-in" when they meant to step in, and step-over when they don't. However, it's actually pretty natural to say "step-in" and then just keep hitting return, and expect the debugger to trace through your code "doing the right thing".

When you've stopped at this line:

   printf("This really is a function you know.\n");

almost nobody wants to step into the code for printf. They want the debugger to turn their "step-in" into a "step-over". So you have to have some criteria for when users probably do mean "step-in". In lldb & gdb that starts with a simple "does it have debug info" test. If not, step-in -> step-over.

We've also added other criteria - for instance, by default we don't step into anything from std::, since there is lots of inlined code from std:: that has debug info but people still don't want to step into it.
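The two criteria described so far can be summarised in a few lines of Python. This is only a sketch of the decision, not lldb's implementation: Callee, AVOID_PREFIXES and step_in_action are hypothetical names, and real debuggers apply further rules on top.

```python
from dataclasses import dataclass


@dataclass
class Callee:
    name: str
    has_debug_info: bool


# Assumption for illustration: a prefix list standing in for the
# "don't step into std::" default described above.
AVOID_PREFIXES = ("std::",)


def step_in_action(callee: Callee) -> str:
    """Decide what a user's "step-in" should turn into for this callee."""
    if not callee.has_debug_info:
        # No debug info: the user almost never wants to land here.
        return "step-over"
    if callee.name.startswith(AVOID_PREFIXES):
        # Has debug info, but is on the avoid list.
        return "step-over"
    return "step-in"


printf_action = step_in_action(Callee("printf", has_debug_info=False))
vector_action = step_in_action(Callee("std::vector<int>::size", True))
method_action = step_in_action(Callee("+[Hello printName:version:]", True))
```

With these inputs printf and the std:: function are stepped over, while the user's own method is stepped into.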

But that means if there are any patterns on the system that pass through code without debug information but then end up in code that might have debug information, the debugger needs to intervene so that step-in continues to be step-in in those cases. Those patterns are what lldb calls Trampolines. For instance, cross-library call shims are trampolines, since you generally first step into the shim which is a symbol without debug information, then go through some loader code to resolve the symbol and then tail-call to the symbol. lldb is expected to automatically follow through the loader code into the real symbol. Similarly, objc_msgSend is a trampoline in this sense. And also, because we don't want to stop in general std:: code, the function calling layers that std::function goes through to invoke the function it holds also constitute a trampoline. Shafik added a trampoline handler for these not too long ago.

It isn't strictly a compiler notion, and I certainly didn't adopt the term from clang. I'm pretty sure that's what gdb calls it, or maybe I just made it up, I can't really remember.

Just marking a symbol somehow doesn't help lldb, in this regard, since it doesn't tell us how to get to the "real" target of the trampoline.
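A toy Python model of why a static mark is not enough: the same dispatch function leads to different targets depending on its runtime arguments, so the target has to be computed when the call happens. Everything here is hypothetical (the METHODS table, the World class, resolve_dispatch_target); it stands in for whatever lookup a real runtime performs.

```python
from typing import Optional

# A static "this function is boring" mark, as a symbol attribute might provide.
BORING_FUNCTIONS = {"mulle_objc_object_call"}

# Hypothetical runtime method table keyed by (class name, selector).
METHODS = {
    ("Hello", "printName:version:"): "+[Hello printName:version:]",
    ("World", "printName:version:"): "+[World printName:version:]",
}


def resolve_dispatch_target(receiver_class: str, selector: str) -> Optional[str]:
    """Find where the dispatch will land, using values only known at runtime."""
    return METHODS.get((receiver_class, selector))


# The mark says the dispatcher is uninteresting, but nothing about the target.
is_boring = "mulle_objc_object_call" in BORING_FUNCTIONS

# Same dispatcher, same selector, different receivers, different targets.
target_a = resolve_dispatch_target("Hello", "printName:version:")
target_b = resolve_dispatch_target("World", "printName:version:")
```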

But more pertinent to this conversation, I don't think the project of eliding some stack frames to make backtraces easier to read should be linked to the notion of the symbol being a trampoline, since it has much wider usage. If there were a system that marked frames for elision, it would be fine if "has DW_AT_artificial" was one of the ways to decide a frame should be elided. But that should just be one pattern we look for, not the way we implement the feature.

Jim

> I don't think that is right for "step-in". As you said above, in the classic example of a trampoline: objc_msgSend you can't statically know the destination. So the DWARF can't help resolve this; you would still need to do the work the lldb trampoline classes do at runtime.

I was thinking along the lines of the debugger examining the stack frame after a step and, if it is marked as artificial, continuing to do "stepi" until it hits a frame that isn't marked artificial. That would work for quite a bit of code (probably most of mine :). But I can see that the scheme would fail if the trampoline code needs to execute a stdlib function or some such (maybe on a cache miss).
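The scheme can be simulated in Python to show both the case where it works and the failure case. This is a model only: the trace is a made-up list of (function name, is_artificial) pairs, one per simulated "stepi", and step_through_artificial and the malloc call are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def step_through_artificial(trace: Iterable[Tuple[str, bool]],
                            max_steps: int = 1000) -> Tuple[Optional[str], int]:
    """Consume one (function, is_artificial) pair per simulated stepi and
    stop at the first frame that is not marked artificial."""
    steps = 0
    for function_name, is_artificial in trace:
        steps += 1
        if not is_artificial:
            return function_name, steps
        if steps >= max_steps:
            break
    return None, steps


# Works: every instruction on the way is inside a marked dispatch function.
good_trace = (
    [("mulle_objc_object_call", True)] * 3
    + [("_mulle_objc_object_call_class", True)] * 2
    + [("+[Hello printName:version:]", False)]
)
good_result = step_through_artificial(good_trace)

# Fails: on a cache miss the dispatcher calls an unmarked library function,
# and the loop stops there instead of at the method.
miss_trace = [
    ("mulle_objc_object_call", True),
    ("malloc", False),
    ("mulle_objc_object_call", True),
    ("+[Hello printName:version:]", False),
]
miss_result = step_through_artificial(miss_trace)
```

In the second trace the loop stops after two steps inside malloc, which is the failure mode described above.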

Ciao
Nat!

Right. If you watch the lazy binding that happens the first time a symbol is called - at least on macOS - the loader does a lot of work, and calls various standard library functions and helper functions in its own library that aren't in any real sense trampolines. Actually, objc_msgSend can do the same when first binding. And of course this wouldn't work at all for things like the std::function trampoline. Plus, remember that for the most part these "trampolines" like the loader symbol ones or the objc_msgSend ones all happen in code that most users don't have debug information for. So relying on debug information to handle this really isn't workable.

Also, single-stepping is not fast, particularly when debugging a remote device. It's really the round-trip time that costs, but with single-stepping you get lots of that. We got really significant speed-ups in stepping by having lldb run from branch to branch using breakpoints rather than single-stepping through the line. So a method which relies on single stepping - or even running from branch to branch - through potentially lots of code - is not a good strategy.
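A back-of-the-envelope model of the round-trip argument, in Python. It is a simplification and not a description of lldb's actual stepping algorithm: it assumes one stop per instruction when single-stepping, and one stop per branch instruction (plus one at the end of the range) when running from branch to branch. The instruction list and the 5 ms round-trip time are invented numbers.

```python
from typing import List


def stops_single_step(is_branch: List[bool]) -> int:
    """Single-stepping stops once per instruction."""
    return len(is_branch)


def stops_branch_to_branch(is_branch: List[bool]) -> int:
    """Simplified model: stop at each branch, plus once at the end of the
    range if the last instruction is not itself a branch."""
    stops = sum(1 for branch in is_branch if branch)
    if is_branch and not is_branch[-1]:
        stops += 1
    return stops


# Hypothetical source line: 20 instructions, with a single branch at index 9.
line = [False] * 9 + [True] + [False] * 10

ROUND_TRIP_MS = 5  # assumed latency to a remote device, for illustration only

single = stops_single_step(line)
branchy = stops_branch_to_branch(line)
single_ms = single * ROUND_TRIP_MS
branchy_ms = branchy * ROUND_TRIP_MS
```

Under these assumptions the line costs 20 round trips when single-stepped and 2 when run branch to branch, and the gap grows with the amount of straight-line code between branches.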

Jim