Stack unwinding with hand-crafted CFI

(this is a followup to <Login)

Hello Jason,

I am trying to get stack unwinding to work from within the deep bowels
of glibc. Glibc contains a lot of functions (and "functions") which do
crazy stuff with the stack, frame and instruction pointers. Luckily,
most (all?) of these functions contain CFI which generally does "the
right thing". Unfortunately, these UnwindPlans are sometimes not
picked by LLDB because they are too strange.

For example, take the "function" _L_lock_4746, which has the following
disassembly:
$ disassemble -n _L_lock_4746
libc.so.6`_L_lock_4746:
    0x7ffff70519bc <+0>: leaq (%rdx), %rdi
    0x7ffff70519bf <+3>: subq $0x80, %rsp
    0x7ffff70519c6 <+10>: callq 0x7ffff70df120 ; __lll_lock_wait_private
    0x7ffff70519cb <+15>: addq $0x80, %rsp
    0x7ffff70519d2 <+22>: jmp 0x7ffff70515fa ; _IO_new_file_underflow + 138 at fileops.c:592

And the following unwind plans:
$ image show-unwind -n _L_lock_4746
UNWIND PLANS for libc.so.6`_L_lock_4746 (start addr 0x7ffff70519bc)

Asynchronous (not restricted to call-sites) UnwindPlan is 'assembly insn profiling'
Synchronous (restricted to call-sites) UnwindPlan is 'eh_frame CFI'

Assembly language inspection UnwindPlan:
This UnwindPlan originally sourced from assembly insn profiling
This UnwindPlan is sourced from the compiler: no.
This UnwindPlan is valid at all instruction locations: yes.
Address range of this UnwindPlan: [libc.so.6..text + 374044-0x000000000005b537)
row[0]: 0: CFA=rsp +8 => rsp=CFA+0 rip=[CFA-8]
row[1]: 10: CFA=rsp+136 => rsp=CFA+0 rip=[CFA-8]
row[2]: 22: CFA=rsp +8 => rsp=CFA+0 rip=[CFA-8]

eh_frame UnwindPlan:
This UnwindPlan originally sourced from eh_frame CFI
This UnwindPlan is sourced from the compiler: yes.
This UnwindPlan is valid at all instruction locations: no.
Address range of this UnwindPlan: [libc.so.6..text + 374044-0x000000000005b537)
row[0]: 0: CFA=rsp-128 => rip=dwarf-expr
row[1]: 3: CFA=rsp-128 => rip=dwarf-expr
row[2]: 10: CFA=rsp +0 => rip=dwarf-expr
row[3]: 14: CFA=rsp+128 => rip=dwarf-expr
row[4]: 22: CFA=rsp-128 => rip=dwarf-expr

Arch default UnwindPlan:
This UnwindPlan originally sourced from x86_64 default unwind plan
This UnwindPlan is sourced from the compiler: no.
This UnwindPlan is valid at all instruction locations: no.
row[0]: 0: CFA=rbp+16 => rbp=[CFA-16] rsp=CFA+0 rip=[CFA-8]

Arch default at entry point UnwindPlan:
This UnwindPlan originally sourced from x86_64 at-func-entry default
This UnwindPlan is sourced from the compiler: no.
This UnwindPlan is valid at all instruction locations: not specified.
row[0]: 0: CFA=rsp +8 => rsp=CFA+0 rip=[CFA-8]

The "eh_frame CFI" plan is hand-crafted to be valid at all locations
within the function, and if I force its use it actually produces the
correct backtrace. However, the problem is that it is not selected by
default. This happens because our attempt to augment it fails:
UnwindAssembly_x86::AugmentUnwindPlanFromCallSite expects the CFI plan
to be in a very specific form. We then end up falling back to
"assembly insn profiling", which is completely bogus in this case.
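
To make the dumps above easier to read: each row applies from its
offset until the next row begins. Below is a minimal sketch of that
lookup. UnwindRow and RowForOffset are hypothetical names made up for
this illustration, not LLDB's actual UnwindPlan classes, and only the
CFA offset from rsp is modelled.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for one unwind row (not LLDB's type).
struct UnwindRow {
  uint32_t offset;     // byte offset into the function where the row starts
  int32_t cfa_offset;  // the rule "CFA = rsp + cfa_offset"
};

// Returns the last row whose offset is <= pc_offset, or nullptr if no row
// covers that offset. Assumes rows are sorted by ascending offset.
const UnwindRow *RowForOffset(const std::vector<UnwindRow> &rows,
                              uint32_t pc_offset) {
  const UnwindRow *best = nullptr;
  for (const UnwindRow &row : rows) {
    if (row.offset > pc_offset)
      break;
    best = &row;
  }
  return best;
}
```

With the eh_frame rows printed above (offsets 0, 3, 10, 14, 22), a
lookup for offset 15, the addq after the call, selects row[3], i.e.
CFA=rsp+128.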

So, I would like to teach LLDB to use the provided CFI for unwinding
in strange functions like these. However, I am unsure what the
cleanest solution is. I would probably need to short-circuit the
augmenting logic to avoid trying to augment plans like these and just
use the CFI plan as-is. Do you have suggestions on how to do this?
Perhaps set plan-valid-at-all-locations to true if we detect the
author has gone through extra trouble to produce the CFI (use "rip set
by dwarf-expr", or some more complex condition as an indicator)? Then
we could use the presence of this flag as an indicator that we can
avoid augmentation?
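
A rough sketch of the "rip set by dwarf-expr" indicator mentioned
above. The types and the function name are hypothetical and invented
for illustration; LLDB's real row and register-location classes look
different, and a real check may well need a more complex condition.

```cpp
#include <vector>

// Hypothetical classification of how a row recovers the caller's rip.
enum class RipRule { Unspecified, AtCFAPlusOffset, DwarfExpression };

// Hypothetical, simplified CFI row: where it starts and how rip is found.
struct CfiRow {
  unsigned offset;
  RipRule rip_rule;
};

// Heuristic (an assumption, not LLDB behaviour): treat the plan as
// hand-written if any row locates rip through a DWARF expression.
bool LooksHandWritten(const std::vector<CfiRow> &rows) {
  for (const CfiRow &row : rows) {
    if (row.rip_rule == RipRule::DwarfExpression)
      return true;
  }
  return false;
}
```

For the eh_frame plan of _L_lock_4746 shown earlier, every row has
rip=dwarf-expr, so a check of this kind would flag it.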

What do you think?

cheers,
pavel

The main problem is 99% of all the EH frame info is valid only at call sites. Because of this we don't use EH frame in the first frame and we don't use it after async interrupt functions like sigtramp. We have no way of marking EH frame as being valid for every PC in a function. If we try to use an augmentation letter in the CIE, then all consumers (unwind libraries for the operating system, debuggers, etc) of the augmentation letters need to know about the new augmentation letter and if they don't they get really unhappy.

We really do need to mark the CIEs or FDEs individually so we know which ones we can trust and which we don't. We really can't just say "trust all of glibc", we really need each one marked somehow, but we don't have a way to do this.

I don't have any good solutions.

Greg

Forgot to add the list.

Hello,

> The main problem is 99% of all the EH frame info is valid only at call sites. Because of this we don't use EH frame in the first frame and we don't use it after async interrupt functions like sigtramp.

Small clarification on Greg's summary here. Since last summer, we've started using eh_frame for the currently-executing frame. Both gcc and clang were generating eh_frame which described the function prologues, and modern versions of gcc emit eh_frame which describes the epilogue as well. So Tong Shen, an intern at Google, added code to add epilogue instructions if it looked like the eh_frame was missing them.

There's no guarantee that eh_frame unwind instructions will describe the prologue/epilogue, or that it will describe any important changes in the middle of the function except at places where the func can throw, or a callee can throw, but in practice the prologue at least is present. So we decided to try living off eh_frame for the currently executing frame, and it seems to be working OK.

> What I suggest is to change the logic to something like this:
> eh := GetEHFrameUnwindPlan();
> if (I_know_how_to_augment(eh))
>     return Augment(eh);
> else if (Plan_is_very_complicated(eh))
>     // If the EH section contains a complicated plan, then our
>     // attempts to do instruction profiling will likely fail. Let's
>     // just stick to the EH plan instead.
>     return eh;

This sounds like a good idea. The augmentation detection code in UnwindAssembly-x86.cpp is kind of hacky (and it's done at two different layers in different ways) -- my main concern here was modifying a correct set of unwind instructions from eh_frame by adding incorrect garbage to it. So it goes out of its way to limit the augmentation to functions that look "straightforward".
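
The suggested selection order could be sketched roughly as follows.
Everything here is a hypothetical stand-in: Plan, CanAugment,
IsComplicated, Augment and ChooseAsyncPlan are made-up names, the two
predicates are reduced to flags, and the final fallback to the
assembly-profiling plan is an assumption based on the current
behaviour described earlier in the thread.

```cpp
#include <string>

// Hypothetical, simplified unwind plan description (not LLDB's UnwindPlan).
struct Plan {
  std::string source;   // e.g. "eh_frame CFI"
  bool simple_shape;    // stand-in for "I know how to augment this"
  bool has_dwarf_expr;  // stand-in for "this plan is very complicated"
};

bool CanAugment(const Plan &eh) { return eh.simple_shape; }
bool IsComplicated(const Plan &eh) { return eh.has_dwarf_expr; }

// Placeholder for real augmentation: only tags the plan as augmented.
Plan Augment(Plan eh) {
  eh.source += " (augmented)";
  return eh;
}

// eh may be null when the function has no eh_frame entry at all.
Plan ChooseAsyncPlan(const Plan *eh, const Plan &assembly_profile) {
  if (eh != nullptr) {
    if (CanAugment(*eh))
      return Augment(*eh);
    if (IsComplicated(*eh))
      return *eh;  // use the hand-written CFI as-is
  }
  return assembly_profile;  // assumed fallback for everything else
}
```

The point of the ordering is that a complicated eh_frame plan is
never modified: it is either used untouched or not used at all.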

I had a similar problem with the Objective-C runtime on Mac OS X. It has several hand-written assembly functions that the assembly instruction profiler cannot handle correctly. The maintainer of that solib added hand-written eh_frame instructions for them, and I added the DynamicLoader::AlwaysRelyOnEHUnwindInfo method for this case. The unwinder will call the DynamicLoader and say "should I trust the eh_frame for this solib?" and the MacOSX DynamicLoader plugin will say yes for libobjc functions. (This was back when we never used eh_frame instructions for the currently-executing frame.)

That would be another way to handle this -- to whitelist the libc solib on Linux. But I think your idea of flagging eh_frame which looks hand-written/complicated is a good one too.
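
A minimal sketch of what such a whitelist check might look like. The
function below is hypothetical and standalone; it is not the real
DynamicLoader::AlwaysRelyOnEHUnwindInfo signature, and matching on
the module's base name "libc.so.6" (taken from the dump earlier in
the thread) is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical helper: should the unwinder trust eh_frame unconditionally
// for functions that live in the module with this base name?
bool AlwaysTrustEHFrameFor(const std::string &module_basename) {
  // Assumed whitelist; a real plugin would decide what belongs here.
  static const char *const kTrustedModules[] = {"libc.so.6"};
  for (const char *name : kTrustedModules) {
    if (module_basename == name)
      return true;
  }
  return false;
}
```

The trade-off against the per-plan heuristic is granularity: a
whitelist trusts every eh_frame entry in the library, including
compiler-generated ones that may only be valid at call sites.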

J

Hello Jason,

thank you for the clarification. I'm currently preempted by
other stuff, but when I get back to this, I will try to implement one
of the two solutions you suggest here.

cheers,
pl