Crash in libunwind


We have been investigating a crash in our application that may be related to how stack frames are generated by the JIT. We observe it with LLVM 2.9, but not with LLVM 2.8, everything else being the same. The crash occurs when dynamically generated code calls code that tries to unwind the stack.

Here is what the stack trace looks like on Mac OS X 10.6:

0 libSystem.B.dylib 0x00007fff87297bdb libunwind::CFI_Parser<libunwind::LocalAddressSpace>::parseCIE(libunwind::LocalAddressSpace&, unsigned long long, libunwind::CFI_Parser<libunwind::LocalAddressSpace>::CIE_Info*) + 75
1 libSystem.B.dylib 0x00007fff87298795 libunwind::CFI_Parser<libunwind::LocalAddressSpace>::decodeFDE(libunwind::LocalAddressSpace&, unsigned long long, libunwind::CFI_Parser<libunwind::LocalAddressSpace>::FDE_Info*, libunwind::CFI_Parser<libunwind::LocalAddressSpace>::CIE_Info*) + 149
2 libSystem.B.dylib 0x00007fff8719d928 libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::setInfoBasedOnIPRegister(bool) + 312
3 libSystem.B.dylib 0x00007fff8719e348 libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::step() + 216
4 libobjc.A.dylib 0x00007fff8852c4b6 objc_addExceptionHandler + 828
5 0x00007fff82399722 _CFDoExceptionOperation + 402
6 0x00007fff887e9989 _NSAppKitLock + 79

The frames above these vary from crash to crash, but we usually see some window-management code and some of our own code, with some dynamically generated code higher in the stack. Our application can run reliably for hours, and runs correctly under Valgrind as far as we know, as long as we avoid the cases where dynamically generated code invokes window-management functions.

We have tried with NoFramePointerElim both set and cleared; it doesn't seem to make a difference.

Has anybody else run into a similar stack trace? I have not found much through Google, but there is a MacRuby ticket with a very similar stack trace. It was closed as "worksforme", but the original poster indicated it was still failing for him, although only on one particular machine. This suggests the problem may depend on the specific version of LLVM, just as it does for us.

Can anybody suggest ideas on how to investigate this further? Is there any instrumentation in LLVM or libunwind worth activating?


This may be bogus, but do you have:

llvm::JITExceptionHandling = true;

for the code that generates the dynamic code?

It has been a while, and I don't recall what happens when dynamic code, generated
with JIT exception handling turned off, invokes libraries which in turn try to unwind the stack
via the libunwind API. However, given that you say the code works with 2.8, my concern may be
irrelevant. Personally, I have a feeling that with llvm::JITExceptionHandling turned off, the
unwinding should correctly "pass through" the generated code. With llvm::JITExceptionHandling turned
on, a potential crash could occur if the personality function associated with the generated code
is not behaving properly.

Hopefully others will respond to correct my thoughts and therefore help you out. :) I'm guessing
more data will be needed though.



Thanks. Yes, we have JITExceptionHandling set. Specifically:

    JITExceptionHandling = true;
    JITEmitDebugInfo = true;
    UnwindTablesMandatory = true;
    NoFramePointerElim = true;

Also, I was a bit hasty in saying that it works in 2.8. Apparently it does happen there too, just much less frequently.

Interestingly, the problem seems to go away with JITExceptionHandling = false. We need to confirm that the problem is eliminated and not just displaced.
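
One way to check whether the problem is eliminated rather than displaced might be to force a stack walk from a function that generated code calls directly, so the unwinder has to step through JIT frames without any window-management code involved. The following is only a minimal sketch under that assumption: `WalkState`, `visit_frame` and `walk_stack` are hypothetical names, not part of our application or of LLVM; only `_Unwind_Backtrace` and `_Unwind_GetIP` come from `<unwind.h>`.

```cpp
#include <unwind.h>
#include <cstdio>

// Hypothetical bookkeeping for one stack walk.
struct WalkState {
    int  frames;      // frames visited so far
    int  max_frames;  // safety cap so a corrupt chain cannot loop forever
    bool verbose;     // print each frame's instruction pointer to stderr
};

// Callback invoked by _Unwind_Backtrace once per frame.
static _Unwind_Reason_Code visit_frame(struct _Unwind_Context *ctx, void *arg) {
    WalkState *state = static_cast<WalkState *>(arg);
    if (state->verbose) {
        std::fprintf(stderr, "frame %d: ip=%p\n", state->frames,
                     reinterpret_cast<void *>(_Unwind_GetIP(ctx)));
    }
    state->frames++;
    // Any value other than _URC_NO_REASON stops the walk.
    if (state->frames >= state->max_frames)
        return _URC_END_OF_STACK;
    return _URC_NO_REASON;
}

// C linkage so that generated code can reference the symbol by its plain name.
// Returns the number of frames the unwinder managed to visit.
extern "C" int walk_stack(int verbose) {
    WalkState state = {0, 256, verbose != 0};
    _Unwind_Backtrace(visit_frame, &state);
    return state.frames;
}
```

If generated code calls `walk_stack(1)` and the crash reproduces with JITExceptionHandling = true, the last address printed should indicate the frame whose unwind information the parser failed on. If the walk simply stops at the first JIT frame with JITExceptionHandling = false, that would be consistent with the problem being avoided rather than fixed.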