Looking for advice on how to debug a problem with C++ style exception handling code that my compiler generates.

Hi,

I’m looking for advice on how to debug a problem with my exception handling code.
I’m asking a specific question here - but general advice on how to debug something like this would be greatly appreciated.

Is there any way to get a list of landing pad clauses that are active at a particular point in a program?
I’d like to get something like a backtrace, but listing all active landing pad clauses, i.e. the typeids of the C++ types that they catch.
I’m trying to debug a problem where an exception that I’m throwing is not being caught.
I’m generating JITed code with LLVM and landing pads and I’ve got shared libraries - lots of things going on that could potentially be going wrong.

A list of the pointer values like @_ZTIN4core9DynamicGoE is what I’m looking for. Then I could compare that to the typeids that I know should be in that list.
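For comparing against such a list, the expected symbol names can be derived from the C++ side. The following is a minimal sketch, assuming an Itanium-ABI toolchain (GCC or Clang), where `type_info::name()` returns the mangled type name; `core::DynamicGo` here is an empty stand-in for the real class, and `typeinfoSymbol` and `typeinfoAddress` are hypothetical helper names.

```cpp
#include <string>
#include <typeinfo>

// Stand-in for the real class; only the qualified name matters for mangling.
namespace core { struct DynamicGo {}; }

// Assumption: Itanium C++ ABI. There, type_info::name() yields the mangled
// type name, and the typeinfo object's symbol is that name prefixed by "_ZTI".
template <typename T>
std::string typeinfoSymbol() {
  return std::string("_ZTI") + typeid(T).name();
}

// Address of the typeinfo object as seen from this image. If a shared library
// ends up with its own copy, the address printed there can differ.
template <typename T>
const void *typeinfoAddress() {
  return static_cast<const void *>(&typeid(T));
}
```

Printing `typeinfoSymbol<core::DynamicGo>()` from the main executable and from each shared library gives names that can be matched against the `catch i8* @_ZTI...` clauses in the IR, and `typeinfoAddress` gives the pointer values to compare.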

"(TRY-0).landing-pad": ; preds = %"(TRY-0).normal-dest14", %"(TRY-0).tagbody-B-1", %"(TRY-0).normal-dest10", %"(TRY-0).normal-dest9", %"(TRY-0).normal-dest8", %"(TRY-0).normal-dest", %"(TRY-0).tagbody-#:G1621-0"
%14 = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
catch i8* @_ZTIN4core9DynamicGoE
catch i8* @_ZTIN4core10ReturnFromE, !dbg !26
%15 = extractvalue { i8*, i32 } %14, 0, !dbg !26
%16 = extractvalue { i8*, i32 } %14, 1, !dbg !26
%17 = call i32 @llvm.eh.typeid.for(i8* @_ZTIN4core9DynamicGoE), !dbg !26
%18 = icmp eq i32 %16, %17, !dbg !26
br i1 %18, label %"(TRY-0).handler-block14470", label %"(TRY-0).dispatch-header19", !dbg !26

I’m getting this error when I throw a core::Unwind exception and I’m certain that there is a landing pad with that clause.

libc++abi.dylib: terminating with uncaught exception of type core::Unwind
…/…/src/gctools/memoryManagement.cc:75 Trapped SIGABRT - starting debugger
ABORT was called!!!

I’ve written a Common Lisp compiler that uses LLVM as the backend and it interoperates with C++ code and I use C++ exception handling for non-local exits.

Hi Christian,

I suspect that at least some of the details depend on what platform you’re working on. I believe that MCJIT attempts to register eh frame information for either MachO or ELF objects (though for some ELF platforms nothing actually happens). What happens to it after that is a darker area, at least for me.

Apparently there was a GDB command that did just what you want – “info catch” – but I had never used it and it has been removed. It’s too bad because it sounds like a nice feature. It was supposed to dump a list of catch handlers for whatever frame you’re looking at. I suspect, however, that it would have just confirmed that your catch handler isn’t properly hooked up without being helpful in figuring out why.

You could try debugging the RuntimeDyld code that registers eh frames and see if that looks right. RuntimeDyld::registerEHFrames() might be a helpful starting point.

-Andy

Hi Christian,

Andy’s already covered the major points, but you could consider filing a bug at http://llvm.org/bugs too, especially if you’ve got a small test-case that demonstrates the issue. Exception handling hasn’t been a priority in the past, but as more people adopt LLVM’s JIT APIs I suspect it will get more attention, and bug reports will help us figure out what needs doing.

Cheers,
Lang.

Dear Andrew,

Thank you very much. A command like “info catch” would be very handy right now.

I put a breakpoint on RuntimeDyld::registerEHFrames and on entry the registers read as shown below. It doesn’t matter whether I compile the example with the old compiler (which generates code that works) or the new compiler (which generates code that terminates when the exception is thrown): “rdx” == 0x0 in both cases. So I don’t think that is the source of the problem.

I’m narrowing down on the problem by hacking my compiler to load a module from a bitcode file rather than generate a new module and systematically changing the new generated code into the old generated code.

On entry to RuntimeDyld::registerEHFrames
       rax = 0x00007f8518441c70
       rbx = 0x00007f8518441ba0
       rcx = 0x00007f8519923600
       rdx = 0x0000000000000000 <<<< argument 3 Size
       rdi = 0x00007f8518441bd8 <<<< argument 1 Addr
       rsi = 0xffffffffffffffff <<<< argument 2 LoadAddr
       rbp = 0x00007fff5625e390
       rsp = 0x00007fff5625e2a8
        r8 = 0x00000000000003ff
        r9 = 0x00007f8519921600
       r10 = 0x00000001139d8000
       r11 = 0x0000000000000000
       r12 = 0x00007fff5625e2e0
       r13 = 0x00007f8518441bd8
       r14 = 0x00007f8518441c50
       r15 = 0x00007f8518441b10
       rip = 0x000000010b1c1bb0 clasp_boehm_o`llvm::RuntimeDyld::registerEHFrames()
    rflags = 0x0000000000000257
        cs = 0x000000000000002b
        fs = 0x0000000000000000
        gs = 0x0000000017da0000

void llvm::RTDyldMemoryManager::registerEHFrames(uint8_t *Addr, uint64_t LoadAddr, size_t Size) [override, virtual]

It appears that I’ve found a very curious effect where if I JIT a function that throws an exception and I use “call” to call it the throw fails despite there being a “catch” clause on the stack. But if I use “invoke” it works fine!

If I call the function (@cc_invoke) that throws a “core::CatchThrow” exception like this:


call void @cc_invoke({ {}*, i64 }* %result-ptr, {}* %3, i64 2, {}* %4, {}* %5, {}* null, {}* null, {}* null)
;; Comment out the next two lines and
;    to label %return0 unwind label %landing-pad1
;return0:
  ret void

landing-pad1:                                     ; No predecessors!
  %6 = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
          cleanup
  resume { i8*, i32 } %6
}

It fails with: libc++abi.dylib: terminating with uncaught exception of type core::CatchThrow

If instead I convert the “call” into an “invoke” and hook in a dummy landing-pad like this:


invoke void @cc_invoke({ {}*, i64 }* %result-ptr, {}* %3, i64 2, {}* %4, {}* %5, {}* null, {}* null, {}* null)
    to label %return0 unwind label %landing-pad1

return0:
  ret void

landing-pad1:                                     ; No predecessors!
  %6 = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
          cleanup
  resume { i8*, i32 } %6
}

The CatchThrow exception is caught and everything works fine! I don’t need to use “invoke” to call the caller or any outer function, except in the function that has the landing pad that recognizes core::CatchThrow.

I don’t see anything in the Itanium ABI that says I need to call the function that throws an exception with “invoke” to get exception handling to work!

Does anyone have any ideas what I might be doing wrong?

Hi Christian,

I don’t see anything in the Itanium ABI that says I need to call the function that throws an exception with “invoke” to get exception handling to work!

AFAICT, it is the design of LLVM IR and its implementation. To catch exceptions thrown by callee functions, we should use the invoke instruction along with the landingpad instruction. If you call a function with a call instruction, the LLVM backend will simply assume that the frame is unwound when the callee throws an exception.

FYI, according to the LLVM reference manual, an invoke instruction has the following semantics:

The ‘invoke‘ instruction causes control to transfer to a specified function, with the possibility of control flow transfer to either the ‘normal‘ label or the ‘exception‘ label. If the callee function returns with the “ret” instruction, control flow will return to the “normal” label. If the callee (or any indirect callees) returns via the “resume” instruction or other exception handling mechanism, control is interrupted and continued at the dynamically nearest “exception” label.

OTOH, here’s the description for call instruction:

The ‘call‘ instruction represents a simple function call.
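For comparison, here is a minimal C++ sketch of a frame that has no handler of its own but still has work to do during unwinding. Clang typically lowers the call inside such a frame to an `invoke` whose unwind label is a landing pad with a `cleanup` clause ending in `resume`, which has the same shape as a do-nothing landing pad. All names (`Guard`, `thrower`, `passThrough`, `catches`, `cleanupsRun`) are hypothetical and only for illustration.

```cpp
#include <stdexcept>

// Counts how many destructors ran while the exception propagated.
static int cleanupsRun = 0;

struct Guard {
  // Must run even when an exception passes through the enclosing frame.
  ~Guard() { ++cleanupsRun; }
};

void thrower() { throw std::runtime_error("non-local exit"); }

// No try/catch here, but ~Guard() has to run during unwinding, so the call
// to thrower() needs an unwind destination (a cleanup landing pad).
void passThrough() {
  Guard g;
  thrower();
}

// The frame that actually handles the exception.
bool catches() {
  try {
    passThrough();
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```

Compiling this with `clang++ -S -emit-llvm` and reading the IR for `passThrough` shows which calls become `invoke` and what the cleanup landing pad looks like.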

Hope this is helpful.

Best regards,

Logan

Logan,

There may be some confusion here because I wasn’t completely clear about what was calling what, where “invoke” appeared to be necessary when I JIT the code, and where I realize that “invoke” is absolutely necessary. As far as I understand the Itanium ABI, I only need an “invoke” in function A when function B is invoked - the rest of the calls to C() and D() should be able to use “call”.

Here’s an example in C++ (for illustration) and then the identical example in Common Lisp.

In Common Lisp I can compile individual expressions in the REPL.
Here is a more illustrative example. If I compile all of these functions ahead of time - everything works fine and D() and C() are called using LLVM “call”.
But if I JIT these functions one at a time it will fail. If I use llvm “invoke” to JIT the function C() - then the thrown exception will make it to A() and everything works.

// In C++
//
void D() {
  throw MyException();
}

void C() {
  D(); // This should be able to be called with “call”
       // - but if I JIT this function it won’t work - I need “invoke” and a dummy landing pad
}

void B() {
  C(); // This can be called with “call”
}

void A() {
  try {
    B(); // <<-- This needs to be called with “invoke”
  } catch (MyException& e) {
    /* Do something */
  }
}
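A compilable variant of the illustration above can serve as a baseline when built ahead of time. This is only a sketch: `MyException` is defined as an empty struct, `A()` is changed to return a flag, and the `trace` string is an added, hypothetical aid for checking which frames were entered.

```cpp
#include <string>

struct MyException {};

// Records the order in which frames are entered, so the path can be checked.
static std::string trace;

void D() { trace += 'D'; throw MyException(); }
void C() { trace += 'C'; D(); }  // plain call: no handler, no cleanup
void B() { trace += 'B'; C(); }  // plain call: no handler, no cleanup

// Returns true if the exception thrown in D() reached the handler in A().
bool A() {
  trace += 'A';
  try {
    B();  // the only call site that needs a landing pad
  } catch (MyException &) {
    return true;
  }
  return false;
}
```

When this is compiled ahead of time, `A()` returns true and `trace` ends up as "ABCD", which is the behavior the JITted version is expected to match.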

In Common Lisp:

(defun d ()
  (throw 'Exception nil))

(defun c ()
  (d)) ; ← This can be called with llvm “call” but if I JIT this function it will fail! I need to use “invoke”!

(defun b ()
  (c)) ; ← This can be called with llvm “call”

(defun a ()
  (catch 'exception
    (b))) ; ← This needs to be called with llvm “invoke”

Hi Christian,

Thanks for your explanation. I understand your situation now. I would suggest checking the optimization passes used by the JIT compiler, especially IPO/PruneEH.cpp. It tries to add the nounwind attribute to functions, which would result in the problem you mentioned earlier.

Alternatively, as a workaround, try to add uwtable (function attribute) to the functions that are generated by your compiler. This may help as well.

Sincerely,

Logan

Logan,

Thank you very much for the feedback and guidance.

The passes that I use for JIT compilation are as follows - I don’t see prune-eh in there - could it still be happening?

(defun create-function-pass-manager-for-compile (module)
  (let ((fpm (llvm-sys:make-function-pass-manager module)))
    (llvm-sys:function-pass-manager-add fpm (llvm-sys:create-basic-alias-analysis-pass)) ;; <<< passes start here
    (llvm-sys:function-pass-manager-add fpm (llvm-sys:create-instruction-combining-pass))
    (llvm-sys:function-pass-manager-add fpm (llvm-sys:create-promote-memory-to-register-pass))
    (llvm-sys:function-pass-manager-add fpm (llvm-sys:create-reassociate-pass))
    (llvm-sys:function-pass-manager-add fpm (llvm-sys:create-gvnpass nil))
    (llvm-sys:function-pass-manager-add fpm (llvm-sys:create-cfgsimplification-pass -1))
    (llvm-sys:do-initialization fpm)
    fpm))

I set things up to add the “uwtable” function attribute to every function that I generate - a current module looks like this: https://gist.github.com/drmeister/107a84d3d5023ebf13a8
On line 617 is the function (now it has the “uwtable” attribute) that is not propagating an exception thrown in cc_call on line 642.

A couple of questions: the prototype for cc_call on line 559 has the function attribute “nounwind” - I should probably remove that - correct?

When I load the bitcode file for this module and then dump it just before it is JITted, the “uwtable” attribute has disappeared - is this prune-eh doing its work even though it’s not listed above in the list of function pass managers?

I feel like there may be multiple things going on, thwarting my attempts to get these functions to propagate an exception.

Thanks,

.Chris.

Logan,

I need to make a correction of the post that preceded this one!

I was wrong when I said this:

When I load the bitcode file for this module and then dump it just before it is JITted, the “uwtable” attribute has disappeared - is this prune-eh doing its work even though it’s not listed above in the list of function pass managers?

The “uwtable” attribute does not disappear! I had forgotten to llvm-as the .ll file that I was hand-editing. (I’ve hacked my compiler so that it JITs a module from a bitcode file that I hand edit while I’m trying to sort out whether these attributes can be properly set to avoid the problem that I’m seeing).

I’m trying different combinations of function attributes to see if that fixes the problem.

Best,

.Chris.

I’ve tried every reasonable combination of disabling nounwind and enabling uwtable - it doesn’t fix the problem.

Here is a copy of the most recent Module with the uwtable attribute attached to cc_call and the function that doesn’t propagate the exception “cl->UNNAMED-LAMBDA”

https://gist.github.com/drmeister/b97dec956c6ee9ffeb75

This is the only thing that I’ve found that works in terms of getting the exception to propagate out of the JITted function - change the “call” to an “invoke” and hook in the do-nothing landing-pad.

https://gist.github.com/drmeister/7a35046f666826206973

Compare line 646 of the file above to the one that I just posted.

https://gist.github.com/drmeister/b97dec956c6ee9ffeb75

The first one propagates the exception and the second one fails.

Hi Christian,

I am not very familiar with the Mach-O file format, but could you dump the object file generated by the JIT compiler pipeline and check its unwind information sections (usually .eh_frame)? Also, compare it with the output generated by the working AOT compiler pipeline. It seems possible that the unwind information is not properly generated or handled.

Logan

Logan,

How would I dump the object file generated by the JIT compiler pipeline?
Could you point me to an example of how something like that is done? I’m used to working with the JIT machinery in memory but not writing object files out to disk.
I have code to generate object files for AOT compilation - is it done the same way?

Best,

.Chris.

Hi Christian,

I usually dump the object files by hacking lib/ExecutionEngine/MCJIT/MCJIT.cpp at line 156:

std::unique_ptr<MemoryBuffer> MCJIT::emitObject(Module *M) {

  // … skipped …

  // The RuntimeDyld will take ownership of this shortly
  SmallVector<char, 4096> ObjBufferSV;
  raw_svector_ostream ObjStream(ObjBufferSV);

  // Turn the machine code intermediate representation into bytes in memory
  // that may be executed.
  if (TM->addPassesToEmitMC(PM, Ctx, ObjStream, !getVerifyModules()))
    report_fatal_error("Target does not support MC emission!");

  // Initialize passes.
  PM.run(*M);
  // Flush the output buffer to get the generated code into memory
  ObjStream.flush();

// ADD YOUR HACK HERE: The ObjBufferSV now contains the object file generated by the JIT compilation pipeline.

Sincerely,

Logan

Logan,

Thank you - this is the first time I’ve had a way of probing JITed code.

My new MCJIT::emitObject function is below. I set the environment variable to things like: JIT_DUMP=/tmp/repl%d.obj

I’ll use this to compare the unwind info in the object files generated by this code in the cases where it works and where it breaks.

I’ll get back to you with what I find.

std::unique_ptr<MemoryBuffer> MCJIT::emitObject(Module *M) {
  MutexGuard locked(lock);

  // This must be a module which has already been added but not loaded to this
  // MCJIT instance, since these conditions are tested by our caller,
  // generateCodeForModule.

  PassManager PM;

  M->setDataLayout(TM->getSubtargetImpl()->getDataLayout());
  PM.add(new DataLayoutPass());

  // The RuntimeDyld will take ownership of this shortly
  SmallVector<char, 4096> ObjBufferSV;
  raw_svector_ostream ObjStream(ObjBufferSV);

  // Turn the machine code intermediate representation into bytes in memory
  // that may be executed.
  if (TM->addPassesToEmitMC(PM, Ctx, ObjStream, !getVerifyModules()))
    report_fatal_error("Target does not support MC emission!");

  // Initialize passes.
  PM.run(*M);
  // Flush the output buffer to get the generated code into memory
  ObjStream.flush();

  // Christian Schafmeister added hack suggested by Logan Chien to dump ObjBufferSV to a file
  // (needs <cstdio> and <cstdlib>)
#if 1
  static int JITFileNameIndex = 0;
  // getenv returns NULL when JIT_DUMP is unset, so test the pointer before using it.
  const char *fileNamePrototype = getenv("JIT_DUMP");
  if (fileNamePrototype && *fileNamePrototype) {
    char fileName[1024];
    // The pattern is expected to contain a single %d, e.g. /tmp/repl%d.obj
    snprintf(fileName, sizeof(fileName), fileNamePrototype, JITFileNameIndex);
    ++JITFileNameIndex;
    FILE *fout = fopen(fileName, "wb");
    if (fout) {
      fwrite(ObjBufferSV.data(), 1, ObjBufferSV.size(), fout);
      fclose(fout);
    }
  }
#endif

  std::unique_ptr<MemoryBuffer> CompiledObjBuffer(
      new ObjectMemoryBuffer(std::move(ObjBufferSV)));

  // If we have an object cache, tell it about the new object.
  // Note that we’re using the compiled image, not the loaded image (as below).
  if (ObjCache) {
    // MemoryBuffer is a thin wrapper around the actual memory, so it’s OK
    // to create a temporary object here and delete it after the call.
    MemoryBufferRef MB = CompiledObjBuffer->getMemBufferRef();
    ObjCache->notifyObjectCompiled(M, MB);
  }

  return CompiledObjBuffer;
}
