Generating a backtrace

After working with LLVM for several years now, I still have one unsolved problem: how to generate a stack backtrace for exceptions.

My basic approach is simple: I have two different exception personality functions, a “lightweight” one that does just the bare minimum needed to handle the exception, and a “capturing” one that (ideally) records the call frame information as it unwinds the stack. My motivation for doing it this way is that it would be too expensive to capture call frame information on every exception, so instead my compiler only uses the heavier personality function when the exception backtrace information is actually going to be used.

Within the personality function, there’s a call to _Unwind_Backtrace(), which walks through the list of call frames and calls a callback for each one. Within the callback, I can get the value of the return address for each call frame using _Unwind_GetIP(). So far so good.
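For reference, a minimal sketch of that walk might look like the following. It assumes the GCC/Clang <unwind.h> interface; the names frame_buffer, record_frame and capture_backtrace are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <unwind.h>

/* Hypothetical buffer for this sketch: holds up to `capacity` addresses. */
struct frame_buffer {
    uintptr_t *addrs;
    size_t count;
    size_t capacity;
};

/* Callback invoked by _Unwind_Backtrace() once per call frame. */
static _Unwind_Reason_Code record_frame(struct _Unwind_Context *ctx, void *arg)
{
    struct frame_buffer *buf = arg;

    /* Returning anything other than _URC_NO_REASON stops the walk. */
    if (buf->count >= buf->capacity)
        return _URC_END_OF_STACK;

    /* For a caller's frame this is the return address, i.e. the
       instruction that follows the call. */
    uintptr_t ip = (uintptr_t)_Unwind_GetIP(ctx);
    if (ip != 0)
        buf->addrs[buf->count++] = ip;

    return _URC_NO_REASON;
}

/* Fills `addrs` with the addresses of the current call stack,
   innermost frame first, and returns how many were stored. */
size_t capture_backtrace(uintptr_t *addrs, size_t capacity)
{
    struct frame_buffer buf = { addrs, 0, capacity };
    _Unwind_Backtrace(record_frame, &buf);
    return buf.count;
}
```

A caller would pass a fixed-size array, for example capture_backtrace(addrs, 32), and then try to symbolize each entry. Whether the walk gets past a given frame depends on unwind tables being present for that frame.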

The problem is converting those addresses into meaningful symbols. For some reason that I don’t understand, dladdr() doesn’t seem to work on LLVM-generated functions, even though I know those functions have full DWARF debugging information. If I insert a printf into my backtrace code and print out each return address, I see something like this:

Function _Unwind_RaiseException
Function __libc_start_main

The hex values are ones where dladdr() failed to provide a function name. As you can see, the only functions it was able to deal with are the libc startup function and _Unwind_RaiseException itself. Yet I know these functions have symbolic names, since I can step through them in gdb, set breakpoints, and so on.
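The lookup being described can be sketched roughly as below. The helper name describe_address is hypothetical, and the output format only imitates the listing above. One thing worth ruling out, offered as a guess rather than a diagnosis: on glibc, dladdr() resolves names only from an object's dynamic symbol table, not from DWARF or the full static symbol table, so functions in an executable that was not linked with -rdynamic (or that have internal linkage) come back without a name even though gdb can see them.

```c
#define _GNU_SOURCE            /* glibc exposes dladdr() and Dl_info under this macro */
#include <dlfcn.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes a description of `addr` into `out`.
   Returns 1 if dladdr() supplied a symbol name, 0 if only the raw
   hex address could be written. */
int describe_address(uintptr_t addr, char *out, size_t out_size)
{
    Dl_info info;

    /* dladdr() returns 0 when no loaded object contains the address;
       dli_sname is NULL when the object is known but no symbol matches. */
    if (dladdr((void *)addr, &info) != 0 && info.dli_sname != NULL) {
        snprintf(out, out_size, "Function %s", info.dli_sname);
        return 1;
    }

    snprintf(out, out_size, "0x%" PRIxPTR, addr);
    return 0;
}
```

On older glibc versions this needs -ldl at link time. If the missing names do turn out to be a symbol-table issue, relinking with -rdynamic is a cheap experiment before resorting to parsing DWARF by hand.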

I’ve tried a number of other approaches. One was calling dlopen(NULL) and then using dlsym() to try to locate __data_start, so that I could then attempt to manually parse the DWARF debug frames and translate the return addresses into function names. Unfortunately, I can’t seem to locate __data_start at all. I’ve also tried calling the libc backtrace() function, but it produced similarly useless results.
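For comparison, the backtrace() route usually looks something like this sketch, which assumes glibc's <execinfo.h>; print_backtrace is a hypothetical name. The glibc manual page for backtrace_symbols() notes that symbol names may be unavailable unless the program is linked with -rdynamic, and that static functions are never exposed, which would be consistent with the "similarly useless results".

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints up to 64 frames of the current call
   stack to `out` and returns the number of frames captured. */
int print_backtrace(FILE *out)
{
    void *frames[64];
    int count = backtrace(frames, 64);

    /* backtrace_symbols() returns one malloc'd block holding all the
       strings, or NULL on failure. */
    char **symbols = backtrace_symbols(frames, count);

    for (int i = 0; i < count; i++) {
        if (symbols != NULL)
            fprintf(out, "%s\n", symbols[i]);
        else
            fprintf(out, "%p\n", frames[i]);   /* fall back to raw addresses */
    }

    free(symbols);   /* only the outer pointer is freed; free(NULL) is a no-op */
    return count;
}
```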

The really icky part about all of this is that even if I do come up with a solution for these problems, I will then have to re-solve the same problems for each different platform that LLVM supports. I kinda wish that there was some LLVM intrinsic or library function that would hide all these details from me :)

Well, since there was no response, I guess I’ll just have to reply to myself :)

The whole “walking the stack” thing reminds me of something else I would like to see in LLVM. In the document on garbage collection, it says:

This so-called “shadow stack” mirrors the machine stack. Maintaining this data structure is slower than using a stack map compiled into the executable as constant data, but has a significant portability advantage because it requires no special support from the target code generator, and does not require tricky platform-specific code to crawl the machine stack.

The problem is that writing that “tricky platform-specific code” is required if you want to have multiple threads: the shadow stack modifies a global on every call, and that won’t work in a threaded environment. Unfortunately, writing that code is also completely beyond the abilities of the typical LLVM user. Well, it’s certainly beyond me, anyway.
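To make the threading problem concrete, here is a rough C model of a shadow stack. It is an illustration only, not the code LLVM actually emits, and every name in it (shadow_frame, shadow_chain, shadow_push, shadow_pop, shadow_count_roots) is made up for this sketch.

```c
#include <stddef.h>

/* One entry per active call frame that holds GC roots. */
struct shadow_frame {
    struct shadow_frame *next;   /* the caller's entry */
    size_t num_roots;
    void **roots;                /* this frame's root slots */
};

/* A single global head. Every thread would push and pop on this same
   list, so two threads' frames interleave and corrupt each other. */
static struct shadow_frame *shadow_chain = NULL;

/* Function prologue: link this frame's entry onto the chain. */
void shadow_push(struct shadow_frame *frame, void **roots, size_t num_roots)
{
    frame->roots = roots;
    frame->num_roots = num_roots;
    frame->next = shadow_chain;
    shadow_chain = frame;
}

/* Function epilogue: unlink the innermost entry. */
void shadow_pop(void)
{
    shadow_chain = shadow_chain->next;
}

/* What a collector does: walk the chain instead of the machine stack.
   Here it only counts the root slots it would visit. */
size_t shadow_count_roots(void)
{
    size_t total = 0;
    for (struct shadow_frame *f = shadow_chain; f != NULL; f = f->next)
        total += f->num_roots;
    return total;
}
```

The push and pop on every call are the cost being complained about, and the single shared head is what breaks with more than one thread.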

Similarly, the assembly language reference manual has this to say about the llvm.returnaddress and llvm.frameaddress intrinsics:

The value returned by this intrinsic is likely to be incorrect or 0 for arguments other than zero, so it should only be used for debugging purposes.
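From C, GCC and Clang expose the same facility as __builtin_return_address and __builtin_frame_address, which Clang lowers to these intrinsics. A small sketch of the one case that is dependable, level 0, follows; the function names caller_address and current_frame are hypothetical.

```c
#include <stdint.h>

/* Returns the address this function will return to, i.e. a location
   inside its caller. noinline keeps it a real call with its own frame. */
__attribute__((noinline))
uintptr_t caller_address(void)
{
    return (uintptr_t)__builtin_return_address(0);
}

/* Returns the frame address of this function itself. */
__attribute__((noinline))
uintptr_t current_frame(void)
{
    return (uintptr_t)__builtin_frame_address(0);
}
```

Passing 1, 2 and so on instead of 0 is where the caveat quoted above applies: the result depends on frame pointers being present and laid out as the builtin expects, which is exactly what cannot be relied on.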

Seems to me that these two problems are really the same - the general inability to introspect the stack in a reliable way. I realize that this is a hard problem in the general case, but I’d be willing to turn on some compiler option that generated (very slightly) less efficient code, if it would give me the ability to crawl the stack deterministically, and allow a stack map entry to be identified for every call frame without the cost of having to modify a global linked list on every call.

This has always seemed to me like one area where the LLVM machine abstraction is sadly incomplete. Most of the time I can blissfully generate IR without having to think about all the gritty platform details. But when it comes to looking at the stack, all of a sudden I’m forced to deal with all of the chaos of the underlying architectures that, up to that point, LLVM had so gracefully covered up for me.