Not stopping on EXC_BAD_ACCESS

Hello all.

I'm working on a memory manager (and garbage collector) that relies on handling hardware protection faults. That means that the programs using the GC will routinely get EXC_BAD_ACCESS faults and I want to handle them programmatically as SIGSEGV signals.
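For readers unfamiliar with the technique, here is a minimal sketch of a protection-fault barrier using plain POSIX calls. It is not the MPS code: the names `barrier_page`, `on_fault`, and `run_barrier_demo` are invented for illustration, and it assumes the platform delivers the fault as SIGSEGV or SIGBUS and retries the faulting instruction when the handler returns.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and sigaction on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* All names below are hypothetical; this is an illustration, not MPS. */
static char *barrier_page;                 /* page kept behind the barrier */
static size_t barrier_size;
static volatile sig_atomic_t barrier_hits; /* faults handled so far */

static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)barrier_page;
    (void)sig;
    (void)ctx;
    if (addr < base || addr >= base + barrier_size)
        _exit(1); /* not our page: treat it as a genuine crash */
    barrier_hits++;
    /* mprotect is not formally async-signal-safe, but collectors commonly
       rely on it here. Once the page is accessible, returning from the
       handler retries the faulting instruction. */
    mprotect(barrier_page, barrier_size, PROT_READ | PROT_WRITE);
}

/* Returns the number of faults handled (1 expected), or -1 on setup error. */
int run_barrier_demo(void)
{
    struct sigaction sa;
    volatile char *p;
    int result;

    barrier_size = (size_t)sysconf(_SC_PAGESIZE);
    barrier_page = mmap(NULL, barrier_size, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (barrier_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Install for both signals, since platforms differ in which one a
       protection fault raises. */
    if (sigaction(SIGSEGV, &sa, NULL) != 0 || sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;

    p = barrier_page;
    p[0] = 42; /* faults once; the handler unprotects; the store is retried */
    result = (p[0] == 42) ? (int)barrier_hits : -1;
    munmap(barrier_page, barrier_size);
    return result;
}
```

Run outside a debugger, `run_barrier_demo()` should handle exactly one fault. Under LLDB on OS X, the same store is where the debugger stops with EXC_BAD_ACCESS, which is the behaviour described in this thread.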

The problem is that LLDB always stops on EXC_BAD_ACCESS and there doesn't seem to be a way to continue on to the sending of a SIGSEGV and then into my exception handler. Since I expect to have thousands of protection faults, this pretty much makes my system unworkable in LLDB.

In GDB there is a setting "set dont-handle-bad-access 1" for this. I haven't been able to find an equivalent in LLDB. Also, the bug report I filed with Apple about it is still open.

Apple seem to be withdrawing GDB in Xcode 5, so I've got a problem. Up to now we just recommended using GDB.

I've taken a quick look over the LLDB source code, searched this list's archives, searched the LLVM Bugzilla, and asked around on #llvm (and of course tried Google, Stack Overflow, and the Apple Dev Forums), but to no avail.

I'm willing to have a go at patching LLDB if someone would give me some pointers. This is a serious problem for us.

Any pointers?

References:

  - "Developers can't debug MPS on OS X" <http://www.ravenbrook.com/project/mps/issue/job001669/>

  - "Passing EXC_BAD_ACCESS in lldb", Apple Developer Forums.

  - "Debugging with the Memory Pool System" <https://www.ravenbrook.com/project/mps/master/manual/html/guide/debug.html>

  - Apple Bug Reports 12176156 and 7838916.

We are currently working around this problem by registering our own exception port for EXC_BAD_ACCESS using the Mach interface, overriding the debugger's port. The work can be seen here <https://github.com/Ravenbrook/mps-temporary/blob/dev/2013-06-18/macosx-threads/code/protxc.c>.

This makes our system debuggable with LLDB and removes the immediate problem, but I still think there's a flaw here. Mach exceptions don't seem to have much of an interface in LLDB, when they probably ought to have an equivalent interface to signals, allowing "nostop", "pass", etc. options.

I'll take a look into implementing this if I'm able.

Not many people actually take over their own Mach exception ports, so this isn't something that hits a lot of people. It mainly hits the GC folks and the Runtime (Java) folks that need to intercept NULL derefs.

We do know about the issue and have plans to address this in the future.

The solution that works right now is to take over EXC_BAD_ACCESS on the _thread_ Mach port. GDB and LLDB take over the task exception ports, but we leave the thread exception ports alone. The thread exception ports get the exception first; if it is not handled there, it is passed along to the task exception ports. Depending on your architecture, this might be easy to do, or it might be difficult.
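As a rough illustration of the registration step only (not taken from this thread or from protxc.c), the sketch below claims EXC_BAD_ACCESS on the calling thread's exception port. The helper name `claim_thread_bad_access_port` is hypothetical, and the choice of `EXCEPTION_DEFAULT` with `THREAD_STATE_NONE` is an assumption; a real handler may want a state-carrying behaviour instead. On non-Mach platforms the helper simply reports failure.

```c
#include <assert.h>
#ifdef __APPLE__
#include <mach/mach.h>
#endif

/* Hypothetical helper: route EXC_BAD_ACCESS on the calling thread to a
   freshly allocated port. Returns 0 on success and stores the port name
   in *out_port; returns -1 on failure or where Mach is unavailable.
   Caution: once this succeeds, a fault on this thread blocks until some
   other thread receives the exception message on the port and replies,
   so a server thread must be running before any fault can occur. */
int claim_thread_bad_access_port(unsigned int *out_port)
{
#ifdef __APPLE__
    mach_port_t task = mach_task_self();
    mach_port_t port = MACH_PORT_NULL;
    thread_act_t thread;
    kern_return_t kr;

    /* Receive right: our server thread will read exception messages here. */
    kr = mach_port_allocate(task, MACH_PORT_RIGHT_RECEIVE, &port);
    if (kr != KERN_SUCCESS)
        return -1;

    /* Send right: the kernel needs one to deliver exception messages. */
    kr = mach_port_insert_right(task, port, port, MACH_MSG_TYPE_MAKE_SEND);
    if (kr != KERN_SUCCESS)
        return -1;

    /* Thread-level ports are consulted before the task-level ports that
       the debugger takes over. */
    thread = mach_thread_self();
    kr = thread_set_exception_ports(thread, EXC_MASK_BAD_ACCESS, port,
                                    EXCEPTION_DEFAULT, THREAD_STATE_NONE);
    mach_port_deallocate(task, thread); /* drop the extra thread-port ref */
    if (kr != KERN_SUCCESS)
        return -1;

    *out_port = (unsigned int)port;
    return 0;
#else
    (void)out_port;
    return -1; /* no Mach exception ports on this platform */
#endif
}
```

Each thread that can fault needs this call, which is why the approach depends on being able to track the client's threads.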

Greg Clayton

Hi Greg. Thanks for the reply.

Just to be clear, the original problem happens for anyone *not* taking over exception ports, who just wants to handle SIGSEGV or SIGBUS and continue. This seems to be impossible with LLDB.

We have worked around this problem exactly as you say, by taking over thread exception ports. We also successfully took over the task exception port and forwarded irrelevant exceptions, but we're holding that solution in reserve, in case we have client software whose threads we can't track.

As I say, we may be able to contribute to your planned work. We'd certainly be a good test case for it. We really hammer those exceptions sometimes. So please do get in contact if you want. It's very short work to pull our software, build, and run a test case. I'll be happy to provide an up-to-date test procedure when you need it.