I am working on adding support for C++ exception handling when compiling for a native Windows target (that is, a target with “MSVC” specified as the environment). Because of differences between how the native Windows runtime handles exceptions and the Itanium-based model used by current LLVM exception handling code, I believe this will require some extensions to the LLVM IR, though I’m trying to leverage the existing mechanisms as much as possible.
I’ll discuss this below in more detail, but the summary is that I’m going to propose an extension to the syntax of the landing pad instruction to enable landing pad clauses to be outlined as external functions, and I’d like to introduce two new intrinsics, llvm.eh.begin.catch and llvm.eh.end.catch, to replace calls to the libc++abi __cxa_begin_catch and __cxa_end_catch functions.
Currently, LLVM supports 64-bit Windows exception handling for MinGW targets using a custom personality function and the libc++abi library. There are also LLVM clients, such as ldc, that provide Windows exception handling similar to what I am proposing by providing their own custom personality function. However, what I would like is to support Windows C++ exception handling using the __CxxFrameHandler3 function provided by the native Windows runtime library.
Some of the primary challenges in supporting native Windows C++ exception handling are:
Catch and unwind handlers are called in a different frame context than the original function in which they are defined.
Windows exception handling is state-driven rather than landing-pad-based. The compiler must generate a table for each function mapping IP addresses within that function to the EH state at that address. When an exception is thrown, the runtime uses this table to determine which unwind and catch handlers should be invoked.
Windows catch and unwind handling is implemented using a series of calls to discrete handlers rather than a jump to a landing pad which uses runtime decisions to reach all relevant handler blocks, as is done in LLVM’s existing implementations. LLVM’s current landing pad structure frequently results in catch handling blocks and cleanup blocks which are shared by multiple landing pads. Windows expects each catch handler and unwind handler to be defined in a single location. The runtime then determines which handlers should be called based on the EH state when an exception is thrown and makes a series of calls when multiple handlers are needed.
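To make the state-driven model above a little more concrete, here is a minimal C++ sketch of the two lookups it implies: mapping an IP to an EH state, and walking outward from that state to list the handlers to call. The struct names, field names, and the use of -1 for "no active state" are my own assumptions for illustration; this is not the actual table layout consumed by __CxxFrameHandler3.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical table entry: `state` is in effect from `start_ip` up to the
// next entry's `start_ip`. A state of -1 means no handlers are active.
struct IpToStateEntry {
    std::uint32_t start_ip;  // offset from the start of the function (assumed)
    int state;
};

// Hypothetical unwind map entry: the state that is entered once the handler
// for this state has run (-1 when nothing further needs to be unwound).
struct UnwindEntry {
    int to_state;
};

// Returns the EH state covering `ip`, or -1 if `ip` precedes the first entry.
int lookupEhState(const std::vector<IpToStateEntry>& table, std::uint32_t ip) {
    assert(std::is_sorted(table.begin(), table.end(),
                          [](const IpToStateEntry& a, const IpToStateEntry& b) {
                              return a.start_ip < b.start_ip;
                          }));
    // Find the first entry that starts strictly after `ip`; the entry just
    // before it (if any) is the one that covers `ip`.
    auto it = std::upper_bound(table.begin(), table.end(), ip,
                               [](std::uint32_t value, const IpToStateEntry& e) {
                                   return value < e.start_ip;
                               });
    if (it == table.begin()) return -1;
    return std::prev(it)->state;
}

// Lists, innermost first, the states whose handlers would be called in turn
// when unwinding from `state`. This models "a series of calls" rather than a
// single jump to a landing pad.
std::vector<int> statesToUnwind(const std::vector<UnwindEntry>& map, int state) {
    std::vector<int> order;
    while (state != -1) {
        assert(static_cast<std::size_t>(state) < map.size());
        assert(map[state].to_state < state);  // guarantees termination
        order.push_back(state);
        state = map[state].to_state;
    }
    return order;
}
```

For example, with a table whose entries start at offsets 0, 16, and 40 with states -1, 0, and 1, an exception raised at offset 50 is in state 1, and an unwind map of {-1, 0} yields the handler order 1, then 0.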
The first challenge is relatively easy to address. The Microsoft C++ compiler creates a pseudo-function for handlers which it embeds in the body of the parent function, but for LLVM I would like to try simply outlining the handler bodies into fully external functions. The task of outlining the handler code is somewhat straightforward and can be done with the existing IR. However, I need a way to link the landing pads from the parent function to the outlined handlers. I propose doing this by extending the syntax of the landing pad instruction to allow the address of an outlined handler to be attached to catch and cleanup clauses.
The current syntax for landingpad is:
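<resultval> = landingpad <resultty> personality <type> <pers_fn> <clause>+
<resultval> = landingpad <resultty> personality <type> <pers_fn> cleanup <clause>*
I’d like to change that to:
<resultval> = landingpad <resultty> personality <type> <pers_fn> <clause>+
<resultval> = landingpad <resultty> personality <type> <pers_fn> cleanup [at handler] <clause>*
<clause> := catch <type> <value> [at handler]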
Outlined handlers will reference frame variables from the parent function using the llvm.frameallocate and llvm.framerecover intrinsics. Any frame variable which must be referenced from a catch or cleanup handler will be moved into a block allocated by llvm.frameallocate. When the handlers are called, the parent function’s frame pointer is passed as the second argument to the call. The handlers will use this frame pointer to find the frame allocation block from the parent function. The frame allocation block will also contain space for an exception state variable and an exception object pointer. These values are maintained by the runtime library.
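The frame-sharing arrangement described above can be modeled in plain C++ as follows. This is only a sketch under stated assumptions: the struct layout, the function names, and the unused first handler argument are hypothetical, and recoverFrameBlock merely stands in for what llvm.framerecover would do (here the "frame pointer" is simply the address of the block).

```cpp
#include <cassert>

// Hypothetical layout of the block the parent would obtain through
// llvm.frameallocate. The first two fields correspond to the exception state
// variable and exception object pointer that the runtime would maintain.
struct ParentFrameBlock {
    int eh_state;
    void* exception_object;
    int counter;  // an example local that the handler needs to reference
};

// Stand-in for llvm.framerecover. In this model the parent's frame pointer
// is just the address of the block, so recovery is a cast.
ParentFrameBlock* recoverFrameBlock(void* parent_frame_pointer) {
    assert(parent_frame_pointer != nullptr);
    return static_cast<ParentFrameBlock*>(parent_frame_pointer);
}

// Outlined catch handler. The parent's frame pointer arrives as the second
// argument; the meaning of the first argument is left unspecified here.
void outlinedCatchHandler(void* /*unspecified*/, void* parent_frame_pointer) {
    ParentFrameBlock* frame = recoverFrameBlock(parent_frame_pointer);
    frame->counter += 1;  // touches the parent's local through the block
}

// Parent function. In a real implementation the runtime would invoke the
// handler; it is called directly here to keep the model self-contained.
int parentFunction() {
    ParentFrameBlock frame{-1, nullptr, 41};
    outlinedCatchHandler(nullptr, &frame);
    return frame.counter;
}
```

The point of the model is only that the handler never touches the parent's locals directly: everything it needs is reached through the one block located from the frame pointer.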
Current LLVM landing pads use calls to __cxa_begin_catch to get a pointer to the object associated with the exception. This function is provided by the libc++abi library and is specific to the personality function being used. I would like to introduce a new intrinsic (llvm.eh.begin.catch) which accomplishes the same result in a personality-function-independent way. For consistency, I also propose introducing llvm.eh.end.catch to replace calls to __cxa_end_catch.
I am attaching several examples showing the outlining transformation I am proposing. Note that for simplicity I’ve used Linux type information in these examples, but the final implementation will need to use Microsoft-style RTTI. I believe clang already has support for that.
The ‘simple.ll’ example shows a function with a single catch-all handler. The ‘catch-type.ll’ example shows a function which catches a specific type of exception. The ‘min-unwind.ll’ example shows a function which has no exception handlers but which requires an unwind handler. The ‘nested.ll’ example shows a function which has nested try blocks.
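For readers without the attachments at hand, the four cases described above correspond roughly to C++ source shapes like the following. These are hypothetical functions written for illustration, not the sources the attached .ll files were generated from; all names are my own.

```cpp
#include <cassert>
#include <stdexcept>

// Helper that optionally throws; used by the first three shapes.
void mayThrow(bool doThrow) {
    if (doThrow) throw std::runtime_error("boom");
}

// Shape of 'simple.ll': a single catch-all handler.
int simpleCatchAll(bool doThrow) {
    try {
        mayThrow(doThrow);
    } catch (...) {
        return 1;
    }
    return 0;
}

// Shape of 'catch-type.ll': a handler for one specific exception type.
int catchSpecificType(bool doThrow) {
    try {
        mayThrow(doThrow);
    } catch (const std::runtime_error&) {
        return 1;
    }
    return 0;
}

// Shape of 'min-unwind.ll': no handlers, but a local with a destructor that
// must run if an exception propagates through the function.
struct Guard {
    int& destroyed;
    explicit Guard(int& counter) : destroyed(counter) {}
    ~Guard() { ++destroyed; }
};

void unwindOnly(bool doThrow, int& destroyed) {
    Guard guard(destroyed);
    mayThrow(doThrow);
}

// Shape of 'nested.ll': an inner try with a typed handler inside an outer
// try with a catch-all handler.
int nestedTry(int which) {
    assert(which >= 0 && which <= 2);
    try {
        try {
            if (which == 1) throw 1;
            if (which == 2) throw 2.0f;
        } catch (int) {
            return 1;  // handled by the inner handler
        }
    } catch (...) {
        return 2;  // the float escapes the inner try and lands here
    }
    return 0;
}
```

In the nested shape, the float thrown when which is 2 is not matched by the inner handler, so it is caught by the outer catch-all.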
The nested example illustrates the challenge mentioned above with regard to intermingled handlers. I think I know how I will accomplish the outlining shown in that example and generate the state tables needed by the __CxxFrameHandler3 personality function, but I’m going to skip discussion of the details for now.
However, I do want to at least open discussion of the problem of EH state handling. The native Windows C++ exception handling essentially needs an EH state assigned to each basic block. I have an idea for how I might be able to infer the EH states based on the targets of invoke instructions. I think I can make this work in a way that will produce correct results for synchronous C++ exception handling. However, I don’t think I can get it to map exactly to the actual C++ scopes in the original source code. For this reason, assuming we would like to support asynchronous C++ exception handling at some future time, I think it may be preferable to have the EH states embedded by the front end, possibly as metadata. I haven’t thought through all of the possible problems here, and I am open to suggestions.
catch-type.ll (3.14 KB)
min-unwind.ll (3.79 KB)
nested.ll (12.3 KB)
simple.ll (2.06 KB)