RFC: Exception Handling Rewrite

What? Yet another EH proposal?! This one is different from the others in that
I'm planning to start implementing this shortly. But I want your feedback! I've
already gotten a lot of feedback from Chris, John, Jim, Eric, and many others.
Now it's your turn!

Please read this proposal and send me your comments, suggestions, and concerns.

-bw

//===----------------------------------------------------------------------===//
// LLVM Exception Handling Rewrite
//===----------------------------------------------------------------------===//

7/16/2011 - Initial revision
7/18/2011 - Chris's feedback incorporated
7/19/2011 - John's feedback incorporated
7/22/2011 - Final revision: Chris's feedback incorporated

The current exception handling system works for the most part. We have been able
to get the existing implementation to produce exception handling code for a wide
variety of situations. However, we are reaching the limit of what we are able to
support with it. In particular, it suffers from these deficiencies:

1. It's very hard to perform inlining through an `invoke' instruction. Because
   the information is stored in an intrinsic, it's very difficult to merge one
   function's exception handling with another.
2. The EH intrinsics, which contain the exception handling information for an
   invoke (e.g., 'llvm.eh.exception' and 'llvm.eh.selector'), can move out of
   the landing pad during normal code motion thus making retrieving EH
   information very difficult.
3. We are currently able to inline two functions which have incompatible
   personality functions. This breaks the semantics of the original program.
4. The exception handling ABI is not followed. Instead, we approximate it using
   catchalls and calls to non-standard APIs (e.g., '_Unwind_Resume_or_Rethrow').
5. It's inefficient. Because of the constant rethrowing of exceptions, a normal
   exception takes much longer to execute.

In order to address these issues, we need a much better way to represent
exceptions in LLVM IR.

//===----------------------------------------------------------------------===//
// Proposal
//===----------------------------------------------------------------------===//

We start with the existing system and try to modify it to eliminate its negative
aspects. This has many benefits, not the least of which is that it's easier for
existing front-ends to adopt the new exception handling design.

The heart of the proposal is to directly associate unwinding information for an
invoke with the invoke, and to directly expose the values produced in the
landing pad. We do this by introducing a new 'landingpad' instruction, which is
always required to be the first non-phi instruction in the 'unwind' block (the
landing pad block) of an invoke. Because of this direct association, it is
always possible to find the invoke for a landing pad, and always possible to
find the landing pad for an invoke.

The 'landingpad' instruction is an instruction (not an intrinsic) because it
has a variadic but highly structured argument list, and can return arbitrary
types (specified by the personality function and ABI).

//===--------------------------
// The 'landingpad' Instruction
//

The 'landingpad' instruction replaces the current 'llvm.eh.exception' and
'llvm.eh.selector' intrinsics.

// Syntax:

  %res = landingpad <somety> personality <ty> <pers_fn> <clause>+

where

  <clause> :=
       cleanup
    |  catch <ty_1>, <ty_2>, ..., <ty_n>
    |  filter <ty_1>, <ty_2>, ..., <ty_m>

and the result has the type '<somety>'. The personality functions must be the
same for all landingpad instructions in a given function.

A landingpad instruction must contain at least one cleanup, catch, or filter
clause.

// Restrictions:

There are several new invariants which will be enforced by the verifier:

1. A landing pad block is a basic block which is the unwind destination of an
   invoke instruction.
2. A landing pad block must have a landingpad instruction as its first non-PHI
   instruction.
3. The landingpad instruction must be the first non-PHI instruction in the
   landing pad block.
4. Like indirect branches, splitting the critical edge to a landing pad block
   requires considerable care, and SplitCriticalEdge will refuse to do it.
5. All landingpad instructions in a function must have the same personality
   function.
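
As a rough illustration of invariants 2, 3, and 5, here is a minimal C++ sketch
over a toy IR model. The types 'Op', 'Inst', and 'Block' and the function
'verifyLandingPads' are hypothetical names invented for this sketch; they are
not the LLVM API, and the real verifier may be structured quite differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy IR for illustration only; not the LLVM classes.
enum class Op { Phi, LandingPad, Other };

struct Inst {
  Op op;
  std::string personality;  // only meaningful for Op::LandingPad
};

struct Block {
  bool isUnwindDest;        // true if some invoke unwinds to this block
  std::vector<Inst> insts;
};

// Returns true if every landing pad block starts (after PHIs) with a
// landingpad, no landingpad appears anywhere else, and all landingpads
// name the same personality function.
bool verifyLandingPads(const std::vector<Block> &fn) {
  std::string pers;
  bool sawPers = false;
  for (const Block &bb : fn) {
    // Find the index of the first non-PHI instruction.
    std::size_t first = 0;
    while (first < bb.insts.size() && bb.insts[first].op == Op::Phi)
      ++first;

    for (std::size_t i = 0; i < bb.insts.size(); ++i) {
      if (bb.insts[i].op != Op::LandingPad)
        continue;
      // A landingpad may only be the first non-PHI of a landing pad block.
      if (!bb.isUnwindDest || i != first)
        return false;
      // All landingpads must agree on the personality function.
      if (sawPers && bb.insts[i].personality != pers)
        return false;
      pers = bb.insts[i].personality;
      sawPers = true;
    }

    // A landing pad block must have a landingpad as its first non-PHI.
    if (bb.isUnwindDest &&
        (first == bb.insts.size() || bb.insts[first].op != Op::LandingPad))
      return false;
  }
  return true;
}
```

Invariant 4 concerns a transformation utility rather than a property of the IR,
so this sketch does not model it.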

// Semantics:

The landingpad instruction defines the values which are set by the personality
function upon reentry to the function, and therefore the "result type" of the
landing pad instruction. With these changes, LLVM IR will be able to represent
unusual personality functions that could return things in 6 registers for
example. As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

// Examples:

  ;; A landing pad which can catch an integer or double and which can throw only
  ;; a const char *.
  %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
           catch i8** @_ZTIi, i8** @_ZTId
           filter i8** @_ZTIPKc

  ;; A landing pad that is a cleanup.
  %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
           cleanup

  ;; A landing pad which indicates that the personality function should call the
  ;; terminate function.
  %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
           terminate

//===--------------------------
// The 'resume' Instruction
//

The new "resume" instruction replaces the 'llvm.eh.resume' intrinsic and the old
"unwind" instruction. The "unwind" instruction will be removed.

// Syntax:

  resume <somety> <op>

This is a terminator instruction that has no successors. Its operand must have
the same type as the result of any landingpad instructions in the same function.

// Semantics:

Resumes propagation of an existing (in-flight) exception.

// Example:

  resume { i8*, i32 } %eh.val ;; Resume exception propagation out of
                                ;; current function.

Note that there is no way with this proposal for pure IR to actually start an
exception throw. Instead, use of __cxa_throw or something similar is required.
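
At the source level, the situation 'resume' models looks roughly like the C++
sketch below: a cleanup runs while an exception is in flight, and then
propagation continues to an outer handler. The names 'trace', 'Guard',
'thrower', 'middle', and 'outer' are made up for this illustration. The
comments about lowering assume an Itanium-style C++ ABI.

```cpp
#include <string>
#include <vector>

static std::vector<std::string> trace;  // records the order of events

struct Guard {
  // Under this proposal, the destructor call would sit in a cleanup
  // landing pad that ends in a 'resume'.
  ~Guard() { trace.push_back("cleanup"); }
};

// The throw itself is started by the runtime (e.g. __cxa_throw on
// Itanium-style ABIs), not by any IR instruction.
void thrower() { throw 42; }

void middle() {
  Guard g;
  thrower();  // if this unwinds, ~Guard runs, then propagation resumes
}

int outer() {
  try {
    middle();
  } catch (int v) {
    trace.push_back("caught");
    return v;
  }
  return 0;
}
```

Calling 'outer()' runs the cleanup in 'middle' before the handler in 'outer'
sees the exception.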

//===----------------------------------------------------------------------===//
// Inlining
//===----------------------------------------------------------------------===//

During inlining through an invoke instruction, a 'resume' instruction will be
replaced by a branch to the instruction immediately after the landingpad
instruction associated with the invoke. The landingpad instructions, which
supply the argument to the 'resume', will be updated with new types which they
can catch, if any, from the invoke's landing pad block.

The inliner will refuse to inline two functions which have landingpad
instructions with incompatible personality functions. For example, an Ada
function cannot be inlined into a C++ function, because Ada uses an incompatible
personality function. However, a C++ function may be inlined into an Objective-C
function (and vice-versa), because the Objective-C personality function contains
all of the functionality of the C++ personality function.

This proposal does not include a way for the optimizer to know that a superset
relation exists; that can be added in the future with a named MDNode. For now,
all such inlinings will be refused.
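
The conservative rule above can be sketched in C++ as follows. 'FnInfo' and
'personalitiesAllowInlining' are hypothetical names for this illustration, not
part of LLVM. The sketch also assumes that a function without any landingpad
instruction places no constraint on inlining, which the text implies but does
not state outright.

```cpp
#include <string>

// Hypothetical per-function summary for the inliner; not an LLVM type.
struct FnInfo {
  bool hasLandingPad;
  std::string personality;  // empty when hasLandingPad is false
};

// Until a superset relation can be expressed, only identical personality
// functions count as compatible.
bool personalitiesAllowInlining(const FnInfo &caller, const FnInfo &callee) {
  // Assumption: with no landingpad on one side there is nothing to conflict.
  if (!caller.hasLandingPad || !callee.hasLandingPad)
    return true;
  return caller.personality == callee.personality;
}
```

Under this rule a C++ callee is not inlined into an Objective-C caller, even
though that pairing would be safe in principle.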

//===----------------------------------------------------------------------===//
// Cleanup and Catch Clauses
//===----------------------------------------------------------------------===//

If there is a cleanup that's inlined into a try-catch block, the exception
handling table will still mark it as a cleanup, but will also indicate that it
catches specific types. For example:

struct A {
  A();
  ~A();
};

void qux();
void bar() {
  A a;
  qux();
}

void foo() {
  try {
    bar();
  } catch (int) {
  }
}

If the call to bar is inlined into foo, the landing pads for A's constructor and
destructor are marked as catching an int. The landing pad for qux(), which is
marked as a cleanup in bar(), will remain marked as a cleanup, but also be
marked as catching an int.
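
One way to picture the clause update is the C++ sketch below. 'Clauses' and
'mergeForInlining' are hypothetical names. The sketch leaves out filter
clauses, and both the de-duplication of repeated types and the propagation of
the outer pad's cleanup flag are assumptions of this illustration rather than
something the proposal specifies.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of a landingpad's clause list (filters omitted).
struct Clauses {
  bool cleanup;
  std::vector<std::string> catches;  // type-info symbol names, in order
};

// The inlined (inner) pad keeps its own clauses first and gains the catch
// types of the invoke's (outer) landing pad.
Clauses mergeForInlining(const Clauses &inner, const Clauses &outer) {
  Clauses out = inner;
  // Assumption: a cleanup on either side keeps the merged pad a cleanup.
  out.cleanup = inner.cleanup || outer.cleanup;
  for (const std::string &ty : outer.catches) {
    // Assumption: a type already caught by the inner pad is not repeated.
    if (std::find(out.catches.begin(), out.catches.end(), ty) ==
        out.catches.end())
      out.catches.push_back(ty);
  }
  return out;
}
```

For the example above, the pad for qux() in bar() starts as a cleanup with no
catches and ends up as a cleanup that also catches '_ZTIi'.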

//===----------------------------------------------------------------------===//
// Future Landing Pad Instruction Optimizations
//===----------------------------------------------------------------------===//

In the future, the landing pad instruction could be modified to aid
optimizations. E.g., if the personality function can support it, a landingpad
instruction may indicate that the personality function should call the
'terminate' function:

  %res = landingpad <somety> personality <ty> <pers_fn>
           terminate

This landingpad instruction would be followed by an 'unreachable'
instruction. The exception handling table would be set up to have the
personality function call the appropriate 'terminate' function.

Support for such features would be done on a case-by-case basis.

//===----------------------------------------------------------------------===//
// Examples
//===----------------------------------------------------------------------===//

1) A simple example:

#include <cstdio>

void bar();

void foo() throw (const char *) {
  try {
    bar();
  } catch (int i) {
    printf("caught integer %d\n", i);
  } catch (double d) {
    printf("caught double %g\n", d);
  }
}

Produces:

@_ZTIPKc = external constant i8*
@_ZTIi = external constant i8*
@_ZTId = external constant i8*
@.str = private unnamed_addr constant [19 x i8] c"caught integer %d\0A\00", align 1
@.str1 = private unnamed_addr constant [18 x i8] c"caught double %g\0A\00", align 1

define i32 @_Z3foov() uwtable optsize ssp {
entry:
  invoke void @_Z3barv() optsize
          to label %try.cont unwind label %lpad

invoke.cont7:
  %tmp0 = tail call i8* @__cxa_begin_catch(i8* %exn) nounwind
  %tmp1 = bitcast i8* %tmp0 to i32*
  %exn.scalar = load i32* %tmp1, align 4
  %call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([19 x i8]* @.str, i64 0, i64 0),
                                            i32 %exn.scalar) optsize
  tail call void @__cxa_end_catch() nounwind
  br label %try.cont

invoke.cont20:
  %tmp2 = tail call i8* @__cxa_begin_catch(i8* %exn) nounwind
  %tmp3 = bitcast i8* %tmp2 to double*
  %exn.scalar11 = load double* %tmp3, align 8
  %call21 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([18 x i8]* @.str1, i64 0, i64 0),
                                              double %exn.scalar11) optsize
  tail call void @__cxa_end_catch() nounwind
  br label %try.cont

try.cont:
  ret i32 undef

lpad:
  %exn.val = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               catch i8** @_ZTIi, i8** @_ZTId
               filter i8** @_ZTIPKc
  %exn = extractvalue { i8*, i32 } %exn.val, 0
  %sel = extractvalue { i8*, i32 } %exn.val, 1
  %tmp4 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) nounwind
  %tmp5 = icmp eq i32 %sel, %tmp4
  br i1 %tmp5, label %invoke.cont7, label %eh.next

eh.next:
  %tmp6 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTId to i8*)) nounwind
  %tmp7 = icmp eq i32 %sel, %tmp6
  br i1 %tmp7, label %invoke.cont20, label %eh.next1

eh.next1:
  %ehspec.fails = icmp slt i32 %sel, 0
  br i1 %ehspec.fails, label %ehspec.unexpected, label %eh.resume

eh.resume:
  resume { i8*, i32 } %exn.val

ehspec.unexpected:
  tail call void @__cxa_call_unexpected(i8* %exn) noreturn
  unreachable
}
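
The control flow after the 'landingpad' in the IR above boils down to a small
decision procedure on the selector value. The C++ sketch below restates it.
'Action' and 'dispatch' are hypothetical names, and the integer ids stand in
for whatever llvm.eh.typeid.for returns for '_ZTIi' and '_ZTId'.

```cpp
// Which block the landing pad code above ends up branching to.
enum class Action { CatchInt, CatchDouble, Unexpected, Resume };

// sel:      selector value extracted from the landingpad result
// intId:    stand-in for llvm.eh.typeid.for(@_ZTIi)
// doubleId: stand-in for llvm.eh.typeid.for(@_ZTId)
Action dispatch(int sel, int intId, int doubleId) {
  if (sel == intId)
    return Action::CatchInt;     // invoke.cont7
  if (sel == doubleId)
    return Action::CatchDouble;  // invoke.cont20
  if (sel < 0)
    return Action::Unexpected;   // ehspec.unexpected: the filter failed
  return Action::Resume;         // eh.resume: keep unwinding
}
```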

2) An example of inlining:

void qux();

void bar() __attribute__((always_inline));
void bar() {
  try {
    qux();
  } catch (char c) {
    printf("caught char %c\n", c);
  }
}

void foo() throw (const char *) {
  try {
    bar();
  } catch (int i) {
    printf("caught integer %d\n", i);
  } catch (double d) {
    printf("caught double %g\n", d);
  }
}

Produces (see comments inline):

@_ZTIc = external constant i8*
@.str = private unnamed_addr constant [16 x i8] c"caught char %c\0A\00", align 1
@_ZTIPKc = external constant i8*
@_ZTIi = external constant i8*
@_ZTId = external constant i8*
@.str1 = private unnamed_addr constant [19 x i8] c"caught integer %d\0A\00", align 1
@.str2 = private unnamed_addr constant [18 x i8] c"caught double %g\0A\00", align 1

define void @_Z3barv() uwtable optsize alwaysinline ssp {
entry:
  invoke void @_Z3quxv() optsize
          to label %try.cont unwind label %lpad

try.cont:
  ret void

lpad:
  %exn.val = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               catch i8** @_ZTIc
  %exn = extractvalue { i8*, i32 } %exn.val, 0
  %sel = extractvalue { i8*, i32 } %exn.val, 1
  %tmp2 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIc to i8*)) nounwind
  %tmp3 = icmp eq i32 %sel, %tmp2
  br i1 %tmp3, label %invoke.cont4, label %eh.resume

invoke.cont4:
  %tmp0 = tail call i8* @__cxa_begin_catch(i8* %exn) nounwind
  %exn.scalar = load i8* %tmp0, align 1
  %conv = sext i8 %exn.scalar to i32
  %call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([16 x i8]* @.str, i64 0, i64 0), i32 %conv) optsize
  tail call void @__cxa_end_catch() nounwind
  br label %try.cont

eh.resume:
  resume { i8*, i32 } %exn.val
}

define i32 @_Z3foov() uwtable optsize ssp {
entry:
  invoke void @_Z3quxv() optsize
          to label %try.cont.bar unwind label %lpad.bar

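try.cont.bar:
  ;; bar's 'ret' was replaced by a branch to the normal destination of the
  ;; original invoke of bar.
  br label %try.cont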

lpad.bar:
  ;; bar's landing pad is updated to indicate that it can catch an int or double
  ;; exception.
  %exn.val.bar = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               catch i8** @_ZTIc, i8** @_ZTIi, i8** @_ZTId
  %exn.bar = extractvalue { i8*, i32 } %exn.val.bar, 0
  %sel.bar = extractvalue { i8*, i32 } %exn.val.bar, 1
  %tmp2.bar = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIc to i8*)) nounwind
  %tmp3.bar = icmp eq i32 %sel.bar, %tmp2.bar
  br i1 %tmp3.bar, label %invoke.cont4.bar, label %eh.resume.bar

invoke.cont4.bar:
  %tmp0.bar = tail call i8* @__cxa_begin_catch(i8* %exn.bar) nounwind
  %exn.scalar.bar = load i8* %tmp0.bar, align 1
  %conv.bar = sext i8 %exn.scalar.bar to i32
  %call.bar = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([16 x i8]* @.str, i64 0, i64 0), i32 %conv.bar) optsize
  tail call void @__cxa_end_catch() nounwind
  br label %try.cont.bar

eh.resume.bar:
  ;; bar's 'resume' instruction was replaced by an unconditional branch to
  ;; foo's landing pad.
  br label %lpad.split

invoke.cont7:
  %tmp0 = tail call i8* @__cxa_begin_catch(i8* %exn) nounwind
  %tmp1 = bitcast i8* %tmp0 to i32*
  %exn.scalar = load i32* %tmp1, align 4
  %call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([19 x i8]* @.str, i64 0, i64 0),
                                            i32 %exn.scalar) optsize
  tail call void @__cxa_end_catch() nounwind
  br label %try.cont

invoke.cont20:
  %tmp2 = tail call i8* @__cxa_begin_catch(i8* %exn) nounwind
  %tmp3 = bitcast i8* %tmp2 to double*
  %exn.scalar11 = load double* %tmp3, align 8
  %call21 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([18 x i8]* @.str1, i64 0, i64 0),
                                              double %exn.scalar11) optsize
  tail call void @__cxa_end_catch() nounwind
  br label %try.cont

try.cont:
  ret i32 undef

lpad:
  %exn.val = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               catch i8** @_ZTIi, i8** @_ZTId
               filter i8** @_ZTIPKc
  br label %lpad.split

lpad.split:
  ;; foo's landing pad is split after the 'landingpad' instruction so that bar's
  ;; 'landingpad' can continue processing the exception if it wasn't handled in
  ;; bar.
  %exn.val.phi = phi { i8*, i32 } [ %exn.val, %lpad ], [ %exn.val.bar, %lpad.bar ]
  %exn = extractvalue { i8*, i32 } %exn.val.phi, 0
  %sel = extractvalue { i8*, i32 } %exn.val.phi, 1
  %tmp4 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) nounwind
  %tmp5 = icmp eq i32 %sel, %tmp4
  br i1 %tmp5, label %invoke.cont7, label %eh.next

eh.next:
  %tmp6 = tail call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTId to i8*)) nounwind
  %tmp7 = icmp eq i32 %sel, %tmp6
  br i1 %tmp7, label %invoke.cont20, label %eh.next1

eh.next1:
  %ehspec.fails = icmp slt i32 %sel, 0
  br i1 %ehspec.fails, label %ehspec.unexpected, label %eh.resume

eh.resume:
  resume { i8*, i32 } %exn.val.phi

ehspec.unexpected:
  tail call void @__cxa_call_unexpected(i8* %exn) noreturn
  unreachable
}

Could we add:

  • A landing pad block is not the destination of any other kind of terminator. Only unwind edges are allowed.

  • The landingpad instruction must only appear at the top of a landing pad. It cannot appear in any other block, or following non-phi instructions.

Why won’t SplitCriticalEdge work for landing pads? Does it require more than splitting the landing pad after the landingpad instruction, and duplicating the top half? Alternatively, could we get a SplitLandingPad function?

Will it be possible to split landing pads during codegen?

/jakob

Could we add:

  • A landing pad block is not the destination of any other kind of terminator. Only unwind edges are allowed.

How do we lower SjLj exceptions as an IR → IR pass then?

  • The landingpad instruction must only appear at the top of a landing pad. It cannot appear in any other block, or following non-phi instructions.

How does this differ from 2&3 above?

Cameron

Could we add:

  • A landing pad block is not the destination of any other kind of terminator. Only unwind edges are allowed.

How do we lower SjLj exceptions as an IR → IR pass then?

I don’t know. What does the landingpad instruction return when you branch to a landing pad?

A landing pad must follow some ABI convention, it represents the other return value of an invoke instruction.

SjLj is weird. How do we pass values along unwind edges today? Don’t they need to dominate the setjmp call?

  • The landingpad instruction must only appear at the top of a landing pad. It cannot appear in any other block, or following non-phi instructions.

How does this differ from 2&3 above?

It probably doesn’t, it wasn’t completely clear to me.

/jakob

We pass them through memory:

// If we decided we need a spill, do it.
// FIXME: Spilling this way is overkill, as it forces all uses of
// the value to be reloaded from the stack slot, even those that aren’t
// in the unwind blocks. We should be more selective.
if (NeedsSpill) {
++NumSpilled;
DemoteRegToStack(*Inst, true);
}

So after SjLjEHPrepare, the invokes should probably be turned into calls and the unwind edges removed.

The unwind edges don’t represent control flow anymore, and they are not needed for dominance. They are only used for:

// We still want this to look like an invoke so we emit the LSDA properly,
// so we don’t transform the invoke into a call here.

It is possible we could feed all of them to a dummy landing pad that then branches to setjmp’s second return?

/jakob

Could we add:

- A landing pad block is not the destination of any other kind of terminator. Only unwind edges are allowed.

- The landingpad instruction must only appear at the top of a landing pad. It cannot appear in any other block, or following non-phi instructions.

Most of this is covered by 2&3. But it would be good to explicitly state that a landingpad instruction can appear only in a landing pad block.

Why won't SplitCriticalEdge work for landing pads? Does it require more than splitting the landing pad after the landingpad instruction, and duplicating the top half? Alternatively, could we get a SplitLandingPad function?

Splitting a critical edge, especially in this case, isn't necessary for correctness. It's an optimization. In general, the use of a value in the catch blocks will access that value via memory. It also complicates the SplitCriticalEdge function, like you outlined.

Will it be possible to split landing pads during codegen?

Split a landing pad or split the critical edges to a landing pad? If the former, then yes. You can also split a landing pad in LLVM IR. You just can't split the landing pad block before the landingpad instruction.

-bw

The personality function sets the frame pointer (r7) and stores onto the stack a value identifying which call raised the exception (the program stores each call's value onto the stack before the call is made...making it very efficient, of course). All of that is done away from the landing pad blocks. More precisely, the personality function reenters the function after the setjmp (as you would expect). It then reads the value the personality function stored and jumps to the correct place via a jump table. We currently keep the llvm.eh.selector calls around so that CodeGen can gather the exception handling information from them. But otherwise, it doesn't look like they're used. It's all pretty gross.

-bw

Could we add:

- A landing pad block is not the destination of any other kind of terminator. Only unwind edges are allowed.

- The landingpad instruction must only appear at the top of a landing pad. It cannot appear in any other block, or following non-phi instructions.

Most of this is covered by 2&3. But it would be good to explicitly state that a landingpad instruction can appear only in a landing pad block.

Right. I think we agree on the intention. I wanted to make it completely clear that:

- All unwind edges go to a block with a landingpad instruction.
- All non-unwind edges go to a block without a landingpad instruction.

That then implies that unwind edges can't be split and landingpad instructions can't be moved.
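
The two edge rules above amount to a single check per CFG edge. A minimal C++
sketch follows, with 'Edge' and 'edgesRespectLandingPads' as hypothetical
names and blocks identified by index.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CFG edge: destination block index plus the kind of edge.
struct Edge {
  std::size_t dest;
  bool isUnwind;  // true for the unwind edge of an invoke
};

// startsWithLandingPad[b] says whether block b's first non-PHI instruction
// is a landingpad. An edge is valid exactly when "is an unwind edge" and
// "destination starts with a landingpad" agree.
bool edgesRespectLandingPads(const std::vector<Edge> &edges,
                             const std::vector<bool> &startsWithLandingPad) {
  for (const Edge &e : edges) {
    if (e.isUnwind != startsWithLandingPad[e.dest])
      return false;
  }
  return true;
}
```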

Why won't SplitCriticalEdge work for landing pads? Does it require more than splitting the landing pad after the landingpad instruction, and duplicating the top half? Alternatively, could we get a SplitLandingPad function?

Splitting a critical edge, especially in this case, isn't necessary for correctness. It's an optimization.

Yes. You scared me with 'requires considerable care'. Does that mean anything other than 'you have to duplicate the landing pad instead of splitting the unwind edge'? Is special magic required to duplicate a landingpad instruction?

In general, the use of a value in the catch blocks will access that value via memory.

Because the register allocator will spill, or because mem2reg fails? Won't there be a lot of IR values live across an unwind edge after mem2reg?

It also complicates the SplitCriticalEdge function, like you outlined.

Yes, I agree. Duplicating a landing pad is different than splitting a critical edge.

Will it be possible to split landing pads during codegen?

Split a landing pad or split the critical edges to a landing pad?

Sorry, I meant duplicating a landing pad.

Landing pads notoriously collect critical edges. We should make sure there is some way of dealing with that, unlike indirectbr edges. The possibility of splitting and duplicating landing pads would help.

/jakob

Hi Bill,

Thanks for working on this.

Is there a reference for the function attribute uwtable, or is it to be defined as
part of this effort?

Thanks in advance

Garrison

It already exists; there's some limited documentation in the LLVM
source, but Rafael apparently forgot to add it to LangRef...

-Eli

[...]

//===--------------------------
// The 'landingpad' Instruction
//

The 'landingpad' instruction replaces the current 'llvm.eh.exception' and
'llvm.eh.selector' intrinsics.

// Syntax:

%res = landingpad <somety> personality <ty> <pers_fn> <clause>+

where

<clause> :=
     cleanup
  |  catch <ty_1>, <ty_2>, ..., <ty_n>
  |  filter <ty_1>, <ty_2>, ..., <ty_m>

terminate ? You have an example referencing it, but it isn't in the grammar.

-Eli

Sorry for the confusion. The terminate is addressed later and I mention it as a potential optimization that could be added at a future time. But not for the initial rewrite. If it's in one of the examples, then please ignore it. I'll come up with a few other examples.

-bw

Hi Eli

So I found this in Attributes.h:

const Attributes UWTable = 1<<30; ///< Function must be in a unwind
                                          ///table

What does this mean? In particular what does it mean not to add this as
a function attribute to a function? I'm obviously going down the wrong road in
my interpretation, as I currently have functions that unwind from, through, and to
without using this attribute. Does this have meaning for certain platforms and
thus must always be used "just in case"?

Thanks in advance

Garrison

It seems to be used by the back-ends to generate the correct prolog information (on X86) and to make sure that the exception tables are generated (on ARM). Possibly for others. If you're not using it, you may want to verify that you have the correct EH frame information. You can do this on Darwin with the "dwarfdump --eh-frame ./a.out" command.

-bw

Could we add:

- A landing pad block is not the destination of any other kind of terminator. Only unwind edges are allowed.

- The landingpad instruction must only appear at the top of a landing pad. It cannot appear in any other block, or following non-phi instructions.

Most of this is covered by 2&3. But it would be good to explicitly state that a landingpad instruction can appear only in a landing pad block.

Right. I think we agree on the intention. I wanted to make it completely clear that:

- All unwind edges go to a block with a landingpad instruction.
- All non-unwind edges go to a block without a landingpad instruction.

That then implies that unwind edges can't be split and landingpad instructions can't be moved.

Why won't SplitCriticalEdge work for landing pads? Does it require more than splitting the landing pad after the landingpad instruction, and duplicating the top half? Alternatively, could we get a SplitLandingPad function?

Splitting a critical edge, especially in this case, isn't necessary for correctness. It's an optimization.

Yes. You scared me with 'requires considerable care'. Does that mean anything other than 'you have to duplicate the landing pad instead of splitting the unwind edge'? Is special magic required to duplicate a landingpad instruction?

There shouldn't be any special magic involved. As you pointed out, we'd have to duplicate the landingpad instruction into each of the critical edge blocks.

In general, the use of a value in the catch blocks will access that value via memory.

Because the register allocator will spill, or because mem2reg fails? Won't there be a lot of IR values live across an unwind edge after mem2reg?

The register allocator should be spilling most variables across the unwind edge. The only ones which can stick around are those that are placed in non-volatile registers. This is all opaque to the IR, which doesn't know that the unwind edge is special. This is why we can have PHI nodes in the landing pad. I'm assuming that it gets it "right" because it's keeping values alive across the invoke.

It also complicates the SplitCriticalEdge function, like you outlined.

Yes, I agree. Duplicating a landing pad is different than splitting a critical edge.

Will it be possible to split landing pads during codegen?

Split a landing pad or split the critical edges to a landing pad?

Sorry, I meant duplicating a landing pad.

Landing pads notoriously collect critical edges. We should make sure there is some way of dealing with that, unlike indirectbr edges. The possibility of splitting and duplicating landing pads would help.

One way to get around this is to do what Andy suggested to me at one point (not on email). We could have one landing pad block per invoke instruction. So no two invokes could ever share a landing pad (and of course the landingpad instruction). (My apologies to Andy if I misrepresented his idea.) I think of this as overkill, but if you feel strongly that not being able to break critical edges is going to hamper code-gen significantly we can discuss this.

-bw

Yes. You scared me with 'requires considerable care'. Does that mean anything other than 'you have to duplicate the landing pad instead of splitting the unwind edge'? Is special magic required to duplicate a landingpad instruction?

There shouldn't be any special magic involved. As you pointed out, we'd have to duplicate the landingpad instruction into each of the critical edge blocks.

That sounds good to me. It's not necessary that SplitCriticalEdge can do it, I just want to be able to duplicate a landing pad without knowing the details of any particular personality function.

In general, the use of a value in the catch blocks will access that value via memory.

Because the register allocator will spill, or because mem2reg fails? Won't there be a lot of IR values live across an unwind edge after mem2reg?

The register allocator should be spilling most variables across the unwind edge. The only ones which can stick around are those that are placed in non-volatile registers. This is all opaque to the IR, which doesn't know that the unwind edge is special. This is why we can have PHI nodes in the landing pad. I'm assuming that it gets it "right" because it's keeping values alive across the invoke.

Yep, this should all just work. Unwind edges are just like any other critical edge, and the unwinder preserves the non-volatiles.

There is an optimization issue because once you stick a value in a non-volatile register across an unwind edge, that value must be in the same register before every call that shares the landing pad. That limits what live range splitting can do.

But this is really the register allocator's problem. I just want to make sure that it will be possible to duplicate those landing pads when they cause problems.

It also complicates the SplitCriticalEdge function, like you outlined.

Yes, I agree. Duplicating a landing pad is different than splitting a critical edge.

Will it be possible to split landing pads during codegen?

Split a landing pad or split the critical edges to a landing pad?

Sorry, I meant duplicating a landing pad.

Landing pads notoriously collect critical edges. We should make sure there is some way of dealing with that, unlike indirectbr edges. The possibility of splitting and duplicating landing pads would help.

One way to get around this is to do what Andy suggested to me at one point (not on email). We could have one landing pad block per invoke instruction. So no two invokes could ever share a landing pad (and of course the landingpad instruction). (My apologies to Andy if I misrepresented his idea.) I think of this as overkill, but if you feel strongly that not being able to break critical edges is going to hamper code-gen significantly we can discuss this.

It would certainly make many things easier, but I don't think we properly understand the code size impact of doing that. Low-frequency blocks without critical edges are like catnip to the splitter. It will put all the spill code in there, and we may not be able to merge those landing pads again.

I would like to keep the option, but I don't think we should aggressively duplicate landing pads.

I want to be able to split and duplicate landing pads on demand in IR and in MI. That is just as good as splitting critical edges. It sounds like your proposal allows this. We can discuss the MI representation when the time comes.

/jakob

It should be possible to manually break the critical edges, split the landing pad, dup the landingpad instruction, etc. So yes, this is a happy medium. Sorry for the initial confusion. :-)

-bw
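
A rough C++ sketch of what "split the landing pad, dup the landingpad
instruction" could look like on a toy model. 'Pad', 'Fn', and
'duplicateSharedPads' are hypothetical names. The model assumes each landing
pad has already been split after its landingpad instruction, so that a pad is
just the clause list plus a branch to a shared tail block.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: the top half of an already-split landing pad.
struct Pad {
  std::string clauses;  // printed clause list of the landingpad instruction
  int tail;             // id of the block holding the rest of the handler
};

struct Fn {
  std::vector<Pad> pads;                // landing pad blocks
  std::vector<std::size_t> unwindDest;  // per invoke: index into pads
};

// Give every invoke its own copy of the pad it unwinds to. The first invoke
// to use a pad keeps the original; later ones get a duplicate with the same
// clauses that branches to the same tail, where a phi would merge the
// landingpad results.
void duplicateSharedPads(Fn &fn) {
  std::vector<bool> claimed(fn.pads.size(), false);
  for (std::size_t &dest : fn.unwindDest) {
    if (!claimed[dest]) {
      claimed[dest] = true;
      continue;
    }
    Pad copy = fn.pads[dest];  // copy before growing the vector
    fn.pads.push_back(copy);
    dest = fn.pads.size() - 1;
  }
}
```

Nothing in the duplication depends on what the clauses mean, which matches the
goal of not needing to know the details of any particular personality function.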

It already exists; there's some limited documentation in the LLVM
source, but Rafael apparently forgot to add it to LangRef...

Sorry about that. I will patch it.

-Eli

Cheers,
Rafael

   %exn.val = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
                catch i8** @_ZTIi, i8** @_ZTId
                filter i8** @_ZTIPKc
   br label %lpad.split

What are the semantics of filter? Is it undefined behavior if an exception not matching ZTIi, ZTId or ZTIPKc passes by?

Cheers,
Rafael

Thanks Rafael

Garrison