RFC: Exception Handling Rewrite

Chris,
it is goodness that the LandingpadInst will be pinned to the beginning
of a BasicBlock, except for the possibility of PHINode instructions that must
come even earlier.

I can’t exactly put my finger on what’s going to go wrong with this,
but it sure smells fishy…

my current understanding is that the LandingpadInst will “define” some hard
registers which will be used by following code to switch to the corresponding
catch-clause.

the lifetimes of these hard registers ostensibly start at the LandingpadInst,
but for purposes of PHI lowering and Register Allocation they must actually
start at the beginning of the BasicBlock – since that is where control flow will
return to from the _Unwind_RaiseException / __gcc_personality_v0 calls,
and it is the Unwind and personality functions that physically set those
hard registers, not the “LandingpadInst”.

Somehow PHI lowering and register allocation need to be prohibited from
using those hard registers for spill code at the beginning of a “landing pad block”,
but I don’t see how that will “fall out” of the current design.

-Peter Lawrence.

Hi Peter,

Thanks for pointing this out. Some of us who are concerned with codegen have discussed the problem. Although the details aren’t decided, you can be sure that at the MachineInstr level we won’t have a LandingpadInst to model liveness of exception values. Any physical registers set by the personality function will be considered live immediately after the call on the unwind path.

-Andy

Andrew,
yes, my brain-bad, soon after I hit the Send button I realized it
is the InvokeInst that starts the lifetime of those hard-registers, not
the LandingpadInst, but you beat me to the reply.

-Peter Lawrence.

Right.

My intention here is to mark the CALL instruction as a terminator:

BB1:
  %R0 = COPY %vreg0; 1st arg
  %R1 = COPY %vreg1; 2nd arg
  CALL foo
Successors: BB2, BB3

BB2:
Live-in: %R0
  %vreg2 = COPY %R0; Return value
  BR BB7

BB3: LANDING PAD
Live-in: %R0, %R1
  %vreg10 = COPY %R0; Ex addr
  %vreg11 = COPY %R1; EH selector

BB2 must be a layout successor of BB1, but we can already handle that.

This code layout also clearly marks the call that may unwind: It is the terminator with an edge to a landing pad. That means we can get rid of the EH_LABELS we currently use to mark the call.

/jakob

Yep, the proposal only covers the IR representation. Things are much different at the MachineInstr level, and I have much less of an opinion on how that is modeled.

-Chris

Guys,
on second thought…

doesn’t making the exception registers live from the InvokeInst to the LandingpadInst
create problems for critical-edge-splitting ?

if a landingpad-edge is critical and needs to be split, won’t we be creating and inserting
a new BB between the “invoke-block” and the “landingpad-block”, and if we do then
isn’t there the possibility of the register allocator spilling the contents of the exception
registers from within the newly created block — but this block won’t ever get executed
because the Unwind / personality functions will cause control flow to go directly
to the block with the LandingpadInst ? If you really want to split a landingpad-edge
won’t you have to move the LandingpadInst up into the new block ?

if this is true (and I seem to be making a lot of logic errors lately, so maybe reread and
proof-read the above a few times!) then don’t we need to add another invariant to
Bill’s list:

*) there can be no code between an InvokeInst and its LandingpadInst other than
possibly PHINodes at the beginning of the landingpad-block.

It seems that if you really do need to split a landingpad-edge then you have to move
the LandingpadInst up into (the beginning of) the new block.

However it seems that if a landingpad-block has multiple predecessors (often the case,
multiple InvokeInst in the main body of a try-statement all go to the same landingpad-
block), then you cannot move the LandingpadInst in order to break a critical edge unless
you do it for all landingpad-block predecessor edges simultaneously, but that seems
to be a messy conclusion (being forced to split other edges that don’t need to be split).

my first guess is that all the nuances of whether it ever makes sense and/or is even
logically possible to split a critical landingpad-edge won’t be discovered except by
painful trial-and-error, and that it might be best to at first disallow it until proven doable
by someone working in an isolated branch – although proving it works may be difficult,
since so little code actually uses exceptions (only TableGen in llvm ?).

-Peter Lawrence.

Peter,

I think this will be done lazily to avoid excessive splitting as in:

Call1 → LP
Call2 → LP
Call3 → LP

=> split Call1

Call1 → LP-split → LP-remainder
Call2 → LP-split-merge → LP-remainder
Call3 → LP-split-merge → LP-remainder

But John will know best.

-Andy

> Guys,
> on second thought…
>
> doesn’t making the exception registers live from the InvokeInst to the LandingpadInst
> create problems for critical-edge-splitting ?
>
> if a landingpad-edge is critical and needs to be split, won’t we be creating and inserting
> a new BB between the “invoke-block” and the “landingpad-block”, and if we do then
> isn’t there the possibility of the register allocator spilling the contents of the exception
> registers from within the newly created block -- but this block won’t ever get executed
> because the Unwind / personality functions will cause control flow to go directly
> to the block with the LandingpadInst ? If you really want to split a landingpad-edge
> won’t you have to move the LandingpadInst up into the new block ?

You cannot split critical edges to landing pads, but you can duplicate landing pads and distribute the predecessors any way you like.

The requirement is that all unwind edges go to landing pads and all non-unwind edges go to non-landing pads. That will also be required in the code generator.
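
The edge requirement above can be sketched as a small checker. This is only an illustration on a toy CFG, not LLVM code: the `Block` class, its field names, and `edge_violations` are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy basic block (hypothetical model, not an LLVM class)."""
    name: str
    is_landing_pad: bool = False
    normal_succs: list = field(default_factory=list)  # ordinary control-flow edges
    unwind_succs: list = field(default_factory=list)  # edges taken when a call unwinds


def edge_violations(blocks):
    """Return (src, dst, reason) for every edge that breaks the rule."""
    by_name = {b.name: b for b in blocks}
    bad = []
    for b in blocks:
        for dst in b.unwind_succs:
            if not by_name[dst].is_landing_pad:
                bad.append((b.name, dst, "unwind edge to non-landing pad"))
        for dst in b.normal_succs:
            if by_name[dst].is_landing_pad:
                bad.append((b.name, dst, "normal edge to landing pad"))
    return bad


# A well-formed invoke: normal edge to "cont", unwind edge to "lpad".
good = [
    Block("invoke", normal_succs=["cont"], unwind_succs=["lpad"]),
    Block("cont"),
    Block("lpad", is_landing_pad=True),
]

# A naive split of the unwind edge: the new block is not a landing pad,
# so both the edge into it and the edge out of it break the rule.
naive_split = [
    Block("invoke", normal_succs=["cont"], unwind_succs=["split"]),
    Block("cont"),
    Block("split", normal_succs=["lpad"]),
    Block("lpad", is_landing_pad=True),
]

print(edge_violations(good))
print(edge_violations(naive_split))
```

The second CFG shows why inserting an ordinary block on an unwind edge is not allowed in this model: it produces one violation on each side of the inserted block.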

> if this is true (and I seem to be making a lot of logic errors lately, so maybe reread and
> proof-read the above a few times!) then don’t we need to add another invariant to
> Bill’s list:
>
> *) there can be no code between an InvokeInst and its LandingpadInst other than
> possibly PHINodes at the beginning of the landingpad-block.

I think this covers it:

  • A landing pad block must have a ‘landingpad’ instruction as its first non-PHI instruction.
  • There can be only one ‘landingpad’ instruction within the landing pad block.
  • A basic block that is not a landing pad block may not include a ‘landingpad’ instruction.
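
As a rough illustration of these three rules, the sketch below checks a single block given only its opcode names in order. The function `landingpad_violations` and the list-of-strings representation are assumptions made for the example, not how the LLVM verifier is implemented.

```python
def landingpad_violations(opcodes, is_landing_pad):
    """Check one block against the three rules.

    opcodes: opcode names of the block's instructions, in order (toy model).
    is_landing_pad: True if the block is the unwind destination of an invoke.
    """
    problems = []
    count = opcodes.count("landingpad")

    if not is_landing_pad:
        # Rule 3: no landingpad outside a landing pad block.
        if count:
            problems.append("landingpad in a non-landing-pad block")
        return problems

    # Rule 1: the first instruction that is not a phi must be the landingpad.
    non_phi = [op for op in opcodes if op != "phi"]
    if not non_phi or non_phi[0] != "landingpad":
        problems.append("first non-PHI instruction is not landingpad")

    # Rule 2: at most one landingpad per landing pad block.
    if count > 1:
        problems.append("more than one landingpad")
    return problems


print(landingpad_violations(["phi", "phi", "landingpad", "br"], True))
print(landingpad_violations(["call", "landingpad", "br"], True))
print(landingpad_violations(["landingpad", "br"], False))
```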

/jakob

> However it seems that if a landingpad-block has multiple predecessors (often the case,
> multiple InvokeInst in the main body of a try-statement all go to the same landingpad-
> block), then you cannot move the LandingpadInst in order to break a critical edge unless
> you do it for all landingpad-block predecessor edges simultaneously, but that seems
> to be a messy conclusion (being forced to split other edges that don’t need to be split).

Yes, this is why we’re not going to have SplitCriticalEdge
succeed on landing pad edges, at least by default. It’s not conceptually
impossible, but it is significantly more invasive than a normal
edge-splitting, and pretty much every client would need to be
updated to handle it.
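
A minimal sketch of that default behaviour, assuming a toy CFG stored as a dict of successor lists. The Python function `split_critical_edge` below is a hypothetical helper written for illustration; it is not LLVM's SplitCriticalEdge and does not mirror its signature.

```python
def is_critical(succs, preds, src, dst):
    """An edge is critical when src has several successors
    and dst has several predecessors."""
    return len(succs[src]) > 1 and len(preds.get(dst, [])) > 1


def split_critical_edge(succs, landing_pads, src, dst):
    """Insert a new block on the edge src -> dst.

    Returns the new block's name, or None when the edge is not critical
    or when dst is a landing pad (the split is refused in that case).
    """
    preds = {}
    for s, dsts in succs.items():
        for d in dsts:
            preds.setdefault(d, []).append(s)

    if not is_critical(succs, preds, src, dst):
        return None
    if dst in landing_pads:
        return None  # refuse: an ordinary block cannot sit on an unwind edge

    new = f"{src}.{dst}.split"
    succs[src] = [new if d == dst else d for d in succs[src]]
    succs[new] = [dst]
    return new


# Two invokes sharing one landing pad: both unwind edges are critical.
eh_cfg = {"call1": ["cont1", "lp"], "cont1": ["call2"],
          "call2": ["cont2", "lp"], "cont2": [], "lp": []}
print(split_critical_edge(eh_cfg, {"lp"}, "call1", "lp"))

# An ordinary critical edge a -> c is split as usual.
plain_cfg = {"a": ["b", "c"], "d": ["c"], "b": [], "c": []}
print(split_critical_edge(plain_cfg, set(), "a", "c"))
```

Refusing by returning `None` keeps existing callers working unchanged in this toy model; they simply see the edge as unsplittable.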

> I think this will be done lazily to avoid excessive splitting as in:
>
> Call1 → LP
> Call2 → LP
> Call3 → LP
>
> => split Call1
>
> Call1 → LP-split → LP-remainder
> Call2 → LP-split-merge → LP-remainder
> Call3 → LP-split-merge → LP-remainder

Yes, this is precisely the transformation required for splitting a critical
edge to a landing pad.

If LP has n predecessors and k phis, such that it looks like this:

LP:
%phi_i = phi [ %value_{i,1}, %call_1 ] … [ %value_{i,n}, %call_n ]
%lpad = landingpad LP-args
LP-instructions…

then the three new blocks look like this:

LP-split:
%lpad-split = landingpad LP-args
br label %LP-remainder

LP-split-merge:
%phi-split-merge_i = phi [ %value_{i,2}, %call_2 ] … [ %value_{i,n}, %call_n ]
%lpad-split-merge = landingpad LP-args
br label %LP-remainder

LP-remainder:
%phi_i = phi [ %value_{i,1}, %LP-split ], [ %phi-split-merge_i, %LP-split-merge ]
%lpad = phi [ %lpad-split, %LP-split ], [ %lpad-split-merge, %LP-split-merge ]
LP-instructions…

In practice, we would want to recognize that LP has the form of an
LP-split-merge, with associated LP-remainder, and we would just re-use
that as our new LP-split-merge block, adding incoming values to the phis
in the LP-remainder as appropriate.
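
The transformation above can be sketched on a toy data model. Everything here is an assumption made for illustration: blocks are plain dicts, phis map predecessor names to incoming value names, and `split_landing_pad_edge` is a hypothetical function, not an LLVM API. The re-use optimization just described is not modelled.

```python
def split_landing_pad_edge(lp_name, phis, preds, split_pred):
    """Split the edge split_pred -> LP into the three-block form.

    phis:  {phi_name: {pred_name: incoming_value}} for the original LP.
    preds: predecessor names of the original LP, in order.
    Returns {block_name: {"preds", "phis", "landingpad"}}.
    """
    others = [p for p in preds if p != split_pred]
    split = f"{lp_name}-split"
    merge = f"{lp_name}-split-merge"
    rest = f"{lp_name}-remainder"

    blocks = {
        # Only the split predecessor unwinds here, so no phis are needed.
        split: {"preds": [split_pred], "phis": {},
                "landingpad": "%lpad-split"},
        # All other predecessors unwind here; each phi keeps their entries.
        merge: {"preds": others,
                "phis": {f"{name}-split-merge": {p: inc[p] for p in others}
                         for name, inc in phis.items()},
                "landingpad": "%lpad-split-merge"},
        # The original LP body, now reached by ordinary branches.
        rest: {"preds": [split, merge],
               "phis": {name: {split: inc[split_pred],
                               merge: f"{name}-split-merge"}
                        for name, inc in phis.items()},
               "landingpad": None},
    }
    # The old landingpad result becomes a phi over the two new results.
    blocks[rest]["phis"]["%lpad"] = {split: "%lpad-split",
                                     merge: "%lpad-split-merge"}
    return blocks


result = split_landing_pad_edge(
    "LP",
    {"%phi_1": {"call_1": "%a", "call_2": "%b", "call_3": "%c"}},
    ["call_1", "call_2", "call_3"],
    "call_1",
)
for name, block in result.items():
    print(name, block)
```

In this model LP-split has exactly one predecessor, so the edge from call_1 into it is no longer critical, while LP-remainder holds no landingpad and is entered only by ordinary branches.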

John.