RFC: Exception Handling Proposal II

Hi everyone!

I've been quietly working on the exception handling proposal I published last year (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-November/027311.html). However, it had some major deficiencies.

After discussing this further with Jim Grosbach and John McCall, we've come up with another proposal. As you can see, it incorporates the idea of "regions" into LLVM's IR. It also leaves the door open for implementing Chris's EH changes in his LLVMNotes. In fact, this can be looked upon as a first step towards that goal.

Please enjoy and let me know if you have any comments!

-bw

                           Exception Handling in LLVM

Problem

Hi Bill,

First, I like the idea of being more expressive in IR, since the old
exception handling style didn't quite express exception handling, but
alternate paths (resume/unwind). That made DwarfException.cpp a big
hack. :wink:

It also makes the front-end engineer's life a lot easier, since there
is less need to keep a long list of global state for all current
landing pads, clean up areas, terminate handlers, etc.

If I got it right, the dispatch instruction will tell the
instructions/calls to unwind to specific landing pads (cleanup areas,
terminate), but the region number will encode try/catch areas, so that
all those cleanup landing pads should ultimately end up in the catch
area for that region.

If that's so, how do you encode which landing pad is to be
followed per region?

Consider the following code:

try {
  Foo f();
  f.run(); // can throw exception
  Bar b();
  b.run(); // can throw exception
  Baz z();
  z.run(); // can throw exception
} catch (...) {
}

The object 'f' is in a different cleanup area than 'b' which, in turn
is in a different area than 'z'. These three regions should point to
three different landing pads (or different offsets in the same landing
pad), which (I believe) are encoded in IR by being declared after
different dispatch instructions, all of which within the same region.

So, if the dispatch instruction points to a catch area, how do you
differentiate between clean-up areas? If they point to clean-up areas,
how do you specify a catch area per group to go to after cleaning?

My second question is, why did you keep the invoke?

As far as I got it, you have a dispatch instruction inside a basic
block, and whatever throws an exception there should unwind to the
dispatch area (if it matches the filters), in order of appearance,
ending in the "resume unwinding" path if none matches.

If that's so, why do you still have the invoke call? Why should you
treat call-exceptions any differently than instruction-exceptions?

Since this is a major refactoring of the IR (new back-ends won't
understand old IR at all), there is no point in keeping
compatibility...

* The "catchall", "catches", and "filters" clauses are optional. If none are
specified, then the landing pad is implicitly a "cleanup."

* The <resumedest> basic block is the destination to unwind to if the type
thrown isn't matched to any of the choices.

* The "catches" clause is a list of types which the region can catch and the
destinations to jump to for each type.

* The "catchall" clause is the place to jump to if the exception type doesn't
match any of the types in the "catches" clause.

* The "region" value is an integer, similar to the "addrspace" value, and is
unique to each dispatch in a function. IR objects reference this value to
indicate that they belong to the same region.

* The "filters" clause lists the types of exceptions which may be thrown by the
region.
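
So a fully-decorated dispatch would presumably look something like this
(I'm inventing the exact syntax and the type-info names here, just to check
my understanding):

  dispatch region(1) resume %outer.lp
    catches  [ i8* %typeid.int,  label %catch.int,
               i8* %typeid.char, label %catch.char ]
    catchall label %catch.all
    filters  [ i8* %typeid.int, i8* %typeid.char ]

where %outer.lp is the enclosing region's landing pad (or resume code) and
the %typeid.* values stand in for whatever RTTI references get used.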

These are major enhancements over the current model, as the back-end
doesn't have to guess anything. Both C++ and Java have those concepts
explicit in the language and (AFAIK) that was somewhat lost in the
"encoding".

best,
--renato

If I got it right, the dispatch instruction will tell the
instructions/calls to unwind to specific landing pads (cleanup areas,
terminate), but the region number will encode try/catch areas, so that
all those cleanup landing pads should ultimately end up in the catch
area for that region.

Caveat: I'm speaking from what I remember of our discussions, which
is not necessarily what Bill is intending to propose; that said, I'm
pretty confident that the design hasn't significantly changed.

A dispatch instruction is part of a landing pad, not part of the normal
instruction stream. A dispatch is actually 1-1 with a specific landing
pad, and that pair of landing pad + dispatch instruction is basically
all a region is. So the term is a bit misleading because it suggests
that the landing pad is directly associated in the IR with a range of
instructions, whereas in fact the current design is orthogonal to
the question of how you actually reach a landing pad in the first place.

For now, that's still via explicit invokes; the invoke names the region
it unwinds to — Bill has it listing both the region number and the
landing pad block, which I think is redundant but harmless.

In my opinion, the most crucial property of the new design is that
it makes the chaining of regions explicit in the IR. The "resume"
edge from a dispatch instruction always leads to either another
region or to a bit of code which re-enters the unwinder in some
opaque way. When the inliner inlines a call in a protected region
(i.e. an invoke, for now), it just forwards the outermost resume
edges in the inlined function to the innermost region in the calling
function, potentially making the old code unreachable. Frontends
are responsible for emitting regions and associated resume code
for which this preserves semantics.
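
As a rough illustration (the syntax and names here are invented, and I'm
ignoring the region renumbering that inlining would also have to do),
suppose the callee's outermost cleanup resumes into generic rethrow code
and the call site is itself protected in the caller:

  ; in @callee, before inlining: the outermost dispatch resumes into
  ; code that re-enters the unwinder
  callee.lp:
    ; run callee's cleanups ...
    dispatch region 0 [], resume label %callee.rethrow

  ; in @caller, the call site is protected by its own region
    invoke void @callee() to label %cont unwind label %caller.lp region 1
  caller.lp:
    dispatch region 1 [filter i8* null]

  ; after inlining, the inlined dispatch's resume edge is simply forwarded
  ; to the caller's innermost region; %callee.rethrow becomes unreachable
  inlined.lp:
    ; run callee's cleanups ...
    dispatch region 0 [], resume label %caller.lp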

So every landing pad actually has a stack of regions which
CodeGen has to examine to write out the unwind tables, but
it's easy to figure out that stack just by chasing links.

While I'm at it, there's another important property of dispatch —
it's undefined behavior to leave the function between landing
at a landing pad and reaching the dispatch.

If that's so, how do you encode which landing pad is to be
followed per region?

Consider the following code:

try {
Foo f();
f.run(); // can throw exception
Bar b();
b.run(); // can throw exception
Baz z();
z.run(); // can throw exception
} catch (...) {
}

I assume you don't mean these to be function declarations. :slight_smile:

The object 'f' is in a different cleanup area than 'b' which, in turn
is in a different area than 'z'. These three regions should point to
three different landing pads (or different offsets in the same landing
pad), which (I believe) are encoded in IR by being declared after
different dispatch instructions, all of which within the same region.

Nope. Three regions, three landing pads, three dispatch instructions.
(actually four if Foo::Foo() can throw). The Baz-destructing region
chains to the Bar-destructing region which chains to the Foo-destructing
region which chains to the catching region; the first three are
cleanup-only.
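
Ignoring the constructors, and assuming the destructors can't throw, that
chain would look roughly like this (the syntax is still just a sketch):

  entry:
    invoke void @Foo::run(%Foo* %f) to label %bb1 unwind label %lp.f region 0
  bb1:
    invoke void @Bar::run(%Bar* %b) to label %bb2 unwind label %lp.b region 1
  bb2:
    invoke void @Baz::run(%Baz* %z) to label %done unwind label %lp.z region 2
  ...
  lp.z:                      ; cleanup-only region 2: destroy z, then chain out
    call void @Baz::~Baz(%Baz* %z) nounwind
    dispatch region 2 [], resume label %lp.b
  lp.b:                      ; cleanup-only region 1: destroy b, then chain out
    call void @Bar::~Bar(%Bar* %b) nounwind
    dispatch region 1 [], resume label %lp.f
  lp.f:                      ; cleanup-only region 0: destroy f, then chain out
    call void @Foo::~Foo(%Foo* %f) nounwind
    dispatch region 0 [], resume label %lp.catch
  lp.catch:                  ; the catching region for "catch (...)"
    dispatch region 3 catchall label %handler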

If that's so, why do you still have the invoke call? Why should you
treat call-exceptions any differently than instruction-exceptions?

One of my favorite things about this design is that it's totally
independent of what exactly is allowed to throw. I'm really not sure
how best to represent other throwing instructions, except that I'm
pretty confident that we don't want anything as heavyweight as
invoke. There's a pretty broad range of possibilities — we could
make invoke-like instructions for all of them (ick...), or we could
tag individual instructions with regions, or we could mark basic
blocks as unwinding to particular places. But we can wrestle
with that independently of deciding to adopt explicitly-chained
landing pads.

John.

Hi Bill,

First, I like the idea of being more expressive in IR, since the old
exception handling style didn't quite express exception handling, but
alternate paths (resume/unwind). That made DwarfException.cpp a big
hack. :wink:

It also makes the front-end engineer's life a lot easier, since there
is less need to keep a long list of global state for all current
landing pads, clean up areas, terminate handlers, etc.

That's a big goal of this rewrite. :slight_smile:

If I got it right, the dispatch instruction will tell the
instructions/calls to unwind to specific landing pads (cleanup areas,
terminate), but the region number will encode try/catch areas, so that
all those cleanup landing pads should ultimately end up in the catch
area for that region.

Yes, that's correct.

If that's so, how do you encode which landing pad is to be
followed per region?

Consider the following code:

try {
Foo f();
f.run(); // can throw exception
Bar b();
b.run(); // can throw exception
Baz z();
z.run(); // can throw exception
} catch (...) {
}

The object 'f' is in a different cleanup area than 'b' which, in turn
is in a different area than 'z'. These three regions should point to
three different landing pads (or different offsets in the same landing
pad), which (I believe) are encoded in IR by being declared after
different dispatch instructions, all of which within the same region.

So, if the dispatch instruction points to a catch area, how do you
differentiate between clean-up areas? If they point to clean-up areas,
how do you specify a catch area per group to go to after cleaning?

There are two different bits of information in the proposal that address this: the "unwind edge" from the invoke and the region number. It's the "unwind edge" which carries the bit of information that you're asking about. My mental view of this is that the region number correlates to the dispatch instruction (and, therefore, the catch blocks) and the unwind edge correlates to the cleanups. (This isn't exactly correct, but it helps to understand how all of this information will be used to generate the correct tables, etc.)

So the above code would look something like this:

entry:
  invoke @f.run() to %bb unwind to %FooCleanup region(1)
bb:
  invoke @b.run() to %bb2 unwind to %BarCleanup region(1)
bb2:
  invoke @z.run() to %bb3 unwind to %BazCleanup region(1)
...
BazCleanup:
  ; do stuff
  br %BarCleanup
BarCleanup:
  ; do stuff
  br %FooCleanup
FooCleanup:
  ; do stuff
  br %Dispatch
Dispatch:
  dispatch region(1) resume %block
    catches [ ... ]

and so on.

If we remove the invoke instructions, the "unwind to" information then goes onto the basic block itself. E.g.:

bb: unwind to %FooCleanup
call @f1.run()
call @f2.run()
call @f3.run()
call @f4.run()

My second question is, why did you keep the invoke?

As far as I got it, you have a dispatch instruction inside a basic
block, and whatever throws an exception there should unwind to the
dispatch area (if it matches the filters), in order of appearance,
ending in the "resume unwinding" path if none matches.

If that's so, why do you still have the invoke call? Why should you
treat call-exceptions any differently than instruction-exceptions?

Since this is a major refactoring of the IR (new back-ends won't
understand old IR at all), there is no point in keeping
compatibility...

I kept it purely from a practical standpoint: it will be easier (and thus quicker) to do it this way. :slight_smile: Also, it could be done a bit more incrementally than completely removing the invoke. But I'm not opposed to removing the invoke.

* The "catchall", "catches", and "filters" clauses are optional. If none are
specified, then the landing pad is implicitly a "cleanup."

* The <resumedest> basic block is the destination to unwind to if the type
thrown isn't matched to any of the choices.

* The "catches" clause is a list of types which the region can catch and the
destinations to jump to for each type.

* The "catchall" clause is the place to jump to if the exception type doesn't
match any of the types in the "catches" clause.

* The "region" value is an integer, similar to the "addrspace" value, and is
unique to each dispatch in a function. IR objects reference this value to
indicate that they belong to the same region.

* The "filters" clause lists the types of exceptions which may be thrown by the
region.

These are major enhancements over the current model, as the back-end
doesn't have to guess anything. Both C++ and Java have those concepts
explicit in the language and (AFAIK) that was somewhat lost in the
"encoding".

Thanks, Renato! :slight_smile:

-bw

If that's so, how do you encode which landing pad is to be
followed per region?

Consider the following code:

try {
Foo f();
f.run(); // can throw exception
Bar b();
b.run(); // can throw exception
Baz z();
z.run(); // can throw exception
} catch (...) {
}

I assume you don't mean these to be function declarations. :slight_smile:

The object 'f' is in a different cleanup area than 'b' which, in turn
is in a different area than 'z'. These three regions should point to
three different landing pads (or different offsets in the same landing
pad), which (I believe) are encoded in IR by being declared after
different dispatch instructions, all of which within the same region.

Nope. Three regions, three landing pads, three dispatch instructions.
(actually four if Foo::Foo() can throw). The Baz-destructing region
chains to the Bar-destructing region which chains to the Foo-destructing
region which chains to the catching region; the first three are
cleanup-only.

Ah ha! I think I had a different mental model than you did. Or at least I remembered things differently from the discussion. :slight_smile: For me, there is one dispatch per region, which is why I had the region number associated with the invokes as well as the "unwind to" edge coming from them. (See my response to Renato for a description.) I'll think more about your model...

If that's so, why do you still have the invoke call? Why should you
treat call-exceptions any differently than instruction-exceptions?

One of my favorite things about this design is that it's totally
independent of what exactly is allowed to throw. I'm really not sure
how best to represent other throwing instructions, except that I'm
pretty confident that we don't want anything as heavyweight as
invoke. There's a pretty broad range of possibilities — we could
make invoke-like instructions for all of them (ick...), or we could
tag individual instructions with regions, or we could mark basic
blocks as unwinding to particular places. But we can wrestle
with that independently of deciding to adopt explicitly-chained
landing pads.

Exactly! :slight_smile:

-bw

There are two different bits of information in the proposal that address this: the "unwind edge" from the invoke and the region number. It's the "unwind edge" which carries the bit of information that you're asking about. My mental view of this is that the region number correlates to the dispatch instruction (and, therefore, the catch blocks) and the unwind edge correlates to the cleanups. (This isn't exactly correct, but it helps to understand how all of this information will be used to generate the correct tables, etc.)

Hi Bill,

That's exactly how I pictured, glad I got it right... :wink:

So the above code would look something like this:

entry:
invoke @f.run() to %bb unwind to %FooCleanup region(1)
bb:
invoke @b.run() to %bb2 unwind to %BarCleanup region(1)
bb2:
invoke @z.run() to %bb3 unwind to %BazCleanup region(1)
...
BazCleanup:
; do stuff
br %BarCleanup
BarCleanup:
; do stuff
br %FooCleanup
FooCleanup:
; do stuff
br %Dispatch
Dispatch:
dispatch region(1) resume %block
catches [ ... ]

and so on.

I see. The front-end is responsible for knowing that the baz, bar and foo
cleanup areas are chained together.

So, if I got it right, the dispatch area is common for the whole
function, all final cleanup areas point to the dispatch, and the
back-end knows which dispatch "instruction" to call based on the
region number. The dispatch instructions then would encode what type
of dispatch they were just based on the filters, catches, etc.

If we remove the invoke instructions, the "unwind to" information then goes onto the basic block itself. E.g.:

bb: unwind to %FooCleanup
call @f1.run()
call @f2.run()
call @f3.run()
call @f4.run()

In the general case, that simplifies a lot.

But in this case, if you were constructing the objects as you call
them (as in the previous example), you'd have to tell each call which
cleanup landing pad to go to.

One option is to separate in basic blocks with different cleanup landing pads:

bb: unwind to %FooCleanup
call @f1.run()
br bb1
bb1: unwind to %BarCleanup
call @f2.run()
br bb2
bb2: unwind to %BazCleanup
call @f3.run()
%0 = sdiv i32 7, 0 ; this throws in Java and goes to BazCleanup
br bb3
bb3: unwind to %Dispatch ; no cleanup
call @f4.run()
br bb4

I kept it purely from a practical standpoint: it will be easier (and thus quicker) to do it this way. :slight_smile: Also, it could be done a bit more incrementally than completely removing the invoke. But I'm not opposed to removing the invoke.

I hadn't quite got the role of the invoke in your intermediate model;
now I get it. :wink:

I think we should ultimately treat calls like instructions and not
mark them in any specific way regarding EH.

cheers,
--renato

Having a central dispatch simplifies the front-end a bit (less global
chain info), and the region number already encodes which dispatch you're
going to call when building the EH table, so I guess having multiple
dispatch blocks is redundant.

Does it make sense?

cheers,
--renato

Hmm. The difference between our models is actually in what we're calling a region. In your model, adding a cleanup doesn't require a new region; you just create a new landing pad which does the cleanup and then branches to (I guess) the landing pad of its containing cleanup. So landing pads are many-to-one with regions and dispatch instructions. In my model, every independent landing pad is necessarily a region. So in my model, there is also one dispatch per region, but there are more regions.

So, in this example:
  struct A { A(); ~A() throw(); };
  void foo() throw() {
    A x;
    b();
  }

Your model has this as follows, modulo syntax:
  entry:
    %x = alloca %A
    invoke void @A::A(%A* %x) to label %succ0 unwind label %lp0 region 0
  succ0:
    invoke void @b() to label %succ1 unwind label %lp1 region 0
  succ1:
    call void @A::~A(%A* %x) nounwind
    ret void
  lp0:
    call void @A::~A(%A* %x) nounwind
    br label %lp1
  lp1:
    dispatch region 0 [filter i8* null] # no resume edge

Whereas my model has this as follows:
  entry:
    %x = alloca %A
    invoke void @A::A(%A* %x) to label %succ0 unwind label %lp0 region 0
  succ0:
    invoke void @b() to label %succ1 unwind label %lp1 region 1
  succ1:
    call void @A::~A(%A* %x) nounwind
    ret void
  lp0:
    call void @A::~A(%A* %x) nounwind
    dispatch region 0 [], resume label %lp1
  lp1:
    dispatch region 1 [filter i8* null] # no resume edge

I think my model has some nice conceptual advantages; for one, it gives you the constraint that only EH edges and dispatch instructions can lead to landing pads, which I think will simplify what EH preparation has to do. But I could be convinced.

John.

So the above code would look something like this:

entry:
invoke @f.run() to %bb unwind to %FooCleanup region(1)
bb:
invoke @b.run() to %bb2 unwind to %BarCleanup region(1)
bb2:
invoke @z.run() to %bb3 unwind to %BazCleanup region(1)
...
BazCleanup:
; do stuff
br %BarCleanup
BarCleanup:
; do stuff
br %FooCleanup
FooCleanup:
; do stuff
br %Dispatch
Dispatch:
dispatch region(1) resume %block
   catches [ ... ]

and so on.

I see. The front-end is responsible for knowing that the baz, bar and foo
cleanup areas are chained together.

So, if I got it right, the dispatch area is common for the whole
function, all final cleanup areas point to the dispatch, and the
back-end knows which dispatch "instruction" to call based on the
region number. The dispatch instructions then would encode what type
of dispatch they were just based on the filters, catches, etc.

That's correct. With the caveat that there can be more than one dispatch area per function and some may be nested.

If we remove the invoke instructions, the "unwind to" information then goes onto the basic block itself. E.g.:

bb: unwind to %FooCleanup
call @f1.run()
call @f2.run()
call @f3.run()
call @f4.run()

In the general case, that simplifies a lot.

But in this case, if you were constructing the objects as you call
them (as in the previous example), you'd have to tell each call which
cleanup landing pad to go to.

One option is to separate in basic blocks with different cleanup landing pads:

bb: unwind to %FooCleanup
call @f1.run()
br bb1
bb1: unwind to %BarCleanup
call @f2.run()
br bb2
bb2: unwind to %BazCleanup
call @f3.run()
%0 = sdiv i32 7, 0 ; this throws in Java and goes to BazCleanup
br bb3
bb3: unwind to %Dispatch ; no cleanup
call @f4.run()
br bb4

Yeah, that's the basic idea. As John mentioned, this is an orthogonal change that we need to work out in the future. Chris had the above idea in the notes he wrote. But we will need to determine if it's sufficient or if there's a better solution.

I kept it purely from a practical standpoint: it will be easier (and thus quicker) to do it this way. :slight_smile: Also, it could be done a bit more incrementally than completely removing the invoke. But I'm not opposed to removing the invoke.

I hadn't quite got the role of the invoke in your intermediate model;
now I get it. :wink:

I think we should ultimately treat calls like instructions and not
mark them in any specific way regarding EH.

I totally agree. :slight_smile:

-bw

Notice though that we would need to keep both the EH edge from the invoke and the region numbers, which you said was redundant (in your first email). :slight_smile: Consider if there were several cleanup landing pads leading in a chain, through successive dispatches, down to the last dispatch that decides which catch to execute:

lp0:
   call void @A::~A(%A* %x)
   dispatch region 0 [], resume label %lp1
lp1:
   call void @B::~B(%B* %y)
   dispatch region 1 [], resume label %lp2
lp2:
   call void @C::~C(%C* %z)
   dispatch region 2 [], resume label %lp3
lp3:
   dispatch region 3 [filter i8* null] # no resume edge

If we have inlining of any of those d'tor calls, we may lose the fact that the dispatch in, say, lp1 is a cleanup that lands onto the region 2 dispatch in lp2. We know from experience that once this information is lost, it's *really* hard to get it back again. That's what DwarfEHPrepare.cpp is all about, and I want to get rid of that pass because it's a series of hacks to get around our poor EH support.

On a personal level, I'm not a big fan of forcing constraints on code when they aren't needed. We had problems in the past (with inlining into a cleanup part of an EH region) with enforcing the invariant that only unwind edges may jump to a landing pad. If we go back to the example above, if @C::~C() were inlined, it could come to pass that the dispatch is placed into a separate basic block and that the inlined code branches into that new basic block thus violating the constraint.

-bw

Ah ha! I think I had a different mental model than you did. Or at least I remembered things differently from the discussion. :slight_smile: For me, there is one dispatch per region, which is why I had the region number associated with the invokes as well as the "unwind to" edge coming from them. (See my response to Renato for a description.) I'll think more about your model...

Hmm. The difference between our models is actually in what we're calling a region. In your model, adding a cleanup doesn't require a new region; you just create a new landing pad which does the cleanup and then branches to (I guess) the landing pad of its containing cleanup. So landing pads are many-to-one with regions and dispatch instructions. In my model, every independent landing pad is necessarily a region. So in my model, there is also one dispatch per region, but there are more regions.

So, in this example:
struct A { A(); ~A() throw(); };
void foo() throw() {
  A x;
  b();
}

Your model has this as follows, modulo syntax:
entry:
  %x = alloca %A
  invoke void @A::A(%A* %x) to label %succ0 unwind label %lp0 region 0
succ0:
  invoke void @b() to label %succ1 unwind label %lp1 region 0
succ1:
  call void @A::~A(%A* %x) nounwind
  ret void
lp0:
  call void @A::~A(%A* %x) nounwind
  br label %lp1
lp1:
  dispatch region 0 [filter i8* null] # no resume edge

Whereas my model has this as follows:
entry:
  %x = alloca %A
  invoke void @A::A(%A* %x) to label %succ0 unwind label %lp0 region 0
succ0:
  invoke void @b() to label %succ1 unwind label %lp1 region 1
succ1:
  call void @A::~A(%A* %x) nounwind
  ret void
lp0:
  call void @A::~A(%A* %x) nounwind
  dispatch region 0 [], resume label %lp1
lp1:
  dispatch region 1 [filter i8* null] # no resume edge

I think my model has some nice conceptual advantages; for one, it gives you the constraint that only EH edges and dispatch instructions can lead to landing pads, which I think will simplify what EH preparation has to do. But I could be convinced.

Notice though that we would need to keep both the EH edge from the invoke and the region numbers, which you said was redundant (in your first email). :slight_smile: Consider if there were several cleanup landing pads leading in a chain, through successive dispatches, down to the last dispatch that decides which catch to execute:

lp0:
  call void @A::~A(%A* %x)
  dispatch region 0 [], resume label %lp1
lp1:
  call void @B::~B(%B* %y)
  dispatch region 1 [], resume label %lp2
lp2:
  call void @C::~C(%C* %z)
  dispatch region 2 [], resume label %lp3
lp3:
  dispatch region 3 [filter i8* null] # no resume edge

If we have inlining of any of those d'tor calls, we may lose the fact that the dispatch in, say, lp1 is a cleanup that lands onto the region 2 dispatch in lp2.

What you mean is that, given a resume or invoke edge, we need to be able to find the dispatch for the target region. There are ways to make that happen without tagged edges; for example, you could make the landing pad a special subclass of BasicBlock with a pointer to the dispatch, although that'd be a fairly invasive change. Tagging the edges solves the problem for clients with a handle on an edge; clients that want to go from (say) a dispatch to its landing pad(s) will still have trouble.

We know from experience that once this information is lost, it's *really* hard to get it back again. That's what DwarfEHPrepare.cpp is all about, and I want to get rid of that pass because it's a series of hacks to get around our poor EH support.

While I agree that the hacks need to go, there is always going to be some amount of custom codegen for EH just to get the special data to flow properly from landing pads to the eh.exception intrinsic and the dispatch instruction. My design would give you some very powerful assumptions to work with to implement that: both the intrinsic calls and the dispatch would always be dominated by the region's landing pad, which would in turn only be reachable via specific edges. I don't know how you're planning on implementing this without those assumptions, but if you say you don't need them, that certainly diminishes the appeal of my proposal.

On a personal level, I'm not a big fan of forcing constraints on code when they aren't needed. We had problems in the past (with inlining into a cleanup part of an EH region) with enforcing the invariant that only unwind edges may jump to a landing pad. If we go back to the example above, if @C::~C() were inlined, it could come to pass that the dispatch is placed into a separate basic block and that the inlined code branches into that new basic block thus violating the constraint.

I'm not suggesting that the landing pad has to be the same block as the block with the dispatch instruction. That's an obviously unreasonable constraint; in fact, it wouldn't hold for the vast majority of C++ cleanups, since we generally have to invoke destructors so we can terminate if they throw. Since — unlike present-day branches between cleanups — dispatch edges will be initially opaque to the optimizer, I don't see much danger of it producing invalid code from otherwise-valid transformations.

John.

I think my model has some nice conceptual advantages; for one, it gives you the constraint that only EH edges and dispatch instructions can lead to landing pads, which I think will simplify what EH preparation has to do. But I could be convinced.

Notice though that we would need to keep both the EH edge from the invoke and the region numbers, which you said was redundant (in your first email). :slight_smile: Consider if there were several cleanup landing pads leading in a chain, through successive dispatches, down to the last dispatch that decides which catch to execute:

lp0:
call void @A::~A(%A* %x)
dispatch region 0 [], resume label %lp1
lp1:
call void @B::~B(%B* %y)
dispatch region 1 [], resume label %lp2
lp2:
call void @C::~C(%C* %z)
dispatch region 2 [], resume label %lp3
lp3:
dispatch region 3 [filter i8* null] # no resume edge

If we have inlining of any of those d'tor calls, we may lose the fact that the dispatch in, say, lp1 is a cleanup that lands onto the region 2 dispatch in lp2.

What you mean is that, given a resume or invoke edge, we need to be able to find the dispatch for the target region. There are ways to make that happen without tagged edges; for example, you could make the landing pad a special subclass of BasicBlock with a pointer to the dispatch, although that'd be a fairly invasive change. Tagging the edges solves the problem for clients with a handle on an edge; clients that want to go from (say) a dispatch to its landing pad(s) will still have trouble.

It's not that troublesome. The dispatch would give you the region number. All objects in the function with that region number will point to the landing pad(s) for that region.

We know from experience that once this information is lost, it's *really* hard to get it back again. That's what DwarfEHPrepare.cpp is all about, and I want to get rid of that pass because it's a series of hacks to get around our poor EH support.

While I agree that the hacks need to go, there is always going to be some amount of custom codegen for EH just to get the special data to flow properly from landing pads to the eh.exception intrinsic and the dispatch instruction. My design would give you some very powerful assumptions to work with to implement that: both the intrinsic calls and the dispatch would always be dominated by the region's landing pad, which would in turn only be reachable via specific edges. I don't know how you're planning on implementing this without those assumptions, but if you say you don't need them, that certainly diminishes the appeal of my proposal.

We actually have the "reachable via specific edges" check in our code right now. But when we tried to allow exceptions to be marked as proper "cleanups", the assumption was violated. So I'm wary of making this same assumption twice.

But anyway, I think that I can gather the necessary information from the region numbers and the invokes' "unwind to" edges to create the EH tables. The only intrinsic call that should remain is the one that gets the EH pointer. And that's only needed by the catch blocks.

Perhaps I'm missing something? :slight_smile:

On a personal level, I'm not a big fan of forcing constraints on code when they aren't needed. We had problems in the past (with inlining into a cleanup part of an EH region) with enforcing the invariant that only unwind edges may jump to a landing pad. If we go back to the example above, if @C::~C() were inlined, it could come to pass that the dispatch is placed into a separate basic block and that the inlined code branches into that new basic block thus violating the constraint.

I'm not suggesting that the landing pad has to be the same block as the block with the dispatch instruction. That's an obviously unreasonable constraint; in fact, it wouldn't hold for the vast majority of C++ cleanups, since we generally have to invoke destructors so we can terminate if they throw. Since — unlike present-day branches between cleanups — dispatch edges will be initially opaque to the optimizer, I don't see much danger of it producing invalid code from otherwise-valid transformations.

Okay, I misread. But then we're back to a disconnect of information. If you have lp0 jumping into lp1, how does it know that that's the dispatch for region 1? We would have to implement the BasicBlock subclassing that you mentioned above, because it's yet another piece of information that needs to be tightly coupled between instructions. The subclassing of BasicBlock has some issues which need to be thought out more. In particular, it requires augmenting the IR with basic block attributes, defining semantics on those, supporting them throughout the compiler, etc. It's a large change in and of itself, and would be no longer orthogonal to the EH changes.

-bw

What you mean is that, given a resume or invoke edge, we need to be able to find the dispatch for the target region. There are ways to make that happen without tagged edges; for example, you could make the landing pad a special subclass of BasicBlock with a pointer to the dispatch, although that'd be a fairly invasive change. Tagging the edges solves the problem for clients with a handle on an edge; clients that want to go from (say) a dispatch to its landing pad(s) will still have trouble.

It's not that troublesome. The dispatch would give you the region number. All objects in the function with that region number will point to the landing pad(s) for that region.

Well, so a linear scan of the function seems like trouble to me, but I shouldn't worry too much about optimizing a theoretical client. :slight_smile:

Anyway, I'd like to table this part of the discussion while we resolve the other half. We're arguing about two things:
1. Whether edges leading to landing pads should also be tagged with the region. This is, well, more important than a bike shed, but still insignificant relative to:
2. Whether a single region can have multiple landing pads. This is actually a pretty central question in the design.

Plus the answer to #2 may obviate the need for discussion on #1 anyway.

We know from experience that once this information is lost, it's *really* hard to get it back again. That's what DwarfEHPrepare.cpp is all about, and I want to get rid of that pass because it's a series of hacks to get around our poor EH support.

While I agree that the hacks need to go, there is always going to be some amount of custom codegen for EH just to get the special data to flow properly from landing pads to the eh.exception intrinsic and the dispatch instruction. My design would give you some very powerful assumptions to work with to implement that: both the intrinsic calls and the dispatch would always be dominated by the region's landing pad, which would in turn only be reachable via specific edges. I don't know how you're planning on implementing this without those assumptions, but if you say you don't need them, that certainly diminishes the appeal of my proposal.

We actually have the "reachable via specific edges" check in our code right now. But when we tried to allow exceptions to be marked as proper "cleanups", the assumption was violated. So I'm wary of making this same assumption twice.

But anyway, I think that I can gather the necessary information from the region numbers and the invokes' "unwind to" edges to create the EH tables. The only intrinsic call that should remain is the one that gets the EH pointer. And that's only needed by the catch blocks.

Right, I agree that it's easy to assemble the EH tables for a given invoke under any of these variants. We don't need any new constraints to make this easier.

What I'm talking about is the flow of data from the start of the landing pad to two points:
1. The (optional) call to @llvm.eh.exception.
2. The dispatch instruction.
Specifically, in DWARF EH, the personality function writes these two values into the frame somewhere — maybe into registers, maybe into the EH buffer, whatever — and that information needs to flow to the appropriate intrinsic/instruction. You can't just stash it aside, because the cleanup might throw and catch an exception somewhere along the way. I'm really not sure how this is supposed to work if there's no guaranteed relationship between the landing pad and the intrinsic/dispatch (†). The most sensible constraint — dominance — is equivalent to saying that each region has at most one landing pad.

(†) Technically, there can be no true guarantee because the dispatch doesn't even need to be reachable from its landing pad. For example:
  extern void foo();
  struct A { ~A() { throw 0; } };
  void test() { A a; foo(); }
After inlining, the cleanup landing pad in test() will contain a throw to a terminating handler, and therefore the dispatch will not be reachable. So any constraint has to be something like "either the dispatch is unreachable or it's dominated by the entry to the landing pad". But it's actually quite easy to write correct code to handle this case. :slight_smile:
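
Concretely, after inlining ~A() into the cleanup you'd end up with
something like this (sketch syntax; %typeid.int and @handle.terminate are
just placeholders):

  lp.cleanup:                 ; region 0: test()'s cleanup landing pad
    ; inlined body of A::~A(): allocate an int exception and throw it
    %e = call i8* @__cxa_allocate_exception(i64 4)
    ; ... initialize the exception object (casts elided) ...
    invoke void @__cxa_throw(i8* %e, i8* %typeid.int, i8* null)
        to label %never unwind label %lp.term region 1
  never:
    unreachable               ; __cxa_throw never returns
  lp.term:                    ; region 1: the terminating handler
    call void @handle.terminate() noreturn
    unreachable
  dispatch0:
    ; region 0's dispatch: unreachable now, but vacuously dominated by %lp.cleanup
    dispatch region 0 [], resume label %resume.block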

John.

I hadn't quite got the role of the invoke in your intermediate model;
now I get it. :wink:

I think we should ultimately treat calls like instructions and not
mark them in any specific way regarding EH.

I totally agree. :slight_smile:

I think everyone wants to get rid of invoke, but that is hard. One problem
is that you want to keep the SSA property "definitions dominate uses". Now
suppose you have a basic block

   bb: [when throws, branch to XYZ]
      ...
      %x = ... (define %x)
      ...

   XYZ:
      ...use %x...

If you got to XYZ because an instruction threw an exception before %x was
defined, then in XYZ you are using %x which was never defined. In effect
the definition of %x in bb does not dominate the use in XYZ. I think the
solution is to say that in XYZ you are not allowed to use any values defined
in bb: in the dominator tree, bb is not considered to dominate XYZ.

These kind of issues touch fundamental design points of LLVM, so need to be
dealt with carefully.

Ciao,

Duncan.

Doesn't "the dispatch is unreachable" already imply that it's
dominated by the landing pad? All paths to the dispatch (i.e. none at
all) will first go through the landing pad. So that constraint is just
"the dispatch is dominated by the entry to the landing pad".

Hi Duncan,

I don't see how you can have dominance between a normal block and a
cleanup block. Clean-up landing pads should never use user code (since
they don't exist in userland).

Catch landing pads, on the other hand, have the same dominance
relationship that the rest of user code has (and the same problems).

Since you should never branch to XYZ under normal circumstances, you
should never rely on its predecessor's values anyway. That's the whole
point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
role of the personality routine to pass information between the user
code and unwinding code.

In essence, in compiler-generated landing pads, you should never
generate a use of user values. But if XYZ is user code, it's the user's
problem. :wink:

cheers,
--renato

PS: abnormal cases, like throwing in a destructor while an exception
thrown inside a constructor is still unwinding, lead to termination, so
even if you "use" the value in the catch area, you won't get there
anyway. :wink:

I did a sample test with gcc and it is stashing the information off to the side.

#include <cstdio>

void foo();
void qux();

struct A {
  ~A() { try { throw 0; } catch (double) { printf("hoo\n"); } }
};

void bar() {
  try {
    A a;
    foo();
  } catch (const char *c) {
    printf("hello world! %s\n", c);
  }
}

The cleanup part for the "foo" call has this:

L24:
        # basic block 4
L3:
        movl %edx, %ebx
        movq %rax, %r13
        movl $4, %edi
        call ___cxa_allocate_exception
        movq %rax, %rdi
        movl $0, (%rax)
        movq __ZTIi@GOTPCREL(%rip), %rsi
        xorl %edx, %edx
LEHB2:
        call ___cxa_throw
LEHE2:

edx and rax hold the EH selector value and EH object pointer respectively. They get restored before continuing on to the catch in bar (if they get there, which I don't think they will in this case).

But that's beside the point. I see what you mean. There is an impedance mismatch going on here. :slight_smile: Your use of dominance as a constraint was confusing me. I characterize it as losing information, but it results in the same problem. I.e., after a throw and catch in a cleanup, we need to be able to reconstruct the original EH value and object pointer, but cannot determine that if we are simply branching to yet another block from that catch. Using the dispatch instruction gives us this information.

The proposal can be easily modified to the "landing pad dominates the dispatch" view of the world. And if we're careful, we can keep the guarantee that only unwind and dispatch edges may jump to a landing pad throughout the optimizer. :slight_smile:

Thanks for shedding light on this!

-bw

Hi John,

I think this is inevitable. To be able to represent:

bb1: unwinds to %cleanup

one has to change the behaviour of basic blocks, and the best way to
do that is to create a sub-class for that special case.

That also must be used by every optimization that analyses the call
graph, dominance, inlining, etc.

cheers,
--renato

I agree this is a reasonable and necessary assumption to make it safe. :wink:

cheers,
--renato

Hi Renato,

If you got to XYZ because an instruction threw an exception before %x was
defined, then in XYZ you are using %x which was never defined. In effect
the definition of %x in bb does not dominate the use in XYZ. I think the
solution is to say that in XYZ you are not allowed to use any values defined
in bb: in the dominator tree, bb is not considered to dominate XYZ.

Hi Duncan,

I don't see how you can have dominance between a normal block and a
cleanup block. Clean-up landing pads should never use user code (since
they don't exist in userland).

I don't understand what you are saying. Cleanups (e.g. destructors)
can execute arbitrary user code, access arbitrary local variables etc.
For example, you can pass the address of a local variable to a class
which reads the value of that variable in a destructor, etc. Note also
that LLVM is not just used by C++, it is also used by Ada which makes huge
(and subtle) use of exception handling, and doesn't always work the same as
C++. For example, throwing an exception in a destructor does not terminate
a program in Ada.

Catch landing pads, on the other hand, have the same dominance
relationship that the rest of user code has (and the same problems).

Since you should never branch to XYZ under normal circumstances, you
should never rely on its predecessor's values anyway. That's the whole
point of having @llvm.eh.exception and @llvm.eh.selector, as it's the
role of the personality routine to pass information between the user
code and unwinding code.

I don't get what you are talking about here. You can access any variables
you like, whether local or global, in a catch handler. They are not passed
to the handler via llvm.eh.exception or llvm.eh.selector, they are simply
accessed directly (the unwinder restores registers etc making this possible).
Anyway, I'm not talking about what users should or shouldn't do, I'm talking
about fundamental rules for LLVM IR like "definitions must dominate uses".
What does this rule mean exactly and why does it exist? It is actually
fundamental to SSA form and is what makes the whole thing work. For example,
beginners to LLVM often ask how to get the RHS in "%x = icmp i32 %a, %b". Of
course there is no right-hand side because in SSA form the value of %x cannot
change and (this is the important bit for this discussion) %x *is never used
before it is defined*. Thus there is no point in distinguishing between %x
and the RHS, %x *is* the RHS. I'm pointing out that if the invoke instruction
is removed and catch information is attached to entire basic blocks, then, if no
care is taken, it is perfectly possible to use %x before it is defined, as
explained in my previous email, blowing up the entire LLVM system. Clearly the
solution is to not allow this by not allowing values defined in a basic block
to be used in a handler for that block; this in turn means that basic blocks
cannot be considered to dominate their handlers even if the only way to get
to the handler is via that basic block; this in turn means that all kinds of
transforms that muck around with basic blocks (e.g. SimplifyCFG) need to be
audited to make sure they don't break the new rule. And so on.
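
For example, with unwind information attached to whole basic blocks you
could end up with something like this (made-up notation), where the handler
uses %x even though the first call may unwind before %x is ever defined:

  bb: unwind to %handler
    call void @may.throw()      ; might unwind before %x is defined
    %x = add i32 %a, 1
    call void @may.throw()      ; might unwind after %x is defined
    br label %cont
  handler:
    %y = add i32 %x, 1          ; this use of %x is garbage if the first call threw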

In essence, in compiler-generated landing pads, you should never
generate a use of user values. But if XYZ is user code, it's the user's
problem. :wink:

The compiler crashing is a compiler problem, and that's exactly what is going
to happen if care is not taken about such details as dominance.

cheers,
--renato

PS: abnormal cases, like throwing in a destructor while an exception
thrown inside a constructor is still unwinding, lead to termination, so
even if you "use" the value in the catch area, you won't get there
anyway. :wink:

In Ada you can throw an exception inside a destructor and it does not lead
to program termination.

Ciao,

Duncan.