Darwin vs exceptions

(Mail system seems to have eaten this, sorry if it's a repeat)

Hi Dale,

- Why was C++ claiming that every selector has a catch-all handler?

this is easy: because the semantics of invoke require it. Yes, really.
If unwinding reaches an invoke then control is required to jump to the
unwind basic block. At first I thought this probably wouldn't matter -
that it would be OK to not jump to the landing pad if the exception was
not being caught by it - and didn't implement it, but several eh failures
in the LLVM testsuite could be tracked down to it: the optimizers really
do exploit this property of invoke - it is quite subtle. You typically see
it when code is massively inlined into one big flat main function. Then I
tried to implement it by pushing a "cleanup" at the end of the exception
list. But the unwinder treats cleanups specially, and ignores them if during
unwinding it only sees cleanups all the way up to the top - in short there
were testsuite failures with this approach. So the only thing to do was
to push a catch-all on to the end of the list.

OK, playing around with the testsuite it appears there's a bug in llvm's inliner with EH, which is probably what's causing the effect you're talking about. Suppose we have

#include <cstdio>
class A {
public:
   A() {}
   ~A() {}
};
void f() {
   A a;
   throw 5.0;
}
int main() {
   try {
     f();
   } catch(...) { printf("caught\n"); }
}

The IR for f correctly has the throw call reaching the landing pad, which cleans up 'a' and then calls _Unwind_Resume. Inlining f into main naively copies this structure, which is wrong: we do not want to call _Unwind_Resume in this case because there is a real handler in the same function. See the code produced by gcc -O3. I think what you did is make this incorrect _Unwind_Resume work (on your OS), but that's not the right way to fix this.
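
Whatever the inliner does, the observable behaviour of this example has to stay the same: the destructor of 'a' runs first, then the handler in the caller. A minimal instrumented sketch of that expectation follows. It is not taken from the testsuite; the `events` log and the `run` driver are names invented here for illustration, with `run` playing the role of main.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log, added only to make the order of events observable.
static std::vector<std::string> events;

class A {
public:
    A() {}
    ~A() { events.push_back("cleanup"); }  // stands in for the landing pad's cleanup
};

void f() {
    A a;
    throw 5.0;
}

// Hypothetical driver that plays the role of main() in the example.
std::vector<std::string> run() {
    events.clear();
    try {
        f();
    } catch (...) {
        events.push_back("caught");  // the real handler in the caller
    }
    return events;
}
```

Calling `run()` should give the same two-entry log whether or not the compiler inlines `f` into `run`, because stack unwinding destroys 'a' before the handler body starts.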

Hi Dale,

I don't see why you consider it a bug. It seems perfectly correct
to me, only suboptimal. For those watching, the issue is that after
inlining by LLVM you get this

        invoke i32 (...)* @_Unwind_Resume( i8* %eh_ptr.i )
                        to label %UnifiedUnreachableBlock unwind label %lpad ; <i32>:0 [#uses=0]

where this invoke is rather silly. But I don't see that it is wrong.
By the way, the gcc inliner knows about _Unwind_Resume so you don't see
this if gcc did the inlining.

Ciao,

Duncan.

Hi Dale,

#include <cstdio>
class A {
public:
   A() {}
   ~A() {}
};
void f() {
   A a;
   throw 5.0;
}
int main() {
   try {
     f();
   } catch(...) { printf("caught\n"); }
}

this example indeed shows the problem. Let me explain to see if we agree on what
the problem is. Suppose we don't artificially add catch-alls to selectors. Then
the above example compiles to:

define void @_Z1fv() {
...
        invoke void @__cxa_throw( something ) noreturn
                        to label %somewhere unwind label %lpad
...
lpad:
        %eh_ptr = tail call i8* @llvm.eh.exception( )
        %eh_select8 = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*))
        tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr )
        unreachable
...
}

define i32 @main() {
entry:
        invoke void @_Z1fv( )
                        to label %somewhere2 unwind label %lpad2
...
lpad2: ; preds = %entry
        %eh_ptr = tail call i8* @llvm.eh.exception( )
        %eh_select14 = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null )
  print_a_message_and_exit
...
}

And this works fine: main calls _Z1fv which throws an exception. Execution branches to lpad where
(empty) cleanup code is run, then unwinding is resumed. The unwinder unwinds into main, and branches
to lpad2 (because the selector has a catch-all, the null) which prints a message and exits.

If the inliner is run, then we get:

define i32 @main() {
...
        invoke void @__cxa_throw( something ) noreturn
                        to label %somewhere unwind label %lpad.i
...
lpad.i:
        %eh_ptr.i = tail call i8* @llvm.eh.exception( ) ; <i8*> [#uses=2]
        %eh_select8.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*))
        invoke i32 (...)* @_Unwind_Resume( i8* %eh_ptr.i )
                        to label %somewhere2 unwind label %lpad2
...
lpad2: ; preds = %lpad.i
        %eh_ptr = tail call i8* @llvm.eh.exception( ) ; <i8*> [#uses=2]
        %eh_select14 = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null )
  print_a_message_and_exit
...
}

This is perfectly correct given LLVM invoke semantics. Unfortunately the unwinder doesn't
know about those :) When run, the exception is thrown but the unwinder doesn't branch
to lpad.i because the selector doesn't state that that (or any) exception should be caught. Thus
the program is terminated. If you force a "cleanup" by changing the selector call to:
        %eh_select8.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i32 0)
then it doesn't work either: the unwinder observes that there is only a cleanup, and
using some special logic (bogus in this case) deduces that the exception will be rethrown
after running the cleanup code (and thus the program terminated), so doesn't bother running
the cleanup code and directly terminates the program (apparently terminating programs quickly
was important to whoever wrote the unwinder, I don't know why; the Ada unwinder doesn't do
this for example :) ). However if you add a catch-all to the selector instead:
        %eh_select8.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null)
then the unwinder does branch to lpad.i. Then the _Unwind_Resume call causes a branch to
lpad2 and all works perfectly.
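
The three outcomes described above can be summarised in a toy model. This is only a sketch of the behaviour reported here for the Linux unwinder, not the libgcc algorithm: `Action`, `CallSite`, `Outcome`, `unwindChain` and `unwindInlinedMain` are names invented for illustration, and the model assumes the inlined main is the outermost frame, so a landing pad is entered only if its own selector declares a catch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// What the selector attached to a landing pad declares (toy model).
enum class Action { None, Cleanup, CatchAll };

struct CallSite {
    std::string landingPad;  // label the invoke unwinds to
    Action action;           // what the pad's selector call declares
};

struct Outcome {
    bool terminated;                       // true if the unwinder gave up
    std::vector<std::string> padsEntered;  // landing pads actually run, in order
};

// `chain` lists the invokes the exception meets inside one outermost frame:
// the throw first, then the _Unwind_Resume invoke that ends each landing pad.
Outcome unwindChain(const std::vector<CallSite>& chain) {
    Outcome out{false, {}};
    for (const CallSite& site : chain) {
        // Search phase: with no outer frames, only a catch counts as a handler.
        // An empty selector or a lone cleanup means "nobody will catch this".
        if (site.action != Action::CatchAll) {
            out.terminated = true;  // terminate without running the pad
            return out;
        }
        out.padsEntered.push_back(site.landingPad);  // cleanup phase enters the pad
    }
    return out;
}

// The inlined main from this message, parameterised by the selector of lpad.i.
Outcome unwindInlinedMain(Action lpadISelector) {
    return unwindChain({{"lpad.i", lpadISelector}, {"lpad2", Action::CatchAll}});
}
```

In this model `unwindInlinedMain(Action::None)` and `unwindInlinedMain(Action::Cleanup)` both terminate without entering any landing pad, while `unwindInlinedMain(Action::CatchAll)` enters lpad.i and then lpad2.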

This is why I forcibly push a catch-all at the end of each selector call: because then if an
exception unwinds through an invoke, control always branches to the landing pad, which is what
LLVM invoke semantics require and the inliner has exploited.

Unfortunately it seems this breaks the Darwin unwinder.

It is true that you can imagine a solution in which the inliner knows about selector calls
and shuffles them around. I will think about this. That said, invoke is defined to have
certain semantics, and I don't much like the idea of trying to do an end-run around them
and around optimizers that exploit them...

Ciao,

Duncan.

Hi Dale,

#include <cstdio>
class A {
public:
   A() {}
   ~A() {}
};
void f() {
   A a;
   throw 5.0;
}
int main() {
   try {
     f();
   } catch(...) { printf("caught\n"); }
}

this example indeed shows the problem. Let me explain to see if we agree on what
the problem is. Suppose we don't artificially add catch-alls to selectors. Then
the above example compiles to:

define void @_Z1fv() {
...
        invoke void @__cxa_throw( something ) noreturn
                        to label %somewhere unwind label %lpad
...
lpad:
        %eh_ptr = tail call i8* @llvm.eh.exception( )
        %eh_select8 = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*))

I wasn't advocating this; agree it is wrong.

        tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr )
        unreachable
...
}

define i32 @main() {
entry:
        invoke void @_Z1fv( )
                        to label %somewhere2 unwind label %lpad2
...
lpad2: ; preds = %entry
        %eh_ptr = tail call i8* @llvm.eh.exception( )
        %eh_select14 = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null )
  print_a_message_and_exit
...
}

And this works fine: main calls _Z1fv which throws an exception. Execution branches to lpad where
(empty) cleanup code is run, then unwinding is resumed. The unwinder unwinds into main, and branches
to lpad2 (because the selector has a catch-all, the null) which prints a message and exits.

If the inliner is run, then we get:

define i32 @main() {
...
        invoke void @__cxa_throw( something ) noreturn
                        to label %somewhere unwind label %lpad.i
...
lpad.i:
        %eh_ptr.i = tail call i8* @llvm.eh.exception( ) ; <i8*> [#uses=2]
        %eh_select8.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*))
        invoke i32 (...)* @_Unwind_Resume( i8* %eh_ptr.i )
                        to label %somewhere2 unwind label %lpad2
...
lpad2: ; preds = %lpad.i
        %eh_ptr = tail call i8* @llvm.eh.exception( ) ; <i8*> [#uses=2]
        %eh_select14 = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null )
  print_a_message_and_exit
...
}

This is perfectly correct given LLVM invoke semantics. Unfortunately the unwinder doesn't
know about those :) When run, the exception is thrown but the unwinder doesn't branch
to lpad.i because the selector doesn't state that that (or any) exception should be caught. Thus
the program is terminated.

OK.

If you force a "cleanup" by changing the selector call to:
        %eh_select8.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i32 0)
then it doesn't work either: the unwinder observes that there is only a cleanup, and
using some special logic (bogus in this case) deduces that the exception will be rethrown
after running the cleanup code (and thus the program terminated), so doesn't bother running
the cleanup code and directly terminates the program (apparently terminating programs quickly
was important to whoever wrote the unwinder, I don't know why; the Ada unwinder doesn't do
this for example :) ).

OTOH, claiming that everything has a cleanup seems to me a correct description of what the IR code does: control reenters the throwing function to execute a possibly null cleanup, then resumes. The trouble is you can't simply copy that IR while inlining and expect things to still work, because the operation of Unwind_Resume depends on what stack frame it's in. I don't agree that the inlined version is correct IR. The 'invoke semantics' you're talking about are inextricably intertwined with _Unwind_Resume's semantics.

However if you add a catch-all to the selector instead:
        %eh_select8.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null)
then the unwinder does branch to lpad.i. Then the _Unwind_Resume call causes a branch to
lpad2 and all works perfectly.

This is why I forcibly push a catch-all at the end of each selector call: because then if an
exception unwinds through an invoke, control always branches to the landing pad, which is what
LLVM invoke semantics require and the inliner has exploited.

Unfortunately it seems this breaks the Darwin unwinder.

Yes.

It is true that you can imagine a solution in which the inliner knows about selector calls
and shuffles them around. I will think about this. That said, invoke is defined to have
certain semantics, and I don't much like the idea of trying to do an end-run around them
and around optimizers that exploit them...

I don't see much choice. I guess I'll look at getting the inliner to do what I think it should do. (I'll make it Darwin-specific at first, but it should work on Linux, and let me point out that you'll get more efficient code this way.)
I guess an easy but undesirable fallback position is to tell the inliner not to inline anything that invokes Unwind_Resume.

Hi Dale,

> ... Suppose we don't artificially add catch-alls to
> selectors. Then
> the above example compiles to:
> ...
I wasn't advocating this; agree it is wrong.

it's maybe not as wrong as it seems since having an empty
selector is equivalent to having a cleanup (IIRC) :)

> ... If you force a "cleanup" by changing the selector call to:
> %eh_select8.i = tail call i32 (i8*, i8*, ...)*
> @llvm.eh.selector.i32( i8* %eh_ptr.i, i8* bitcast (i32 (...)*
> @__gxx_personality_v0 to i8*), i32 0)
> then it doesn't work either: the unwinder observes that there is
> only a cleanup, and
> using some special logic (bogus in this case) deduces that the
> exception will be rethrown
> after running the cleanup code (and thus the program terminated), so
> doesn't bother running
> the cleanup code and directly terminates the program (apparently
> terminating programs quickly
> was important to whoever wrote the unwinder, I don't know why; the
> Ada unwinder doesn't do
> this for example :) ).

OTOH, claiming that everything has a cleanup seems to me a correct
description of what the IR code does: control reenters the throwing
function to execute a possibly null cleanup, then resumes.

I agree, and this is why I originally tried to get the desired effect
using cleanups.

The trouble is you can't simply copy that IR while inlining and expect
things to still work, because the operation of Unwind_Resume depends
on what stack frame it's in. I don't agree that the inlined version
is correct IR. The 'invoke semantics' you're talking about are
inextricably intertwined with _Unwind_Resume's semantics.

The semantics of invoke are described in the LangRef. I took the
approach of artificially obtaining these semantics, which have an
impedance mismatch with the gcc unwinder, by playing with the way we
set up the exception table. If you manage to obtain invoke semantics
then you can expect inlining to work. And I did manage to obtain them
on linux - so this was a simple solution that also meant I didn't have
to audit all the optimizers to check how they used invoke (though most
likely the inliner is the only one that matters).

You are suggesting changing the meaning of invoke, a bolder approach
which certainly has some advantages. If I understand right you want
to allow some exceptions to unwind through an invoke without branching
to the unwind label (instead they just keep on unwinding); which
exceptions are caught is to be determined by the eh.selector call.
In other words, you want to model LLVM's EH on the way the gcc unwinder
works. With this model indeed the inliner needs to copy the selector
to the appropriate places when inlining. Perhaps some other optimizers
need to be tweaked too. Unfortunately this approach immediately hits
PR1508 (1508 – typeinfos and personality functions should be attached to the invoke), i.e. the problem
of actually finding the selector for a landing pad. Personally I would
like to see the selector somehow be attached to the unwind edge - perhaps
it could be done as part of PR1269.

> ... It is true that you can imagine a solution in which the inliner
> knows about selector calls
> and shuffles them around. I will think about this. That said,
> invoke is defined to have
> certain semantics, and I don't much like the idea of trying to do an
> end-run around them
> and around optimizers that exploit them...

I don't see much choice. I guess I'll look at getting the inliner to
do what I think it should do. (I'll make it Darwin-specific at
first, but it should work on Linux, and let me point out that you'll
get more efficient code this way.)

Yes, you certainly will get more efficient unwinding. This probably
doesn't matter much for Ada or C++, but I guess some languages out
there like to throw exceptions a lot.

I guess an easy but undesirable fallback position is to tell the
inliner not to inline anything that invokes Unwind_Resume.

If you push a "cleanup" instead of a catch-all on Darwin, does
everything work? This is exactly what you get if you apply your
patch

- lang_eh_catch_all = return_null_tree;
+/* lang_eh_catch_all = return_null_tree;*/

Ciao,

Duncan.

Indeed this makes all the EH tests work on Darwin (llc and llc-beta; cbe and jit are still broken).

You are suggesting changing the meaning of invoke, a bolder approach
which certainly has some advantages. If I understand right you want
to allow some exceptions to unwind through an invoke without branching
to the unwind label (instead they just keep on unwinding); which
exceptions are caught is to be determined by the eh.selector call.

No, I don't want to change the semantics of invoke, at least I don't think so.
When inlining, I want the inlined throw to reach cleanup code as it does.
But I want the Unwind_Resume call that ends the cleanup code to be
replaced with a control transfer to the handler (or cleanup) in the calling
function, i.e. the inliner needs to know the semantics of Unwind_Resume.

In other words, you want to model LLVM's EH on the way the gcc unwinder
works.

Invoke's reason for existence is supporting EH, and we can't change the unwinder, so yes, we have to do that. So do you, the difference is our unwinders don't work quite the same.

If you push a "cleanup" instead of a catch-all on Darwin, does
everything work? This is exactly what you get if you apply your
patch

- lang_eh_catch_all = return_null_tree;
+/* lang_eh_catch_all = return_null_tree;*/

No, things are much better than they were, but inlining functions that call Unwind_Resume still causes the problem we've been talking about.
Disabling that as well makes everything work.

Hi Dale,

No, I don't want to change the semantics of invoke, at least I don't
think so.
When inlining, I want the inlined throw to reach cleanup code as it
does.
But I want the Unwind_Resume call that ends the cleanup code to be
replaced with a control transfer to the handler (or cleanup) in the
calling
function, i.e. the inliner needs to know the semantics of Unwind_Resume.

it seems to me that this is extremely tricky to do in general, though it
is simpler if you suppose the IR was produced by gcc. Consider this example:

class A {}; class B {};
int i;
extern void f();
void g() { try { f(); } catch(A) { i = 1; } }
void h() { try { g(); } catch(B) { i = 2; } }

Without catch-alls this compiles to something like:

define void @_Z1gv() {
entry:
  invoke void @_Z1fv( )
      to label %UnifiedReturnBlock unwind label %lpad
...
lpad: ; preds = %entry
  %eh_ptr = tail call i8* @llvm.eh.exception( )
  %eh_select = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i32 (...)* @__gxx_personality_v0, i8* A)
  %eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* A )
  %tmp15 = icmp eq i32 %eh_select, %eh_typeid
  br i1 %tmp15, label %bb, label %Unwind
...
Unwind: ; preds = %lpad
  tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr )
  unreachable
...
}

define void @_Z1hv() {
entry:
  invoke void @_Z1gv( )
      to label %UnifiedReturnBlock unwind label %lpad
...
lpad: ; preds = %entry
  %eh_ptr = tail call i8* @llvm.eh.exception( )
  %eh_select = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i32 (...)* @__gxx_personality_v0, i8* B )
  %eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* B )
  %tmp15 = icmp eq i32 %eh_select, %eh_typeid
  br i1 %tmp15, label %bb, label %Unwind
...
Unwind: ; preds = %lpad
  tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr )
  unreachable
...
}

Currently when you inline you get something like:

define void @_Z1hv() {
entry:
  invoke void @_Z1fv( )
      to label %UnifiedReturnBlock2 unwind label %lpad.i
...
lpad.i: ; preds = %entry
  %eh_ptr.i = tail call i8* @llvm.eh.exception( )
  %eh_select.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i32 (...)* @__gxx_personality_v0 , i8* A )
  %eh_typeid.i = tail call i32 @llvm.eh.typeid.for.i32( i8* A )
  %tmp15.i = icmp eq i32 %eh_select.i, %eh_typeid.i
  br i1 %tmp15.i, label %bb.i, label %Unwind.i
...
Unwind.i: ; preds = %lpad.i
  invoke i32 (...)* @_Unwind_Resume( i8* %eh_ptr.i )
      to label %UnifiedUnreachableBlock unwind label %lpad ; <i32>:0 [#uses=0]
...
lpad: ; preds = %Unwind.i
  %eh_ptr = tail call i8* @llvm.eh.exception( )
  %eh_select = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i32 (...)* @__gxx_personality_v0 , i8* B )
  %eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* B )
  %tmp15 = icmp eq i32 %eh_select, %eh_typeid
  br i1 %tmp15, label %bb, label %Unwind
...
Unwind: ; preds = %lpad
  tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr ) ; <i32>:1 [#uses=0]
  unreachable
...
}

However to get correct functioning the following adjustments have to be made:
(1) B has to be appended to the selector call for A:
  %eh_select.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i32 (...)* @__gxx_personality_v0 , i8* A )
->
  %eh_select.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i32 (...)* @__gxx_personality_v0 , i8* A , i8* B )
Otherwise if a B is thrown in f then the program will terminate. Here the main difficulty is finding the selector for the A landing pad.

(2) The Unwind_Resume call needs to be turned into a jump to the handler code for the B case (this is half-way through lpad), something like:

Unwind.i: ; preds = %lpad.i
  ; was an invoke of @_Unwind_Resume here
  ; was a call to @llvm.eh.exception here
  ; was a call to @llvm.eh.selector.i32 here
  %eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* B )
  %tmp15 = icmp eq i32 %eh_select.i, %eh_typeid ; result of the (modified) A selector call, see (c)
  br i1 %tmp15, label %bb, label %Unwind
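
At the source level, the effect these two adjustments aim for is that h with g inlined behaves like a pair of nested try blocks. A minimal sketch for checking that equivalence follows. `h_inlined`, `what` and `result` are hypothetical names added for illustration, and the body given to `f` only stands in for the extern function of the example.

```cpp
#include <cassert>

class A {};
class B {};
int i = 0;

// Hypothetical stand-in for the extern f(): throws what `what` selects.
static char what = 'B';
void f() {
    if (what == 'A') throw A();
    if (what == 'B') throw B();
}

void g() { try { f(); } catch (A) { i = 1; } }
void h() { try { g(); } catch (B) { i = 2; } }

// What h has to behave like once g has been inlined into it:
// the inner handler for A sits inside the outer handler for B.
void h_inlined() {
    try {
        try { f(); } catch (A) { i = 1; }
    } catch (B) { i = 2; }
}

// Runs `fn` with f() throwing `kind` and reports the resulting value of i.
int result(void (*fn)(), char kind) {
    what = kind;
    i = 0;
    fn();
    return i;
}
```

For both `h` and `h_inlined`, `result(fn, 'A')` gives 1 and `result(fn, 'B')` gives 2; the B case is the one that terminates the program if B is missing from the inner selector.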

This can all go wrong in several ways:
(a) if the A landing pad already (for some reason) tested for B then step (1) will
cause strangeness. However I think we can say that the code was relying on undefined
behaviour and not worry about this.
(b) there needs to be some analysis to find @_Unwind_Resume calls reachable from the
A landing pad. They may also be reachable from other landing pads, so code duplication
may be required. This could get complicated.
(c) the selector call for the Unwind_Resume invoke needs to be determined and the result
of the (modified) A selector call needs to be used instead. Since the selector may be
shared by several landing pads this could get tricky too.

To my mind a perfect solution would be:
(i) Find a trick (like my catch-all trick) so that invokes always branch to the unwind
label when an exception unwinds through them. In other words, preserve the traditional
semantics of invoke. This makes life much simpler at the level of the IR optimizers.
(ii) Add a pass that knows about Unwind_Resume and does the kind of transform described
above when it isn't too hard. If it is too hard then it can give up because of (i).
(iii) At codegen time, completely abandon invoke semantics (invoke doesn't exist there
anyway) and exploit the way the unwinder works as much as possible.

I've applied the darwin CFA unwinder change and will see if I can find a way of getting (i).

Ciao,

Duncan.

How about this?

Index: gcc-4.2.llvm/gcc/except.c

Hi Dale,

No, I don't want to change the semantics of invoke, at least I don't
think so.
When inlining, I want the inlined throw to reach cleanup code as it
does.
But I want the Unwind_Resume call that ends the cleanup code to be
replaced with a control transfer to the handler (or cleanup) in the
calling
function, i.e. the inliner needs to know the semantics of Unwind_Resume.

it seems to me that this is extremely tricky to do in general, though it
is simpler if you suppose the IR was produced by gcc. Consider this example:

Yes, working my way through the tests I've hit several optimizer issues. Your example is
even worse though:) In general the eh.exception stuff needs special handling beyond
what I said.

To my mind a perfect solution would be:
(i) Find a trick (like my catch-all trick) so that invokes always branch to the unwind
label when an exception unwinds through it. In other words, preserve the traditional
semantics of invoke. This makes life much simpler at the level of the IR optimizers.
(ii) Add a pass that knows about Unwind_Resume and does the kind of transform described
above when it isn't too hard. If it is too hard then it can give up because of (i).
(iii) At codegen time, completely abandon invoke semantics (invoke doesn't exist there
anyway) and exploit the way the unwinder works as much as possible.

I've applied the darwin CFA unwinder change and will see if I can find a way of getting (i).

Thanks. I did find a sledgehammer that works: replace the catch-all with a cleanup,
and don't inline anything that calls Unwind_Resume. I'm trying to come up with something
less stupid.

Index: gcc-4.2.llvm/gcc/except.c

--- gcc-4.2.llvm.orig/gcc/except.c 2007-12-12 20:55:30.000000000 +0100
+++ gcc-4.2.llvm/gcc/except.c 2007-12-12 20:56:19.000000000 +0100
@@ -4053,9 +4053,9 @@
{
  /* The default c++ routines aren't actually c++ specific, so use those. */
  /* LLVM local begin */
- llvm_unwind_resume_libfunc = llvm_init_one_libfunc ( USING_SJLJ_EXCEPTIONS ?
- "_Unwind_SjLj_Resume"
- : "_Unwind_Resume");
+ llvm_unwind_resume_libfunc =
+ llvm_init_one_libfunc(USING_SJLJ_EXCEPTIONS ?
+ "_Unwind_SjLj_Resume" : "_Unwind_Resume_or_Rethrow");
  /* LLVM local end */
}

Seems unlikely, but I'll try it.

This works! And from the source it is reasonable that it should work.
Well done. I never thought of finding another hook into the unwinder; I was
trying to duplicate gcc's behavior, which is hard. Thanks very much.