I hadn’t noticed the “noexcept” specifier in your example. That clears up part of my concerns, but I still have some problems.
With regard to the multiple meanings of ‘resume’ I am more concerning about developers who are reading the IR understanding it than about passes operating on it. Apart from making it harder to debug problems related to control flow at resume instructions I think this makes it more likely that code which mishandles it will be introduced down the road. If I’m reading things correctly, a resume instruction in your proposal could mean:
a) We’re done handling this exception, continue normal execution at this label.
b) We’re done handling this exception, continue execution in an enclosing catch handler at this label.
c) We’re done executing this termination handler, check the catch handler at this label to see if it can handle the current exception.
d) We’re done executing this termination handler, now execute the termination handler at this label.
e) We’re done executing this termination handler, continue handling the exception in the runtime.
I suppose (a) and (b) are more or less the same and it doesn’t entirely matter whether the destination is in normal code or exception code. In practical terms (c) and (d) may be the same also, but logically, in terms of how the runtime works, they are different. I’m pretty sure there’s a gap in my understanding of your proposal because I don’t understand how e() is represented at all.
As an exercise, I tried to work through the IR that would be produced in the non-optimized case for the following code:
void test() {
try {
Obj o1;
try {
f();
} catch (int) {}
Obj o2;
try {
g();
} catch (int) {}
h();
} catch (int) {}
}
Here’s what I came up with:
define void @foo() personality i32 (…)* @__CxxFrameHandler3 {
%e.addr = alloca i32
invoke void @f(i32 1)
to label %cont1 unwind label %cleanup.Obj
cont1:
invoke void @g(i32 2)
to label %cont2 unwind label %cleanup.Obj.1
cont2:
invoke void @h(i32 2)
to label %cont3 unwind label %cleanup.Obj.2
cont3:
call void @~Obj()
call void @~Obj()
br label %return
return:
ret void
cleanup.Obj:
cleanupblock unwind label %maycatch.int
call void @~Obj()
resume label %maycatch.int
maycatch.int:
catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
to label %catch.int unwind label %catchend
catch.int:
resume label %cont1
catchend:
resume
cleanup.Obj.1:
cleanupblock unwind label %maycatch.int.1
call void @~Obj()
call void @~Obj()
resume label %maycatch.int.1
maycatch.int.1:
catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
to label %catch.int.1 unwind label %catchend.1
catch.int.1:
resume label %cont2
catchend.1:
resume
cleanup.Obj.2:
cleanupblock unwind label %maycatch.int.2
call void @~Obj()
call void @~Obj()
resume label %maycatch.int.2
maycatch.int.2:
catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
to label %catch.int.2 unwind label %catchend.2
catch.int.2:
resume label %return
catchend.2:
resume
}
I don’t know if I got that right, but it seems to me that there are a couple of problems with this. Most obviously, there is a good bit of duplicated code here (which the optimization passes will probably want to combine).
More significantly though is that it doesn’t correctly describe what happens if a non-int exception is thrown in any of the called functions. For instance, if a non-int exception is thrown from g() that is caught somewhere further down the stack, the runtime should call a terminate handler that destructs o1 and then call a terminate handler that destructs o2. However, my IR doesn’t describe a terminate handler that destructs just o2 and I don’t know how I could get it to do so within the scheme that you have proposed.
Do you have a way to handle this case that I haven’t perceived?
In a mostly unrelated matter, have you thought about what needs to be done to prevent catchblock blocks from being combined? For example, suppose you have code that looks like this:
void test() {
try {
f();
} catch (int) {
x();
y();
z();
}
try {
g();
} catch (…) {
}
try {
h();
} catch (int) {
x();
y();
z();
}
}
I think it’s very likely that if we don’t do anything to prevent it the IR generated for this will be indistinguishable from the IR generated for this:
void test() {
try {
f();
try {
g();
} catch (…) {
}
h();
} catch (int) {
x();
y();
z();
}
}
In this case that might be OK, but theoretically the calls to f() and h() should get different states and there are almost certainly cases where failing to recognize that will cause problems. What’s more, the same basic pattern arises for this case:
void test() {
try {
f();
} catch (int) {
x();
y();
z();
}
try {
g();
} catch (float) {
}
try {
h();
} catch (int) {
x();
y();
z();
}
}
But in this case, if we get the state numbering wrong an int-exception from g() could end up being incorrectly caught by the xyz handler.
BTW, finding cases like this is the primary reason that I’ve been trying to push my current in-flight patch onto the sinking ship that is our current implementation. I mentioned to you before that the test suite I’m using passes with my proposed patch, but that’s only true with optimizations disabled. With optimizations turned on I’m seeing all kinds of fun things like similar handlers being combined and common instructions being hoisted above a shared(!) eh_begincatch call in if-else paired handlers. I don’t know if it will be worth trying to fix these problems, but seeing them in action has been very instructive.
-Andy