LLVMdev Digest, Vol 85, Issue 50

What? Yet another EH proposal?! This one is different from the others in that

I’m planning to start implementing this shortly. But I want your feedback! I’ve

already gotten a lot of feedback from Chris, John, Jim, Eric, and many others.

Now it’s your turn!

Please read this proposal and send me your comments, suggestions, and concerns.

-bw

Bill,

  1. it is good to see that the “exception regions” idea has been abandoned; it is mathematically
    inconsistent with modern optimization theory, and at best would require extra passes to translate
    into/out of that representation form.

  2. it is good to see the prohibition of mixing by inlining of exception handling from different languages,
    it has always been my thought that this cannot be well defined in general, different languages not
    only have different control-flow semantics for exception handling, but the different type systems
    (what is a derived type and what constitutes “type match” for exceptions) are not always going to be
    compatible.

  3. the other non-cosmetic portion of this proposal boils down to:
    a) every invoke must point to a landingpad-block
    b) every landingpad-block must start with a catch specification (LandingpadInst)
    c) catch specifications (LandingpadInst) must not occur anywhere else

don’t take “cosmetic” as a criticism, I think gcc suffers from bad cosmetics, and llvm benefits
from good cosmetics. The logic I see is that we already have PHINode and TerminatorInst
that have explicit restrictions, so it makes sense that if catch-specifications have restrictions
they too should be Instructions rather than Intrinsics.

3.b) I have been thinking about other possible control-flow-graph invariants of the
landingpad blocks and the catch blocks that they lead to, but so far have not come up
with very much. I wonder if anyone else is thinking about this…?

for example cleanups come before __cxa_begin_catch, but it isn’t clear what is a cleanup
and what isn’t other than what comes before a __cxa_begin_catch and what comes after ?

however, using that as the definition of cleanup, for C++ any InvokeInst that is so
identified as cleanup then its only operand has to be terminate (I think, someone
please correct me if I’ve made an incorrect conclusion here).

3.c) I have been thinking about whether the original source code structure of try-catch
statements can be reconstructed from the IR, are two try-catches nested, either in the
try or the catch part, or are they disjoint, and can the cleanups be identified as such at
the IR level or have things potentially been mixmastered up too much after optimization.
I wonder if anyone else is thinking about this also…?..

  4. IIUC, llvm has inherited a bug from gcc where the debugger cannot let the user know an exception is
    going to be uncaught until after the stack has been unwound – contrary to the design intentions of the
    unwind library that most exception implementations are based on (with a two phase unwind algorithm) –
    which creates a problem for the debugger user.

so, the question is will there be a specific recognizable “catch all types” type that can occur in the
landingpad’s catch list ?

and will there be a __llvm_personality_v0 that is designed to do the right thing for this case.

yes, I know this is a can-of-worms, it will break gcc compatibility, but then perhaps we can be the
motivation for gnu folks to fix their implementation, be the leader rather than the follower!

4.b) it is not at all clear from your write up what the “cleanup” option for a landingpad is, and
how this is used when both cleanups AND catches are necessary in a given try-catch source
code statement, including if one of the user specified catches is a catch-all.

  5. it’s not clear from your email what is done with the result value of the landingpad instruction,
    but I presume that your intent is that this does not change from the current scheme where
    the “llvm.eh.typeid.for()” is called and its result is compared with the landingpad instruction’s
    result…

…and then a miracle happens in CodeGen, and most of the intrinsics are thrown away and the
hard register contents at the resumption at a landingpad from an Unwind include the value that
llvm.eh.typeid.for() would have returned…

this is the sort of thing I’m talking about when I imply that the current scheme is poorly documented!

Also, what is going to happen for the case of cleanup AND catches, currently the result of not
only the llvm.eh.select() result is cached, but in fact the complete decoding of it relative to
all the llvm.eh.typeid.for() calls is cached, then the cleanup code executed, THEN finally the
already decoded value is used to “switch” from the landing pad to the correct catch-block.

who is going to generate all that code, is it still going to be explicit in the IR, or is CodeGen going
to now be responsible creating it.

  6. it would be nice if the existing UnwindInst could be retained. I wince at naming an instruction
    “Resume” since in the English language it is so ambiguous (resume normal execution following
    the conclusion of handling an exception, versus resume throwing an exception). I.e., cosmetics
    do matter.

  7. there are still lots of other intrinsics/routines involved:
    __cxa_allocate_exception
    __cxa_throw, __cxa_rethrow
    __cxa_begin_catch(), __cxa_end_catch
    although these particular ones seem to be the easiest to document as they do seem to be
    translated verbatim (no CodeGen miracles).

  8. I really like the idea of “terminate” being one of the options to the landingpad
    instruction, it makes identification of abnormal code more direct (otherwise control-
    flow analysis has to be done to see if __terminate() is reachable to conclude that
    something is abnormal code, and I really don’t like that analysis, it seems too error-
    prone as __terminate() might be reachable for other reasons (not that I have come
    up with such a scenario yet, but I think I might be able to), and this conclusion would
    then be ambiguous).

Even if support for the terminate option required a new __llvm_personality_v0 and
a new Unwind library function, I am still in favor of having and using it. But I suspect
that CodeGen can lower this into the same old MC branch to a block that only contains
__terminate() that we currently see in IR, and a new personality and Unwind
aren’t necessary, but would still be a nice optimization.

sincerely,
Peter Lawrence.

3.b) I have been thinking about other possible control-flow-graph invariants of the
landingpad blocks and the catch blocks that they lead to, but so far have not come up
with very much. I wonder if anyone else is thinking about this...?

for example cleanups come before __cxa_begin_catch, but it isn't clear what is a cleanup
and what isn't other than what comes before a __cxa_begin_catch and what comes after ?

The EH representation is independent of things like this.

however, using that as the definition of cleanup, for C++ any InvokeInst that is so
identified as cleanup then its only operand has to be terminate (I think, someone
please correct me if I've made an incorrect conclusion here).

In C++, any destructor call executed as an EH cleanup would need to be
an invoke whose unwind edge leads to a landing pad with a catch-all and
a call to std::terminate(). However, after inlining etc., I don't know that this
gives us any interesting invariants in the IR.

3.c) I have been thinking about whether the original source code structure of try-catch
statements can be reconstructed from the IR, are two try-catches nested, either in the
try or the catch part, or are they disjoint, and can the cleanups be identified as such at
the IR level or have things potentially been mixmastered up too much after optimization.
I wonder if anyone else is thinking about this also...?...

It would be difficult to reliably reconstruct try/catch statements from the IR
even before optimization.

4) IIUC, llvm has inherited a bug from gcc where the debugger cannot let the user know an exception is
going to be uncaught until after the stack has been unwound -- contrary to the design intentions of the
unwind library that most exception implementations are based on (with a two phase unwind algorithm) --
which creates a problem for the debugger user.

I don't see this as a compiler bug. I can't imagine any personality function
design which would let debuggers interrupt or control unwinding without
hooking libUnwind, short of requiring every single call to have an
associated landing pad which the personality always lands at, even if
there's nothing to do there. That will never, ever be acceptable.

and will there be a __llvm_personality_v0 that is designed to do the right thing for this case.

yes, I know this is a can-of-worms, it will break gcc compatibility, but then perhaps we can be the
motivation for gnu folks to fix their implementation, be the leader rather than the follower!

Using our own personality function would not necessarily break GCC
compatibility; we'd just need to provide it in compiler-rt or something.

4.b) it is not at all clear from your write up what the "cleanup" option for a landingpad is, and
how this is used when both cleanups AND catches are necessary in a given try-catch source
code statement, including if one of the user specified catches is a catch-all.

The 'cleanup' bit says that the personality function needs to land
there even if there's no handler. And yes, it's technically redundant
with a catch-all handler.

5) it's not clear from your email what is done with the result value of the landingpad instruction,
but I presume that your intent is that this does not change from the current scheme where
the "llvm.eh.typeid.for()" is called and its result is compared with the landingpad instruction's
result...

...and then a miracle happens in CodeGen, and most of the intrinsics are thrown away and the
hard register contents at the resumption at a landingpad from an Unwind include the value that
llvm.eh.typeid.for() would have returned...

The miracle is just that llvm.eh.typeid.for are replaced with constant values
after all interprocedural optimizations are finished. Unfortunately, since
the range of constants is global over the function, there is no other
reasonable way to do this while maintaining correctness across inlining
and dead code elimination.

Also, what is going to happen for the case of cleanup AND catches, currently the result of not
only the llvm.eh.select() result is cached, but in fact the complete decoding of it relative to
all the llvm.eh.typeid.for() calls is cached, then the cleanup code executed, THEN finally the
already decoded value is used to "switch" from the landing pad to the correct catch-block.

who is going to generate all that code, is it still going to be explicit in the IR, or is CodeGen going
to now be responsible creating it.

It will still be explicit in the IR.

6) it would be nice if the existing UnwindInst could be retained. I wince at naming an instruction
"Resume" since in the English language it is so ambiguous (resume normal execution following
the conclusion of handling an exception, versus resume throwing an exception). I.e., cosmetics
do matter.

I would be fine with still calling resume "unwind", but the new instruction
does need to carry extra information.

7) there are still lots of other intrinsics/routines involved:
  __cxa_allocate_exception
  __cxa_throw, __cxa_rethrow
  __cxa_begin_catch(), __cxa_end_catch
although these particular ones seem to be the easiest to document as they do seem to be
translated verbatim (no CodeGen miracles).

These are not intrinsics, and it's not our responsibility to document them.
If you're borrowing the Itanium C++ EH routines to implement exceptions
in your own language, then you need to understand how Itanium C++ EH
works, and you should read their documentation.

8) I really like the idea of "terminate" being one of the options to the landingpad
instruction, it makes identification of abnormal code more direct (otherwise control-
flow analysis has to be done to see if __terminate() is reachable to conclude that
something is abnormal code, and I really don't like that analysis, it seems too error-
prone as __terminate() might be reachable for other reasons (not that I have come
up with such a scenario yet, but I think I might be able to), and this conclusion would
then be ambiguous).

__gxx_personality_v0 can only do its special-case terminate encoding
in the LSDA if that's the only possible handler. That means that, for
correctness under inlining, front-ends targeting that personality will
still always need their landing pads to contain explicit calls to
std::terminate().

John.

It should not be called "unwind" since it is different than the old thing. I would be supportive of "resume_unwind" or something like that though.

-Chris

For what it's worth, it serves exactly the same purpose as the old thing, except that it is actually possible to implement reliably.

John.

The documented semantics of the old thing was that it "starts a new unwinding process". The new thing is much more reasonably a "resume an inflight unwinding process".

-Chris

1) it is good to see that the "exception regions" idea has been abandoned, it is mathematically
inconsistent with modern optimization theory, and at best would require extra passes to translate
into/outof that representation form.

Yeah. I didn't want to obscure the main proposal by inappropriate nomenclature.

3.b) I have been thinking about other possible control-flow-graph invariants of the
landingpad blocks and the catch blocks that they lead to, but so far have not come up
with very much. I wonder if anyone else is thinking about this...?

for example cleanups come before __cxa_begin_catch, but it isn't clear what is a cleanup
and what isn't other than what comes before a __cxa_begin_catch and what comes after ?

As John mentioned, the EH representation is independent (and indeed ignorant) of things like this. The front-ends need to generate the correct code.

4.b) it is not at all clear from your write up what the "cleanup" option for a landingpad is, and
how this is used when both cleanups AND catches are necessary in a given try-catch source
code statement, including if one of the user specified catches is a catch-all.

It's something I noticed from GCC's exception handling tables. If there's a cleanup that's been inlined, then even if that cleanup has "catches", it still is marked as a cleanup. As John mentioned, it's so that the personality function knows to stop at that function to run the cleanup.

5) it's not clear from your email what is done with the result value of the landingpad instruction,
but I presume that your intent is that this does not change from the current scheme where
the "llvm.eh.typeid.for()" is called and its result is compared with the landingpad instruction's
result...

The values the landingpad returns are those that are set by the personality function upon re-entry into the function. On X86, it's the EAX and EDX registers. One of those values is a pointer to the exception handling object. The other is a "selector" value, that we can then use to determine which (if any) of the clauses should be run.

...and then a miracle happens in CodeGen, and most of the intrinsics are thrown away and the
hard register contents at the resumption at a landingpad from an Unwind include the value that
llvm.eh.typeid.for() would have returned...

The llvm.eh.typeid.for is a hold-over from the old design. It returns a constant value that can be compared against the "selector" the personality function returns. It remains because it gives an explicit representation of how the decision table of which catch to call is executed. It's similar to a series of if-then-elses.
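
The "series of if-then-elses" can be sketched in plain C++. This is only an illustration under assumptions: `LandingPadResult`, `CatchType`, `typeIdFor`, `dispatch`, and the numeric type ids are hypothetical stand-ins for the exception-pointer/selector pair and for llvm.eh.typeid.for, not LLVM APIs.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the two values the personality function hands
// back on re-entry: a pointer to the exception object and a selector.
struct LandingPadResult {
    void *exceptionObject;
    int selector;
};

// Hypothetical stand-in for llvm.eh.typeid.for: maps a catch clause's type
// to the constant the personality function would report for it. The
// numbering is made up for this example.
enum class CatchType { Int, StdException };

constexpr int typeIdFor(CatchType t) {
    return t == CatchType::Int ? 1 : 2;
}

// The comparison chain is explicit: each catch clause compares the selector
// against the type id for its own type, in source order.
std::string dispatch(const LandingPadResult &lp) {
    assert(lp.selector >= 0 && "negative selectors (filters) are not modelled");
    if (lp.selector == typeIdFor(CatchType::Int))
        return "catch (int)";
    if (lp.selector == typeIdFor(CatchType::StdException))
        return "catch (std::exception&)";
    // No clause matched: only cleanups run, then unwinding resumes.
    return "resume";
}
```

In this sketch a selector of 0 stands for "no catch clause matched", so `dispatch` falls through to the resume case.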

this is the sort of thing I'm talking about when I imply that the current scheme is poorly documented!

Indeed! And one of the outcomes will be much better documentation.

Also, what is going to happen for the case of cleanup AND catches, currently the result of not
only the llvm.eh.select() result is cached, but in fact the complete decoding of it relative to
all the llvm.eh.typeid.for() calls is cached, then the cleanup code executed, THEN finally the
already decoded value is used to "switch" from the landing pad to the correct catch-block.

who is going to generate all that code, is it still going to be explicit in the IR, or is CodeGen going
to now be responsible creating it.

It will be explicit in the IR, as it is now. :-)

6) it would be nice if the existing UnwindInst could be retained. I wince at naming an instruction
"Resume" since in the English language it is so ambiguous (resume normal execution following
the conclusion of handling an exception, versus resume throwing an exception). I.e., cosmetics
do matter.

The UnwindInst carries too much history with it to remain, and the new behavior is different from that of the 'unwind' instruction. I agree with Chris that a mix of something like "resumeunwind" would make more sense.

7) there are still lots of other intrinsics/routines involved:
  __cxa_allocate_exception
  __cxa_throw, __cxa_rethrow
  __cxa_begin_catch(), __cxa_end_catch
although these particular ones seem to be the easiest to document as they do seem to be
translated verbatim (no CodeGen miracles).

It would involve hard-coding language-specific calls and ABIs into LLVM. That's something we try to avoid.

8) I really like the idea of "terminate" being one of the options to the landingpad
instruction, it makes identification of abnormal code more direct (otherwise control-
flow analysis has to be done to see if __terminate() is reachable to conclude that
something is abnormal code, and I really don't like that analysis, it seems too error-
prone as __terminate() might be reachable for other reasons (not that I have come
up with such a scenario yet, but I think I might be able to), and this conclusion would
then be ambiguous).

Even if support for the terminate option required a new __llvm_personality_v0 and
a new Unwind library function, I am still in favor of having and using it. But I suspect
that CodeGen can lower this into the same old MC branch to a block that only contains
__terminate() that we currently see in IR, and a new personality and Unwind
aren't necessary, but would still be a nice optimization.

After we get the basic functionality down, we can discuss further changes like this. It's good to keep it in mind, though. As John mentioned, it may not be suitable in all cases, but for some it's a potential win.

-bw

John,
I’m not able to figure out what you’re really trying to say here. I am suggesting that
there be a unique function that libUnwind calls in the event it detects that an exception
is going to go uncaught all the way out past main, and that the user be able to set a
break-point on that function (it could be the existing function “terminate”, or a new one
created just for this one purpose), so that the stack can be examined before it gets
unwound.

I’m not sure what you mean by “hooking”, “interrupting”, or “controlling”. I am just
suggesting to be allowed to set a break-point on some unique function.

I finally dug deeper into the issue and figured out this is actually a problem with DWARF
encoding, or the way that type info is encoded by GCC into DWARF and decoded by
__gcc_personality from DWARF. (there is a comment somewhere in the LLVM documentation
that IIRC seems to imply the problem is with the Types parameters to llvm.eh.select, but
that is incorrect, the problem is deeper than that, it is with the underlying DWARF tables).

In short the problem is that there is an ambiguity between a cleanup handler having
an Action Table entry that looks like
  .byte 1 ;; Type = 1 (i.e. entry #1 in the Types Table)
  .byte 0 ;; Next = 0 (i.e. none; this is the list terminator for this try-statement)
together with a corresponding Types Table entry #1 that looks like
  .long 0 ;; RTTI pointer == NULL

and a user explicit try-catchall statement which also contains the exact same DWARF
encoding.

Instead a user explicit catch-all should have an explicit entry in the Types Table (perhaps
“void” could be the “user explicit match anything” marker) rather than containing the NULL value.

Right now “cleanups” look to __gcc_personality exactly like “user explicit catch-all”, so there
is no way for __gcc_personality to tell that something will not be caught (if an exception will
only go through cleanups all the way out past main, it “looks” to __gcc_personality that it is
actually being caught by "catch-all"s, so __gcc_personality currently cannot figure this out
until after the stack is entirely unwound and the user is then SOL).
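
The argument above can be illustrated with a toy model of the search phase of a two-phase unwind. This is a sketch under assumptions, not libUnwind's implementation: `FrameAction` and `searchPhase` are hypothetical names, and each frame is reduced to a single summary of what its LSDA says about the active call site.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-frame summary of the LSDA entry for the active call site.
enum class FrameAction { None, Cleanup, Handler };

// Phase 1 (search) over frames ordered from innermost to outermost.
// Returns the index of the first frame that has a handler, or -1 if the
// exception would propagate past the outermost frame uncaught.
// Cleanup-only frames are skipped here, and no frame is unwound yet.
int searchPhase(const std::vector<FrameAction> &frames) {
    for (std::size_t i = 0; i < frames.size(); ++i)
        if (frames[i] == FrameAction::Handler)
            return static_cast<int>(i);
    return -1;
}
```

If cleanups are recorded as `Cleanup`, a stack of cleanup-only frames yields -1 before anything is unwound, which is the point at which a debugger breakpoint would be useful. If they were instead recorded as `Handler` (the catch-all encoding described above), the search would stop at the first such frame and the uncaught case would go unnoticed until later.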

Peter Lawrence.

Bill,
something to laugh about…

I had originally mis-read the llvm eh doc concerning llvm.eh.selector and llvm.eh.typeid.for,

they are clearly documented as returning something like an index into a type table (in our
case specifically a DWARF Types Table index),

but I had mis-read llvm.eh.selector to mean it returned the index / ordinal of which parameter
in its parameter list (not which type in the Type Table) was a match.

this led to substantial confusion on my part about what “magic” was taking place during CodeGen.

-Peter Lawrence.

This is not a cleanup. In the LSDAs for GCC's family of personalities, cleanups
by definition have a zero index into the types table. I think I see your confusion,
though.
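
The distinction can be sketched with a simplified model of the tables. Assumptions are labeled: `ActionRecord`, `TypesTable`, and `classify` are hypothetical names, records are addressed by 1-based index rather than by byte offset, and the SLEB128 encoding used in real LSDAs is not decoded here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical model of one action record.
struct ActionRecord {
    int typeFilter;  // 0 = cleanup, >0 = 1-based index into the types table
    int next;        // 0 = end of the chain (chains are not followed here)
};

// A null entry models ".long 0" in the types table, i.e. a catch-all.
using TypesTable = std::vector<const void *>;

// callSiteAction: 0 means the call site has no action record (cleanup only);
// otherwise it is treated as a 1-based index into `actions` (a simplification).
std::string classify(int callSiteAction,
                     const std::vector<ActionRecord> &actions,
                     const TypesTable &types) {
    if (callSiteAction == 0)
        return "cleanup";
    assert(callSiteAction > 0 &&
           static_cast<std::size_t>(callSiteAction) <= actions.size());
    const ActionRecord &rec = actions[callSiteAction - 1];
    if (rec.typeFilter == 0)
        return "cleanup";
    if (rec.typeFilter < 0)
        return "filter";  // exception specification, not modelled further
    assert(static_cast<std::size_t>(rec.typeFilter) <= types.size());
    return types[rec.typeFilter - 1] == nullptr ? "catch-all" : "typed catch";
}
```

With one action record `{1, 0}` and a types table holding a single null entry, a call site whose action is 1 classifies as a catch-all, while a call site whose action is 0 classifies as a cleanup.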

LLVM-GCC and Clang used to share a bug where cleanups would sometimes
be emitted as catch-alls as a workaround to a flaw in the inliner. I fixed this
flaw (or to be specific, I worked around it to the best of my ability given the
constraints of the broken eh.exception/eh.selector representation) sometime
in June, but part of the fix requires the front-end to emit different code, and I've
only updated Clang to do so because I can't touch dragonegg and LLVM-GCC
is dead.

AFAIK, GCC never had this bug.

John.

John,
I'm still not sure what you're talking about, I have included the assembly
output from two compilations, one with a user explicit catch-all, one with only an
implicit cleanup, the DWARF Action Table and Types Table are absolutely identical,
as are the indexes used to reference the Action Table from the region maps.

-Peter Lawrence.

What compiler are you talking about, and on what platform?

The results I'm seeing clearly have both gcc and clang on Darwin generating
different LSDAs for your cleanup examples and your catch-all examples.

Here is the output I see from gcc-4.2 for your cleanup example:

  .text
.globl __Z3barv
__Z3barv:
LFB2:
  pushq %rbp
LCFI0:
  movq %rsp, %rbp
LCFI1:
  pushq %rbx
LCFI2:
  subq $40, %rsp
LCFI3:
  leaq -17(%rbp), %rdi
  call __ZN3BobC1Ev
  leaq -18(%rbp), %rdi
  call __ZN3BobC1Ev
  leaq -19(%rbp), %rdi
  call __ZN3BobC1Ev
LEHB0:
  call __Z3foov
LEHE0:

<snip>

  .section __TEXT,__gcc_except_tab
GCC_except_table0:
LLSDA2:
  .byte 0xff
  .byte 0xff
  .byte 0x3
  .byte 0x1a
  .set L$set$0,LEHB0-LFB2 # from
  .long L$set$0
  .set L$set$1,LEHE0-LEHB0
  .long L$set$1
  .set L$set$2,L6-LFB2
  .long L$set$2
  .byte 0x0

i.e. the range of instructions covering the call to foo() has an action table
index of 0, meaning a cleanup.

Here is the output of ToT clang on this code:

__Z3barv: ## @_Z3barv
Ltmp5:
  .cfi_startproc
  .cfi_personality 155, ___gxx_personality_v0
Leh_func_begin0:
  .cfi_lsda 16, Lexception0
## BB#0: ## %entry
  pushq %rbp
Ltmp6:
  .cfi_def_cfa_offset 16
Ltmp7:
  .cfi_offset %rbp, -16
  movq %rsp, %rbp
Ltmp8:
  .cfi_def_cfa_register %rbp
  subq $80, %rsp
  leaq -8(%rbp), %rdi
  callq __ZN3BobC1Ev
  leaq -16(%rbp), %rdi
  callq __ZN3BobC1Ev
  leaq -24(%rbp), %rdi
  callq __ZN3BobC1Ev
Ltmp0:
  callq __Z3foov
Ltmp1:

<snip>

  .section __TEXT,__gcc_except_tab
  .align 2
GCC_except_table0:
Lexception0:
  .byte 255 ## @LPStart Encoding = omit
  .byte 155 ## @TType Encoding = indirect pcrel sdata4
  .byte 156 ## @TType base offset
  .space 1
  .byte 3 ## Call site Encoding = udata4
  .byte 26 ## Call site table length
                                        ## >> Call Site 1 <<
                                        ## Call between Ltmp0 and Ltmp1
                                        ## jumps to Ltmp2
                                        ## On action: cleanup
Lset0 = Ltmp0-Leh_func_begin0
  .long Lset0
Lset1 = Ltmp1-Ltmp0
  .long Lset1
Lset2 = Ltmp2-Leh_func_begin0
  .long Lset2
  .byte 0

Same thing.

For contrast, here is the result from gcc-4.2 on your catch-all code:

.globl __Z3barv
__Z3barv:
LFB2:
  pushq %rbp
LCFI0:
  movq %rsp, %rbp
LCFI1:
LEHB0:
  call __Z3foov
LEHE0:

<snip>

.section __TEXT,__gcc_except_tab
  .align 2
GCC_except_table0:
LLSDA2:
  .byte 0xff
  .byte 0x9b
  .byte 0x25
  .byte 0x3
  .byte 0x1a
  .set L$set$0,LEHB0-LFB2
  .long L$set$0
  .set L$set$1,LEHE0-LEHB0
  .long L$set$1
  .set L$set$2,L6-LFB2
  .long L$set$2
  .byte 0x1 # <-- a non-zero index into the action table
  .set L$set$3,LEHB1-LFB2
  .long L$set$3
  .set L$set$4,LEHE1-LEHB1
  .long L$set$4
  .long 0x0
  .byte 0x0
  .byte 0x1 # <-- first entry (index=1) in the action table
  .byte 0x0
  .align 2
  .long 0 # <-- the first entry (index=1) in the types table, a catch-all

And from ToT Clang:

__Z3barv: ## @_Z3barv
Ltmp5:
  .cfi_startproc
  .cfi_personality 155, ___gxx_personality_v0
Leh_func_begin0:
  .cfi_lsda 16, Lexception0
## BB#0: ## %entry
  pushq %rbp
Ltmp6:
  .cfi_def_cfa_offset 16
Ltmp7:
  .cfi_offset %rbp, -16
  movq %rsp, %rbp
Ltmp8:
  .cfi_def_cfa_register %rbp
  subq $32, %rsp
Ltmp0:
  callq __Z3foov
Ltmp1:

  .section __TEXT,__gcc_except_tab
  .align 2
GCC_except_table0:
Lexception0:
  .byte 255 ## @LPStart Encoding = omit
  .byte 155 ## @TType Encoding = indirect pcrel sdata4
  .byte 162 ## @TType base offset
  .space 2,128
  .space 1
  .byte 3 ## Call site Encoding = udata4
  .byte 26 ## Call site table length
                                        ## >> Call Site 1 <<
                                        ## Call between Ltmp0 and Ltmp1
                                        ## jumps to Ltmp2
                                        ## On action: 1
Lset0 = Ltmp0-Leh_func_begin0
  .long Lset0
Lset1 = Ltmp1-Ltmp0
  .long Lset1
Lset2 = Ltmp2-Leh_func_begin0
  .long Lset2
  .byte 1
                                        ## >> Call Site 2 <<
                                        ## Call between Ltmp1 and Leh_func_end0
                                        ## has no landing pad
Lset3 = Ltmp1-Leh_func_begin0
  .long Lset3
Lset4 = Leh_func_end0-Ltmp1
  .long Lset4
  .long 0
  .byte 0
                                        ## >> Action Record 1 <<
                                        ## Catch TypeInfo 1
                                        ## No further actions
  .byte 1
  .byte 0
                                        ## >> Catch TypeInfos <<
  .long 0 ## TypeInfo 1

John.

John,
I have been using the first official release llvm-2.9 tarball and a clang from very shortly after that,
built for my PowerPC-Apple-Darwin,

I am a bit surprised that this would have changed since then, but pleasantly surprised nevertheless.

Peter Lawrence.

Hi Peter,

It’s inconsistencies like this that prompted the rewrite. :-)

-bw