Implementing try/catch/finally

I'd be interested if anyone has some advice on the best way to represent a try/catch/finally statement in LLVM IR.

Assume for the moment that we're using the Python semantics for try/catch. According to the Python language specification, the 'finally' clause is executed whenever the flow of control leaves the 'try' block.

After the 'finally' clause has finished, the flow of control will continue at different points depending on how the 'finally' block was entered. There are basically 5 different cases:

  -- The flow of control fell off the end of the try body, or the exception was caught. After 'finally', execution continues at the statement following the whole try statement.
  -- The exception was not handled. After 'finally', the exception is re-thrown.
  -- A return statement was executed within the try block. After 'finally', the function returns.
  -- A break statement was executed within the try block, and the innermost enclosing loop is outside of the try block. After 'finally', execution continues at the statement after that loop.
  -- As above, but a continue statement. After 'finally', execution continues with the next iteration of that loop.
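
The five cases above can be observed directly in Python. This is only a small sketch: the `Unhandled` class, the action names, and the `run` helper are made up for illustration, and the trace is passed in so it survives the re-thrown exception.

```python
class Unhandled(Exception):
    """Hypothetical exception type that no 'except' clause below catches."""


def run(actions, trace):
    """Execute one try/except/finally per action and record the order of events."""
    for action in actions:
        try:
            trace.append("try:" + action)
            if action == "caught":
                raise KeyError(action)      # case 1 (exception is caught below)
            if action == "raise":
                raise Unhandled(action)     # case 2 (re-thrown after 'finally')
            if action == "return":
                return trace                # case 3
            if action == "break":
                break                       # case 4
            if action == "continue":
                continue                    # case 5
            # any other action falls off the end of the try body (case 1)
        except KeyError:
            trace.append("caught")
        finally:
            trace.append("finally")         # runs on every one of the exits
        trace.append("after")               # statement after the try statement
    trace.append("end")                     # statement after the loop
    return trace
```

In every trace, "finally" appears exactly once per try body entered; what comes after it depends on how the try block was left.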

So in the most complex case, the basic block at the end of the 'finally' statement may have as many as 5 possible successors (or more: if there are multiple return statements within the try body, and you don't feel like messing with phi nodes, then it may be easier to treat each return statement as a separate successor).

One approach would be to simply duplicate the code in the 'finally' block for each exit, but that seems sub-optimal. It would be better, I think, to set a state variable before entering the 'finally' block, and then have it do a switch instruction at the end and transfer to the appropriate block.
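
A rough sketch of the state-variable idea, written as a hand-lowered control-flow graph in Python rather than LLVM IR. Each branch of the `if` chain stands for a basic block, `exit_kind` is the state variable, and the chain at the end of the "finally" block plays the role of the switch instruction. All names here (`Exit`, `lowered`, the block labels, the action strings) are invented for the illustration.

```python
from enum import Enum, auto


class Exit(Enum):
    FALLTHROUGH = auto()
    RETURN = auto()
    BREAK = auto()
    CONTINUE = auto()
    RETHROW = auto()


class Unhandled(Exception):
    """Hypothetical exception type standing in for an unhandled exception."""


def lowered(actions, trace):
    """Loop over actions; each iteration is a try/finally with one shared 'finally' block."""
    kinds = {"return": Exit.RETURN, "break": Exit.BREAK,
             "continue": Exit.CONTINUE, "raise": Exit.RETHROW}
    i, block, exit_kind, pending, action = 0, "loop_head", None, None, None
    while True:
        if block == "loop_head":
            if i < len(actions):
                action, block = actions[i], "try_body"
            else:
                block = "loop_exit"
        elif block == "try_body":
            trace.append("try:" + action)
            exit_kind = kinds.get(action, Exit.FALLTHROUGH)   # set the state variable
            if exit_kind is Exit.RETHROW:
                pending = Unhandled(action)                   # exception in flight
            block = "finally"
        elif block == "finally":
            trace.append("finally")                           # the only copy of the cleanup
            # The "switch" on the state variable: one successor per exit kind.
            if exit_kind is Exit.FALLTHROUGH:
                block = "after_try"
            elif exit_kind is Exit.RETURN:
                return trace
            elif exit_kind is Exit.BREAK:
                block = "loop_exit"
            elif exit_kind is Exit.CONTINUE:
                block = "loop_next"
            else:
                raise pending                                 # re-throw
        elif block == "after_try":
            trace.append("after")
            block = "loop_next"
        elif block == "loop_next":
            i += 1
            block = "loop_head"
        else:  # "loop_exit"
            trace.append("end")
            return trace
```

The cleanup code exists once, at the cost of one extra store before entering it and one multi-way branch when leaving it.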

But I wonder if perhaps it couldn't be better than that. It would seem more efficient, I would think, to be able to pass into the finally block the address of where to resume execution. Or treat it like a kind of local subroutine (i.e. not a full-fledged function with its own local variables and stack frame, but more like a simple jsr instruction.)
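
To make the "pass in the resume address" variant concrete, here is a minimal sketch in the same simulated-basic-block style. Instead of a state variable plus a switch, the code that enters the 'finally' block stores the label of the block to continue at, and the 'finally' block ends with an indirect jump to it. The function name, the block labels, and the return value 42 are all hypothetical. In LLVM IR this would presumably need something like an indirect branch (LLVM versions that have `blockaddress` and `indirectbr` can express one); that mapping is an assumption on my part, not something established in this thread.

```python
def lowered_indirect(action, trace):
    """One try/finally whose 'finally' block jumps to a stored resume target."""
    block, resume, result = "try_body", None, None
    while block != "exit":
        if block == "try_body":
            trace.append("try:" + action)
            if action == "return":
                result = 42               # value of the return statement (made up)
                resume = "do_return"      # where 'finally' should continue
            else:
                resume = "after_try"
            block = "finally"
        elif block == "finally":
            trace.append("finally")       # single copy of the cleanup code
            block = resume                # indirect jump, like the 'ret' of a jsr
        elif block == "after_try":
            trace.append("after")
            block = "exit"
        else:  # "do_return"
            block = "exit"
    return result
```

Compared with the switch version, the dispatch at the end of 'finally' is a single jump, but the set of possible targets is no longer visible in the 'finally' block itself.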

Anyway, just musing on different possibilities and wondering if anyone has any suggestions...

Talin wrote:

One approach would be to simply duplicate the code in the 'finally'
block for each exit, but that seems sub-optimal. It would be better, I
think, to set a state variable before entering the 'finally' block, and
then have it do a switch instruction at the end and transfer to the
appropriate block.

I think gcc just duplicates the block (see tree-eh.c). I suggest that
to start with you do the same, because it is simple to do, and then once
everything is working well look into improving the code quality.
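
For comparison, a minimal sketch of what the duplicating approach amounts to, written as plain Python with the cleanup copied onto each exit path. This is only an illustration of the idea, not a description of what tree-eh.c actually emits; the function and action names are made up, and the exception path (which would need its own copy in a landing pad) is left out.

```python
def duplicated(actions, trace):
    """Loop whose body is a try/finally, with the 'finally' code copied per exit."""
    for action in actions:
        trace.append("try:" + action)
        if action == "return":
            trace.append("finally")    # copy 1: before the return
            return trace
        if action == "break":
            trace.append("finally")    # copy 2: before the break
            break
        if action == "continue":
            trace.append("finally")    # copy 3: before the continue
            continue
        trace.append("finally")        # copy 4: normal fall-through
        trace.append("after")
    trace.append("end")
    return trace
```

No state variable or dispatch is needed, and each copy has exactly one successor, which is what makes this the simple starting point.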

Ciao,

Duncan.

Interesting.

After doing some more research, I noticed that Java uses a jsr to implement 'finally', at least according to this document:

  http://java.sun.com/docs/books/jvms/second_edition/html/Compiling.doc.html

-- Talin

The jsr is there to prevent the code duplication mentioned above.
(The JVM could handle it via code duplication, too.)

Another way to compile (pseudocode)
  try
    a
  catch
    e1 -> h1
    e2 -> h2
  finally
    f

might be

  code for a
  _f: code for f
  ret

  _h1: code for h1
  jmp _f

  _h2: code for h2
  jmp _f

  exception table:
    Code for a, type e1: _h1
    Code for a, type e2: _h2

This would even keep the normal a-f-return flow in line, with no code
duplication - with the constraint that you have an exception table that
maps code ranges and exception types to jump addresses. (I don't know
what mechanism LLVM uses, but I'd expect it's similar enough.)
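
A small sketch of the kind of table lookup this assumes: each entry maps a code range and an exception type to a handler label. The offsets, the `Entry` layout, and the use of `KeyError` and `ValueError` for e1 and e2 are invented for the example; this is not how LLVM or any particular runtime lays out its tables.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    start: int        # first covered code offset (inclusive)
    end: int          # one past the last covered offset (exclusive)
    exc_type: type    # exception type this entry handles
    handler: str      # label to jump to


# "Code for a" is assumed to occupy offsets 0..9.
TABLE = [
    Entry(0, 10, KeyError, "_h1"),     # Code for a, type e1: _h1
    Entry(0, 10, ValueError, "_h2"),   # Code for a, type e2: _h2
]


def find_handler(table, pc, exc) -> Optional[str]:
    """Return the handler label for an exception raised at offset pc, if any."""
    for entry in table:
        if entry.start <= pc < entry.end and isinstance(exc, entry.exc_type):
            return entry.handler
    return None   # no entry matches: the exception propagates to the caller
```

With only these two entries, an exception of any other type leaves the function without running f, so a real table would presumably also need a catch-all entry for the range of a that runs f and then re-throws.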

Regards,
Jo

Hi Talin,

Talin wrote:

Interesting.

After doing some more research, I noticed that Java uses a jsr to implement 'finally', at least according to this document:

  http://java.sun.com/docs/books/jvms/second_edition/html/Compiling.doc.html

Indeed, and you can still look at what I did in vmkit (files lib/JnJVM/VMCore/JavaJITOpcodes.cpp and JavaJIT.cpp).

Nicolas

After doing some more research, I noticed that Java uses a jsr
to implement 'finally', at least according to this document:

  http://java.sun.com/docs/books/jvms/second_edition/html/Compiling.doc.html

In Java 5 most compilers stopped using jsr and now just duplicate the
code in the finally block. It turns out that there are various odd
cases that mess up verification with jsr. ISTR that jsr is deprecated
but I didn't look to see if that is really true.

Tom