Exception Handling Proposal - Second round

Hi all,

Following John's, Duncan's and Bill's proposals about exception
handling, I thought I'd summarise what has been discussed so far.

** The problems we're trying to solve are:

P1. Different languages have different EH concepts and the IR needs to
be as agnostic as possible about that
P2. Inlining and optimisations (currently) destroy the EH semantics
and produce code that can't unwind
P3. Clutter in the IR representation of EH leads to unnecessary
complexity when optimising or inlining
P4. The back-end should have a simple and unified representation on
which to build (different) EH tables

** The key facts I've collected after re-reading all emails are:

F1. There are different families of EH (zero-cost, SjLj, etc.) and
they should have similar IR representations
F2. Back-ends should know how to implement them or bail out (thus,
representation should be *clear*)
F3. Optimisations should make sure unwinding and normal flow do not overlap
F4. Avoid artificial impositions on basic-block behaviour and
dependency to simplify optimisations
F5. We *must* keep the unwind actions and the order in which they
execute when inlining
F6. Some instructions (such as divide in Java) can throw exceptions
without an explicit dispatch mechanism

There are two quasi-orthogonal proposals to change the EH mechanism:
- Duncan Sands', regarding rules on how to protect the dispatch
mechanism (and preserve actions and their orders) when inlining or
optimising code, and
- Bill Wendling's IR simplification using the "dispatch" mechanism to
better express unwinding flow and ease inlining and optimisations

** Proposal 1: Rules on how to protect the unwind flow (P2, F3, F4, F5)

Current LLVM inlining can create some unreachable blocks that get
optimised away (and shouldn't). Some languages demand that certain
clean-up areas must be executed, others that they must not. Some
libstdc++ code apparently relies on this implementation-defined
behaviour. To solve this problem, workarounds were coded to redirect
flow to catch-all regions, which created other problems, etc.

Instead of running around in circles, the following rules must be
observed when inlining/optimising:
- When inlining a dispatch area, the inlined block must resume to the
inlinee's dispatch block
- If using eh.selector, inlining should append actions to inlinee's
selector block
- Optimisers should not remove unwind actions nor change their
control flow (unless semantics is preserved)
- If we allow changes, we need to explicitly describe the semantics
or have one to rule them all
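As a rough illustration of the "append actions" rule, here is a toy C++
model. It is only a sketch: DispatchModel and inline_dispatch are
made-up names, not LLVM APIs, and the ordering (inlined code's actions
first, then those of the surrounding code) is my reading of the rule
above rather than something the proposal spells out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a selector/dispatch block: an ordered list of
// unwind actions (clean-ups, catches), in the order they must execute.
struct DispatchModel {
    std::vector<std::string> actions;
};

// Toy model of inlining: the inlined code's actions come first and the
// surrounding code's actions are appended after them. Nothing is
// dropped and nothing is reordered.
DispatchModel inline_dispatch(const DispatchModel& inlined,
                              const DispatchModel& surrounding) {
    DispatchModel merged = inlined;
    merged.actions.insert(merged.actions.end(),
                          surrounding.actions.begin(),
                          surrounding.actions.end());
    return merged;
}
```

For example, merging {"cleanup A", "catch int"} with {"cleanup B"}
gives {"cleanup A", "catch int", "cleanup B"}.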

** Proposal 2: Dispatch and basic-block markings (P3, P4, F5)

Replace eh.selector/eh.typeid with a dispatch mechanism that
explicitly lists the possible catch areas, filters and personality. The
dispatch belongs to a basic block, which needs a "landingpad" attribute
to help optimisations understand that the block is special for EH (this
might not be strictly necessary).

The general syntax of the dispatch is:

lpad: landingpad
%eh_ptr = tail call i8* @llvm.eh.exception()
dispatch region label %lpad resume to label %unwind
   catches [
     %struct.__fundamental_type_info_pseudo* @_ZTIi, label %ch.int.main
   ]
   personality [i32 (...)* @__gxx_personality_v0]

This dispatch instruction is the last instruction in its block. It
explicitly belongs to that block ("region label %lpad") and resumes
unwinding to label %unwind. It catches only INT exceptions (whatever
that means in the source language) and the personality routine that is
going to interpret it at run time is __gxx_personality_v0.
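For readers more used to source code than IR, a C++ fragment that a
front-end might lower to a dispatch like the one above could look as
follows. This is only a sketch: may_throw, inner and outer are made-up
names, and the mapping of the catch to %ch.int.main and of the outer
handler to %unwind is my reading of the syntax, not part of the
proposal. The one fixed point is that _ZTIi is the Itanium C++ ABI name
for the type info of int.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: always throws, either an int or a runtime_error.
void may_throw(bool throw_int) {
    if (throw_int) throw 7;
    throw std::runtime_error("not an int");
}

// The try region here has a single catch for int, like the single
// entry in the "catches" list of the dispatch.
std::string inner(bool throw_int) {
    try {
        may_throw(throw_int);
    } catch (int) {   // type info for int: _ZTIi
        return "ch.int";
    }
    return "no exception";
}

// Anything that inner() does not catch keeps unwinding into its caller,
// which plays the role of the "resume to" target.
std::string outer(bool throw_int) {
    try {
        return inner(throw_int);
    } catch (const std::exception&) {
        return "unwind";
    }
}
```

Calling outer(true) ends in the int handler, while outer(false) skips
it and is handled one level up.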

When optimising, passes should see the catch/clean-up blocks that are
dominated by the landing pad and keep their natural flow. When
inlining, they should be moved inside the inlinee and the "resume
label" should be the inlinee's dispatch landing pad, so the sequence
of actions (and the actions themselves) is kept intact.
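What "kept intact" means at the source level can be seen in a small C++
sketch. The names (Cleanup, callee, caller, run_unwind_demo and the
g_log vector) are made up for illustration. The point is that the
recorded order must be the same whether or not the compiler inlines
callee() into caller().

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log of the order in which unwind actions run.
static std::vector<std::string> g_log;

// A clean-up action: its destructor runs during stack unwinding.
struct Cleanup {
    std::string name;
    explicit Cleanup(std::string n) : name(std::move(n)) {}
    ~Cleanup() { g_log.push_back("cleanup:" + name); }
};

// Owns a clean-up and throws.
void callee() {
    Cleanup c("callee");
    throw 42;
}

// Owns its own clean-up; the callee's clean-up must run before this one.
void caller() {
    Cleanup c("caller");
    callee();
}

// Runs the scenario and returns the observed order of actions.
std::vector<std::string> run_unwind_demo() {
    g_log.clear();
    try {
        caller();
    } catch (int) {
        g_log.push_back("catch:int");
    }
    return g_log;
}
```

The log comes out as cleanup:callee, cleanup:caller, catch:int, and an
inliner that merges the two functions has to preserve exactly that
sequence.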

The dispatch call can also be attached to the invoke instruction,
though there were some problems with clean-ups (Bill) and it may
clutter the IR by repeating the same dispatch for many invokes in one
single try block.

I see that %eh_ptr is not used by the dispatch. How does it know
what the type of the thrown exception is?

** What was not covered

P1/F1/F2: Are these changes EH-style agnostic? Does it at least work
for Dwarf AND SjLj using the same IR representation? Do we want that
to happen?

F6: If a div instruction inside a basic block without EH unwind
information throws an exception, how does the IR represent that? Do
we create an invoke to a fake function for every instruction that
could throw? Do we put the unwind information in the basic-block? In
the dispatch instruction (like we do for region label)?

** Amount of work to do

I reckon that both changes can be done at the same time. Current work
is being done in the ARM back-end to support EHABI, which should also
be orthogonal to those changes (Anton?).

The inlining changes can be done at any time, no need to change the IR
or anything and the changes can be reused by the second proposal later
on.

The problem is that, to change the IR representation, we need to
change all front-ends that deal with exception handling (clang,
llvm-gcc, ada, python etc), and make the back-end iteratively more
robust to accept the new format, but it'd be hard to quickly
deactivate the old format.

I've seen this thread show up and die a few times, and I'm not sure
there is any pressure to do this at any given time. Is there?

cheers,
--renato

Hi Renato,

Thanks for the summary. John and I have been working a lot on our proposal. It's changed significantly since I wrote about it last. It encompasses a lot of John's requirements and fixes the main issues. The key is getting enough time to implement the ideas. As you can imagine, we're swamped here. But this issue has not been dropped at all. :-)

I'm not ready yet to submit the proposal to the LLVM community – it's still a bit rough. Some initial work seems to show that it's not bad and will be easy to implement.

-bw

This sounds like something that came out of a brainstorming session then snuck into the project requirements when it's really a separate issue. I think you can safely ignore it.

Implicit exceptions at the Java bytecode level are independent of how the compiler models trapping instructions. An optimizing compiler will lower bytecode into an IR with explicit control flow--this does not inhibit optimization, it facilitates optimization. Note that branches are no more a barrier to optimization than implicit exceptions. The difference is that an effective optimization pass already needs to handle branches regardless of the source language or whether compiling with exceptions enabled. It may currently be a weak aspect of a few LLVM passes, but that's much easier to fix than modifying the IR.
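A minimal sketch of what "explicit control flow" could look like for a
Java-style integer divide, written in C++ for illustration. The names
checked_div and div_or_default are made up, and std::domain_error merely
stands in for Java's ArithmeticException. The zero check is an ordinary
branch that any optimisation pass already has to handle.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical lowering of a Java-style "a / b": the implicit exception
// becomes an explicit branch followed by an ordinary throw.
int checked_div(int a, int b) {
    if (b == 0)
        throw std::domain_error("/ by zero");
    // In Java, INT_MIN / -1 wraps to INT_MIN without throwing; in C++
    // the same division is undefined, so it is handled explicitly.
    if (a == std::numeric_limits<int>::min() && b == -1)
        return a;
    return a / b;  // cannot trap once both checks have passed
}

// A caller with a handler: the exceptional edge is plain control flow.
int div_or_default(int a, int b, int fallback) {
    try {
        return checked_div(a, b);
    } catch (const std::domain_error&) {
        return fallback;
    }
}
```

With this shape, div_or_default(7, 2, -1) yields 3 and
div_or_default(1, 0, -1) yields the fallback -1, with no trapping
instruction or signal handler involved.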

Some code generators may emit exceptional control flow as a trapping instruction. This obviously requires runtime support but is otherwise language independent. This is best done as late as possible (think instruction scheduling) in a target specific manner. In practice, I've only seen it used for implicit null checks to shave a cycle off some loads and squeeze a tiny amount of performance out of benchmarks. In fact, hardware support for *nontrapping* loads from page zero plus safe speculation would work much better. In my opinion, implicit null checks are not worth the giant support overhead typically caused by a runtime that tries to catch SIGSEGV. I wouldn't expect a user-space signal handler to be used for integer divides, where there's no performance benefit.

-Andy

Hi Renato,

  ** The problems we're trying to solve are:

this reminds me that I promised to post my own list of EH problems...
I just need to find the time!

Ciao, Duncan.

Ah, great! It was something someone asked about and I've only heard
"don't worry about it" but never heard a full explanation, until now!
;-)

Thanks!
--renato