exact semantics of 'nounwind'

Hi everyone,

Since I'm busy muddying the waters by changing how exception handling works, I thought I should ask for clarification on the exact behaviour of the current 'nounwind' attribute found on functions, calls and invokes.

I had assumed these were similar to alias-analysis facts like "doesNotAccessMemory", which is a provable property of the function or call site being analyzed. Duncan mentioned that doesNotThrow actually carries important language semantics: an unwind through it triggers a language-defined behaviour, such as calling terminate(). What happens in other languages?

Chris and I also couldn't agree on what the semantics ought to be going forward. He suggested having two bits, one to memoize an analysis proving that it can't unwind, and one to mean that an unwind triggers terminate. I happen to think that this ought to be explicitly modelled in the IR by arcing to another BB that calls terminate.
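
As a rough source-level analogy for the "explicit edge to a block that calls terminate" idea, here is a minimal C++ sketch. It is only an illustration, not LLVM IR and not a proposal for how the IR should look. The name callWithExplicitUnwindEdge and the onUnwind parameter are hypothetical; the point is just that the unwind path is written out in the code rather than implied by an attribute.

```cpp
#include <exception>

// Hypothetical helper: the unwind edge is spelled out as an explicit
// handler instead of being implied by a flag on the call.
template <typename Callee, typename OnUnwind>
void callWithExplicitUnwindEdge(Callee callee, OnUnwind onUnwind) {
    try {
        callee();      // the normal path continues after this call
    } catch (...) {
        onUnwind();    // the explicit "unwind destination"
    }
}

// Example policy for a C++-like front end: an unwind reaching this
// point ends the program.
inline void terminateOnUnwind() { std::terminate(); }
```

A front end wanting C++-like behaviour would pass terminateOnUnwind as the handler; a front end with different rules would pass something else.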

We do agree that we need crystal-clear semantics in the language, so I'm taking it to the mailing list to see if we can form a consensus.

Nick

Hi,

as a language front end developer I am a bit terrified by any "unwind here will call terminate" semantics in the IR. I'd prefer the LLVM IR to be free from any assumptions about the languages compiled to it, and this looks like C++ semantics sneaking into LLVM. Thus I'm under the impression that the terminate-calling semantics should be implemented by the front end.

> Chris and I also couldn't agree on what the semantics ought to be going
> forward. He suggested having two bits, one to memoize an analysis
> proving that it can't unwind, and one to mean that an unwind triggers
> terminate. I happen to think that this ought to be explicitly modelled
> in the IR by arcing to another BB that calls terminate.

Having two bits might be a viable compromise.

Just out of interest: how much does a will-not-throw-flag improve the generated code?

Jan

Hi,

> as a language front end developer I am a bit terrified by any "unwind
> here will call terminate" semantics in the IR.

in fact there aren't any. Suppose you mark a function call with the
'nounwind' flag. When the code generators output the DWARF exception
handling info into the object file (basically a sequence of instruction
address ranges plus a "handler" address to jump to if an exception unwinds
through an instruction in that range), they don't include that call
instruction in any address range. That's always fine if the call never
results in an exception being unwound. If an exception does try to unwind
through that call, what happens depends on the exception handling
personality function.

So what is this personality function and where does it come from? It is
provided by the front-end, which needs to output an llvm.selector intrinsic
call in the IR in the unwind destination block of any invoke call. The
llvm.selector intrinsic takes a pointer to the personality function as an
argument. Thus the C++ front-end provides a pointer to the C++ personality
function, the Ada front-end provides a pointer to the Ada personality
function, and your front-end provides a pointer to your personality
function.

When an exception unwinds, the personality function is queried by the
unwinder to decide what should be done. It is the personality function that
decides what to do if it sees that the call is not contained in any address
range. The Ada personality just keeps on unwinding in this case. The C++
personality function calls terminate. Your personality function makes its
own decision. LLVM doesn't know what the personality is going to do, and
doesn't actually care. It just provides a means of communication between
the front-end code generator and the personality function (part of the
front-end's runtime library) by noting in the unwind table whether the
front-end set the nounwind flag on a call.
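
To make the division of labour concrete, here is a minimal C++ sketch of the lookup described above. It is a simplified model under my own assumptions, not the real DWARF table layout or any actual personality routine: the names CallSiteRange, UncoveredPolicy, findHandler and personality are all hypothetical, and a real personality function also inspects the exception type and other data.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of one unwind-table entry: a half-open range of
// instruction addresses plus the handler address to jump to.
struct CallSiteRange {
    std::uintptr_t begin;
    std::uintptr_t end;      // one past the last covered address
    std::uintptr_t handler;
};

// What the personality tells the unwinder to do.
enum class Action { RunHandler, ContinueUnwind, Terminate };

// Language-specific choice for an address that no range covers.
enum class UncoveredPolicy { ContinueUnwind, Terminate };

// Returns the handler whose range covers pc, if any.
std::optional<std::uintptr_t> findHandler(
        const std::vector<CallSiteRange>& table, std::uintptr_t pc) {
    for (const CallSiteRange& range : table) {
        if (pc >= range.begin && pc < range.end) {
            return range.handler;
        }
    }
    return std::nullopt;
}

// The table only records coverage; the personality supplies the policy.
Action personality(const std::vector<CallSiteRange>& table,
                   std::uintptr_t pc, UncoveredPolicy policy) {
    if (findHandler(table, pc).has_value()) {
        return Action::RunHandler;
    }
    return policy == UncoveredPolicy::Terminate ? Action::Terminate
                                                : Action::ContinueUnwind;
}
```

In this model an Ada-like runtime would pass UncoveredPolicy::ContinueUnwind and a C++-like runtime would pass UncoveredPolicy::Terminate; the table itself is identical in both cases.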

> I'd prefer the LLVM IR
> to be free from any assumptions about the languages compiled to it and
> this looks like C++ semantics sneaking into LLVM. Thus I'm under the
> impression the calling terminate semantics should be implemented by
> the front end.

It is - by the front-end's runtime.

> > Chris and I also couldn't agree on what the semantics ought to be going
> > forward. He suggested having two bits, one to memoize an analysis
> > proving that it can't unwind, and one to mean that an unwind triggers
> > terminate. I happen to think that this ought to be explicitly modelled
> > in the IR by arcing to another BB that calls terminate.

> Having two bits might be a viable compromise.

> Just out of interest: how much does a will-not-throw-flag improve the
> generated code?

It allows you to eliminate invokes. For example, llvm-gcc outputs "sin"
with the nounwind flag set, so no call to "sin" will throw an exception.
It follows (the prune-eh pass does this) that any function which only
calls "sin" does not throw any exceptions and can be marked nounwind too.
Any function that only calls that function and "sin" cannot throw
exceptions either, and can also be marked nounwind. In this way the
nounwind attribute is propagated throughout the call graph. Any invoke
that calls one of these nounwind functions can then be turned into a
normal call, so the unwind basic block can be discarded, which leads to
further simplifications and a reduction in code size. The final effect is
quite significant for Ada, which tends to have a lot of exception handling
code.
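
The propagation can be sketched as a small fixed-point loop. This is a minimal C++ illustration of the idea only, not the prune-eh implementation. The CallGraph type and propagateNoUnwind are hypothetical names, and the sketch assumes that every function in the graph raises exceptions only through the functions it calls (no unwind of its own). It is also conservative about recursion: a function that calls itself is never marked here.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical call graph: function name -> names of the functions it calls.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Starting from functions already known not to unwind (e.g. "sin"), keep
// marking any function whose callees are all marked, until nothing changes.
std::set<std::string> propagateNoUnwind(const CallGraph& graph,
                                        std::set<std::string> noUnwind) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& entry : graph) {
            const std::string& function = entry.first;
            if (noUnwind.count(function) != 0) {
                continue;  // already marked
            }
            bool allCalleesNoUnwind = true;
            for (const std::string& callee : entry.second) {
                if (noUnwind.count(callee) == 0) {
                    allCalleesNoUnwind = false;
                    break;
                }
            }
            if (allCalleesNoUnwind) {
                noUnwind.insert(function);
                changed = true;
            }
        }
    }
    return noUnwind;
}
```

Any invoke whose callee ends up in the returned set is a candidate for being rewritten as a plain call.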

Ciao,

D.

Hi Nick,

> Since I'm busy muddying the waters by changing how exception handling
> works, I thought I should ask for clarification on the exact behaviour
> of the current 'nounwind' attribute found on functions, calls and invokes.
>
> I was thinking these would be similar to the AA analysis notes like
> "doesNotAccessMemory" which is a provable property of the function or
> call site being analyzed. Duncan mentioned that doesNotThrow is actually
> an important language semantic meaning that an unwind calls a
> language-defined behaviour such as calling terminate(). What happens in
> other languages?
>
> Chris and I also couldn't agree on what the semantics ought to be going
> forward. He suggested having two bits, one to memoize an analysis
> proving that it can't unwind, and one to mean that an unwind triggers
> terminate. I happen to think that this ought to be explicitly modelled
> in the IR by arcing to another BB that calls terminate.
>
> We do agree that we need crystal-clear semantics in the language, so I'm
> taking it to the mailing list to see if we can form a consensus.

the exotic part of nounwind semantics has now been removed. Previously the
nounwind attribute had to be carefully preserved and propagated down to
the code generators, which would put a special entry in the DWARF EH
tables, because C++ semantic correctness depended on the runtime being
informed about nounwind calls. Now it can simply mean: this has been
proved not to throw. And if it does throw, the effect is undefined.

Ciao,

Duncan.

Duncan Sands wrote:

> the exotic part of nounwind semantics has now been removed (this was that
> the nounwind attribute had to be carefully preserved and propagated down
> to the codegenerators, which would put a special entry in the dwarf eh
> tables, because C++ semantic correctness was depending on the runtime
> being informed about nounwind calls), so now it can simply mean: this
> has been proved not to throw. And if it does throw, the effect is
> undefined.

Thanks Duncan! I think this is an improvement!

Nick

Excellent, thanks Duncan!

-Chris