help me understand how nounwind attribute on functions works?

from http://llvm.org/docs/LangRef.html:

nounwind
This function attribute indicates that the function never raises an exception. If the function does raise an exception, its runtime behavior is undefined. However, functions marked nounwind may still trap or generate asynchronous exceptions. Exception handling schemes that are recognized by LLVM to handle asynchronous exceptions, such as SEH, will still provide their implementation defined semantics.

Some things I noticed by looking at various C test programs with clang -S -emit-llvm:

  • If a function definition is provided, it gets nounwind, even if it calls a non-nounwind function
  • If a function is external (no definition provided), it does not get nounwind

What I don’t understand is:

  • A nounwind function that calls a non-nounwind function: wouldn’t the ability for an exception to be thrown transfer to the caller? I thought if you don’t catch an exception, it bubbles up to the caller and so on.
  • More importantly, if compiling in C mode, why doesn’t every function have nounwind? I thought that C does not have exceptions, so calling a function and having it throw an exception would be undefined behavior anyway. Wouldn’t that mean every function should be nounwind?

Regards,
Andrew

I think this behavior is intended to allow better LTO between C and C++. Consider this kind of code:

// foo.h
extern "C" int doThing(bool canThrow);
// foo.cpp
int doThing(bool canThrow) {

  if (hadError) {
    if (canThrow) throw MyException; else return -1;
  }
}
// bar.c
#include "foo.h"
void f() {
  doThing(false); // don't declare doThing as nounwind
}
// baz.cpp

doThing(true);

Basically, compiling a C declaration for a function is not an assertion that it never throws an exception. However, defining a function in a C TU implies that it will not throw an exception.
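To make the point concrete, here is a minimal single-file C++ sketch of the same idea. It assumes a toolchain such as clang or GCC where a function with C linkage is allowed to throw; the forced `hadError` flag, the `std::runtime_error` type, the `-2` return code, and the helper names `callLikeC`, `callLikeCpp`, and `demo` are all hypothetical and only there for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the C++ definition in foo.cpp.
// extern "C" only selects C linkage; it is not a promise that the
// function never throws.
extern "C" int doThing(bool canThrow) {
    bool hadError = true;  // assumption: force the error path for the demo
    if (hadError) {
        if (canThrow) throw std::runtime_error("doThing failed");
        return -1;
    }
    return 0;
}

// What a C caller such as bar.c effectively does: it never asks for
// an exception, so nothing unwinds through it.
int callLikeC() { return doThing(false); }

// What a C++ caller such as baz.cpp may do: ask for an exception and
// catch it.
int callLikeCpp() {
    try {
        return doThing(true);
    } catch (const std::runtime_error&) {
        return -2;  // hypothetical code meaning "exception was caught"
    }
}

// Small usage check for the two call styles.
void demo() {
    assert(callLikeC() == -1);
    assert(callLikeCpp() == -2);
}
```

The same declaration serves both callers, which is why the declaration alone cannot be treated as nounwind.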

What still isn’t clear to me is: why shouldn’t this be transitive?
In the example you’re showing, for a caller of f() in bar, what is the advantage of knowing that f() is nounwind if an exception can still be thrown? What does it allow?

We know an exception cannot unwind out of f. An exception can be thrown
inside something that f calls, but it must be caught before it unwinds
beyond f.

Why? With the x86-64 ABI, for example, f is required to emit async unwind tables. It has a personality function that knows how to handle cleanups, but there are none here, so the generic unwinder will happily unwind through the code. The only ways to prevent exceptions from being propagated are to explicitly catch them, or to use something like a C++ exception specifier so that the personality function will explicitly block exception propagation.

In this example, given that doThing is not marked as noexcept, we will emit unwind tables that explicitly allow unwinding through f; however, when the unwinder tries to unwind into the next stack frame it will find that there is no unwind record. Depending on how lazy we are in defining the PC ranges in the unwind table, this will either cause stack corruption or a run-time abort from well-formed code.

David

It is my belief that all functions compiled with -fno-exceptions (which is the default for C code, but notably not the ONLY option) ought to (for QoI reasons, not a standards requirement) get the equivalent of noexcept behavior. That is: guaranteed to abort if an exception is thrown through it.

This doesn’t happen now on x86-64, due to the default async unwind tables as you mention. That used to be the case on x86-32…although it seems that async unwind tables are now on by default there too, in some cases.

Unfortunately, clang implements noexcept very inefficiently, and so doing that in clang right now would have a huge amount of overhead.

Clang implements C++ noexcept by inserting explicit catch code, which then calls terminate. GCC, on the other hand, just sets up the exception table appropriately to make the unwinder itself do that. It would be really good to fix that, but it’s my understanding that it’d be somewhat difficult in llvm’s current exceptions model.

I’ll defend the LLVM IR representation that was chosen for noexcept. The design of LLVM’s EH constructs is all about representing EH metadata as real instructions with real semantics that normal program transformations can analyze. LLVM doesn’t have any EH side tables, it’s all part of the IR. This is a huge design win.

This includes terminating after noexcept. It’s a very literal implementation: if an exception is thrown from any call site in a noexcept function, then we call terminate. This is easy to analyze, and most importantly, easy to inline. We don’t need to reason about wacky function attributes conjuring up new calls to target-specific runtime functions (i.e. what is std::terminate called on your platform) from the middle-end. If we represented noexcept as data or a function attribute, inlining would not be a simple matter of chaining return to the invoke/call successor and resume to the invoke unwind destination.
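As a rough source-level picture of that literal implementation, the sketch below pairs a noexcept function with a hand-written equivalent that catches everything and calls std::terminate. This is only an analogy for the IR that gets emitted, not the actual lowering; `parsePositive`, `viaNoexcept`, `viaExplicitHandler`, and `checkSamePath` are hypothetical names.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical callee that is not known to be nounwind.
int parsePositive(int x) {
    if (x < 0) throw std::invalid_argument("negative input");
    return x;
}

// What the programmer writes.
int viaNoexcept(int x) noexcept { return parsePositive(x) + 1; }

// Roughly what the literal implementation amounts to: an explicit
// handler that catches anything escaping the body and terminates.
int viaExplicitHandler(int x) {
    try {
        return parsePositive(x) + 1;
    } catch (...) {
        std::terminate();
    }
}

// The type system only sees the attribute on the first version.
static_assert(noexcept(viaNoexcept(0)), "declared noexcept");
static_assert(!noexcept(viaExplicitHandler(0)), "not declared noexcept");

// On the non-throwing path both versions behave identically.
void checkSamePath() {
    assert(viaNoexcept(1) == viaExplicitHandler(1));
}
```

Because the handler is ordinary code, inlining `viaExplicitHandler` into a caller needs no special reasoning about attributes, which is the design argument made above.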

For Windows EH, we had to walk some things back to turn some things that were code back into data, and by and large that has been a Bad Thing for the middle end. It likes code. It’s easy for GVN to reason about llvm.eh.typeid.for and eliminate duplicate catch clauses. We miss some of these optimizations on Windows as a result today.

Considering all that, I really think that the committee made a mistake to make throwing through noexcept terminate the program. Making it UB would have been fine. We could have done as you suggest, where throwing from a non-inlined nounwind function terminates the program as a QoI matter without pessimizing inlining of noexcept functions.

The world as it is today kind of sucks. I would say that we should just pattern match away our calls to std::terminate in the backend and emit the more compact tables, but that is actually a behavior change. It will cause cleanups between the thrown exception and the noexcept function to stop running. Changing that behavior would require an opt-out mechanism. That’s not a big deal, but what it really requires is someone who cares. Maybe you can be that person. :)

Given the semantics, aren’t we allowed to infer it transitively? I.e. if a callee of a nounwind function does dynamically unwind through the nounwind caller, then it is UB.

– Sean Silva

I don't mind changing that behavior, and the opt-out mechanism can just be
to use a catch in the source.

The easiest way to do this pattern-matching would be to just allow
landingpad to encode that a handler is a catch-all terminate handler, and
say that the semantics of that allow the handler to not actually be
executed. We do fairly similar things already with filters, I think.

John.

So perhaps a viable rule is that every CallInst in a nounwind function
can be marked as nounwind (even though the callee for said CallInst
can't be)?

-- Sanjoy

That should be an implicit assumption when a given function has the attribute. An `isCallNounwind(CallSite &C)` should be allowed to be implemented conceptually as: return C.getCalledFunction()->hasNounwindAttr() || C.getCaller()->hasNounwindAttr();
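The rule being proposed (a plain call can be treated as nounwind if either its callee or its enclosing caller is nounwind) can be sketched with a toy model. To be clear, the `Function` and `CallSite` structs below are hypothetical stand-ins and not the LLVM classes of the same name, and the sketch only models plain calls, not invokes.

```cpp
#include <cassert>

// Toy model, NOT the LLVM API: just enough structure to state the rule.
struct Function {
    bool nounwind;
};

struct CallSite {
    const Function* caller;  // function containing the call
    const Function* callee;  // null for an indirect call with unknown target
};

// A plain call cannot unwind if the callee is nounwind, or if the
// enclosing function is nounwind (unwinding out of it would be UB).
bool isCallNounwind(const CallSite& cs) {
    bool calleeNounwind = cs.callee != nullptr && cs.callee->nounwind;
    return calleeNounwind || cs.caller->nounwind;
}

// Usage check: a call from a nounwind function to an external one.
void demo() {
    Function f{true};
    Function external{false};
    CallSite call{&f, &external};
    assert(isCallNounwind(call));
}
```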

I’m still not sure what LLVM is doing differently for such calls, though. Why is it useful to know that a call is nounwind?
I thought it was only useful for turning invokes into calls, but what else?

Thanks,

Because nounwind functions do not have an implicit throw edge, you can
do things like:

call @readnone_nounwind()
int k = a / b;

==>

int k = a / b;
call @readnone_nounwind()

Unfortunately the readnone (or readonly) is important in LLVM today,
since the informal semantics are that readnone or readonly functions
can't exit(0) or loop forever.

-- Sanjoy
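For contrast, here is a small C++ sketch of the case where the callee does have a throw edge, so the reordering shown above would not be legal. The callee touches no memory but may unwind; hoisting the division above the call would execute a division by zero that the original program never reaches. All names (`checkDivisor`, `callThenDivide`, `safeQuotient`, `demo`) and the `-1` sentinel are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical callee with a throw edge: it rejects a zero divisor.
void checkDivisor(int b) {
    if (b == 0) throw std::invalid_argument("zero divisor");
}

// Original order: the call runs first, so the division is never
// reached when b == 0. Moving the division above the call would
// change that, which is why the call must be nounwind to reorder.
int callThenDivide(int a, int b) {
    checkDivisor(b);
    return a / b;
}

// Returns -1 (a hypothetical sentinel) when the callee unwinds.
int safeQuotient(int a, int b) {
    try {
        return callThenDivide(a, b);
    } catch (const std::invalid_argument&) {
        return -1;
    }
}

// Usage check for both the normal and the unwinding path.
void demo() {
    assert(safeQuotient(6, 3) == 2);
    assert(safeQuotient(1, 0) == -1);
}
```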

Thanks for the example!