[RFC] Better UX for Clang's unwind-affecting attributes

Hello all.

In C++, there are 3 possible behaviors when an
exception tries to escape out of a function:

  1. exception propagates into function’s caller
  2. defined behavior of immediate program termination
  3. the wild UB case, behavior is undefined.

Let’s look at obvious examples:

  1. exception propagates into caller, is caught there, and all is good: Compiler Explorer
  2. exception can not exit noexcept function, program is terminated: Compiler Explorer
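
For readers without the Compiler Explorer links at hand, the two defined cases can be sketched roughly as follows. This is only an illustration, not the code behind the links; the names `may_throw`, `call_and_catch`, and `must_not_throw` are made up for this sketch.

```cpp
#include <stdexcept>
#include <string>

// No exception specification: an exception may propagate to the caller.
int may_throw(bool should_throw) {
    if (should_throw)
        throw std::runtime_error("boom");
    return 42;
}

// Case 1: the exception propagates into the caller and is caught there.
std::string call_and_catch(bool should_throw) {
    try {
        may_throw(should_throw);
        return "no exception";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}

// Case 2: the function is noexcept. If may_throw() throws here, the
// exception cannot leave this function and std::terminate() is called.
void must_not_throw(bool should_throw) noexcept {
    may_throw(should_throw);
}
```

Calling `call_and_catch(true)` returns the exception message, while `must_not_throw(true)` would end the program via `std::terminate()`.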

Now, the third case, the wild UB case, is the most interesting one.
There are 3 clang/gcc attributes that are relevant here, let’s look at them:

  1. __attribute__((pure)): Compiler Explorer, here the fun begins.
    In clang, we get UB; in gcc, we get “exception propagates into function’s caller”.
  2. __attribute__((const)): Compiler Explorer, same as __attribute__((pure))
  3. __attribute__((nothrow)): Compiler Explorer, the behavior is consistently
    defined as immediate program termination. I do not understand why it was defined as such,
    in the sense of how is that different from plain noexcept, but at least we are consistent.
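
As a reminder of how these three attributes are spelled, here is a small sketch. The function names are hypothetical, and none of the functions throws, so the sketch deliberately stays clear of the behavior under discussion.

```cpp
// const: the result depends only on the argument values; no memory is read
// through pointers and no observable state is touched.
__attribute__((const)) int square(int x) { return x * x; }

// pure: may read memory (here, the string) but does not modify any
// observable program state.
__attribute__((pure)) int count_char(const char* s, char c) {
    int n = 0;
    for (; *s; ++s)
        n += (*s == c);
    return n;
}

// nothrow: a promise to the compiler that no exception leaves the function.
__attribute__((nothrow)) int add_one(int x) { return x + 1; }
```

The interesting question in this thread is what happens when a function declared like `square` or `count_char` does throw.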

Now, there are 3 problems:

  1. Our modelling of __attribute__((const))/__attribute__((pure)) differs from that of GCC, we add UB.
  2. We cannot ask for __attribute__((const))/__attribute__((pure)) behavior
    without it also acting as an exception barrier.
  3. We cannot separately ask for exception propagation to be UB.
    This would be a handy optimization tool, especially given how brittle our IRGen
    for case 2 (program termination) is.

Proposal:

  1. Match GCC’s implementation-defined behavior on __attribute__((pure))/__attribute__((const))
    – they should not cause UB on exception escape, nor should they cause
    immediate program termination, exceptions should be free to escape into their caller.
  2. Introduce __attribute__((nounwind)), which would be lowered into LLVM IR’s nounwind attribute;
    if an exception escapes out of a function marked with this attribute, wild UB happens.
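
A rough sketch of how user code might adopt the proposed attribute, assuming it ends up spelled `nounwind` as written above. The attribute is only a proposal here, so the sketch guards it with `__has_attribute` and falls back to nothing; the macro `RFC_NOUNWIND` and the function `checked_half` are hypothetical names.

```cpp
// Expand to the proposed attribute only if the compiler reports support
// for it; otherwise expand to nothing so the code still builds.
#if defined(__has_attribute)
#  if __has_attribute(nounwind)
#    define RFC_NOUNWIND __attribute__((nounwind))
#  endif
#endif
#ifndef RFC_NOUNWIND
#  define RFC_NOUNWIND
#endif

// Under the proposal, an exception escaping this function would be UB,
// so the body must not let anything unwind out of it.
RFC_NOUNWIND int checked_half(int x) {
    return x / 2;
}
```

On a compiler without the attribute, the macro is empty and `checked_half` is an ordinary function.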

Please note that while such UB is currently not sanitized,
I have a patch in progress to handle it:
⚙ D137381 [clang][compiler-rt] Exception escape out of an non-unwinding function is an undefined behaviour,
so we would not be introducing something that is impossible to deal with.

Roman.

CC’ing @AaronBallman @erichkeane @rjmccall @jyknight

(I’ve not formed any opinions yet, but responding with some information.)

From the GCC docs on pure:

The pure attribute prohibits a function from modifying the state of the program that is observable by means other than inspecting the function’s return value.

If a function is marked pure but throws an exception, it does not meet this requirement because of the global exception pointer object, which is observable by means other than inspecting the function’s return value. I think Clang’s behavior here is defensible (whether we should differ from GCC in how we handle the UB is another matter).

The docs on const are a bit less clear. The sentences:

Calls to functions whose return value is not affected by changes to the observable state of the program and that have no observable effects on such state other than to return a value may lend themselves to optimizations such as common subexpression elimination.

and

Because a const function cannot have any observable side effects it does not make sense for it to return void . Declaring such a function is diagnosed.

sure make it sound like changes to global state (such as the global exception pointer) are also UB.

We explicitly document that they’re equivalent: Attributes in Clang — Clang 16.0.0git documentation

We should be paying very close attention to Unsequenced functions which was adopted for C2x. The new attributes proposed there are very close to, but not quite the same as, the const and pure attributes. Matching GCC’s behavior for those attributes might be a reasonable goal, but we should be careful to ensure we’re still able to support the attributes from the standard.

FWIW, the two bullet points of the proposal are separable:
if we want to keep UB on const/pure (regardless of whether or not
that matches GCC), that’s fine, but I do think we need
the new nounwind attribute regardless.

I dunno, setting the thread-local exception pointer is arguably not a side effect in the same way. The caller cannot in fact observe it unless it has EH landing pads.

It’s better to understand throwing as a first-class alternative to returning than to try to fit it into the box of a side effect. You could argue that GCC’s documentation still precludes that, though, as a function that throws does not return.

Does GCC actually limit its optimization of pure/const calls in a way that behaves properly under the possibility of thrown exceptions, or is it just maintaining the technical ability to unwind without considering the impact on which cleanups to do? I’m not sure there’s an easy way to run that experiment; we’d need to find a place where GCC actually moves a call (e.g. hoisting it out of a loop) rather than just eliding calls in favor of an existing one (in which case preserving the EH context of the original call is the right thing to do).
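
To make the distinction between eliding and moving a call concrete, here is a small sketch. The function `lookup` and its callers are hypothetical, and `lookup` does not throw here; the comments describe what would matter if it could.

```cpp
// Hypothetical pure function: reads the table, modifies nothing.
__attribute__((pure)) int lookup(const int* table, int key) {
    return table[key];
}

// Elision: the two calls have identical arguments and no writes in between,
// so the second can be dropped in favor of the first. The surviving call
// stays where the source put it, in its original EH context.
int twice(const int* table, int key) {
    return lookup(table, key) + lookup(table, key);
}

// Code motion: the call is loop-invariant, so an optimizer may want to hoist
// it above the loop. A hoisted call would run even when n == 0, where the
// source never calls lookup() at all. If lookup() could throw, that would
// introduce an exception the original program did not have.
int repeated(const int* table, int key, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += lookup(table, key);
    return total;
}
```

An experiment along these lines would need a case where the compiler actually performs the hoist in `repeated`, not just the elision in `twice`.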

Apparently in Ada, pure functions are allowed to throw, and this should work normally if the call is not elided, but that possibility should not affect whether the call is elided.

My understanding of the GCC consensus is that pure/const attributes should follow that specification, though I’m not sure what the exact interaction is between that and -fdelete-dead-exceptions currently.


Draft patch implementing proposed RFC:
https://reviews.llvm.org/D138958
I’m probably missing some tests, but this is generally it.
I’ve added some FIXME’s in places where it isn’t obvious what we should be doing,
or that don’t //need// to be fixed right away.

Thanks for the info Jason! Hopefully the docs can be updated once the details have been hashed out.

@rjmccall while we’re here, I have a standardese question.

Given the UB we are “introducing” (more like adding a user-facing interface for),
and the spirit of the C++ standard that we discussed in
⚙ D137381 [clang][compiler-rt] Exception escape out of an non-unwinding function is an undefined behaviour,
would it be correct to say that, essentially, observable side effects live in two realms,
the normal-execution one and the exceptions/unwinding one;
that, as far as the normal realm is concerned, side effects
happening in the exceptions/unwinding realm are not observable;
and that the exception itself is not a side effect in the exceptions/unwinding realm
if its handling exhibits UB?

IOW, if an exception is thrown, is not caught, and can’t unwind,
can it time-travel to the moment of its inception
and prevent itself from being thrown in the first place?

IOW, is the following transformation legal:

#include <array>
#include <cstdio>
#include <stdexcept>
#include <cassert>

void opaque_willreturn_call();
void some_call();

// Note: __attribute__((nounwind)) is the attribute proposed in this RFC.
__attribute__((nounwind)) void src(bool ShouldThrow) {
    some_call();
    if (ShouldThrow) {
        opaque_willreturn_call();
        static constexpr size_t bufSize = 8192;
        static thread_local std::array<char, bufSize> buf;
        // Format the message into the buffer; snprintf takes (dest, size, format).
        snprintf(buf.data(), buf.size(), "Text");
        throw std::runtime_error(buf.data());
    } else {
        printf("What exception? I saw no exception\n");
    }
}

__attribute__((nounwind)) void tgt(bool ShouldThrow) {
    some_call();
    assert(!ShouldThrow);
    printf("What exception? I saw no exception\n");
}

This is interesting. Do you have a reference for the purity of Ada functions? All I was able to find in my own research is that, early on, there was an idea that Ada functions were necessarily pure because of language restrictions: Ada (in its first revisions) was pretty locked down about indirection, and functions were required to take “in” parameters (i.e. pass-by-value), so some people thought that that meant that they were pure. But Ada always had global variables, and of course it also must have had side-effectful procedures that functions could theoretically call. I can’t find any reference saying that functions had a semantic limitation of purity, or any specification for attributes that would impose that restriction.

But even if they were limited to be pure, it makes sense that they would still be allowed to throw, which works just fine under a formalization where throwing from a function simply produces a different color of result. (Also, all sorts of things are specified as throwing in Ada, including running off the end of a function without returning. It would be very un-Ada-like to restrict throwing.) And there’s nothing particularly problematic with that as long as the optimizer behavior for purity is limited to CSE / DCE and not arbitrary code motion.
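
One way to picture the "different color of result" formalization is to write the throwing path as an explicit alternative in the return type. The sketch below is only an analogy under that assumption; `Thrown`, `Outcome`, and `safe_div` are names invented for it, and it requires C++17 for `std::variant`.

```cpp
#include <string>
#include <variant>

// The "thrown" color of result, carried as data instead of via unwinding.
struct Thrown {
    std::string what;
};

// A call produces either a normal value or a Thrown outcome.
using Outcome = std::variant<int, Thrown>;

// No side effects: the same arguments always produce the same Outcome,
// whichever alternative that is.
Outcome safe_div(int a, int b) {
    if (b == 0)
        return Thrown{"division by zero"};
    return a / b;
}
```

Seen this way, reusing the result of an earlier identical call (CSE) or dropping an unused call (DCE) is sound for both alternatives, while moving the call to a place where it was not originally executed is a separate question.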

It would arguably be correct to assume that a throw site in a throwing-is-UB context is unreachable, yes. That’s the sort of optimization that tends to lead to users coming at us with pitchforks, though, and in practice it’s unlikely to be beneficial (I guess we could eliminate bounds checks in something like vector.at?) and likely to cause major debugging problems.

I don’t think this introduces some sort of two-sided analysis of side effects. The control flow of exceptions is complex and has never been as simple as “throws go to the innermost landing pad”. So it’s not that side effects associated with landing pads aren’t real, it’s that it’s not easily determinable whether landing pads are actually entered and thus whether those side effects happen. Think of landing pads as a sort of callback — who knows if the callback will be called or not?

@rjmccall thank you!

Aha! :slight_smile:

Good thing we aren’t just introducing this UB, but also providing tools to deal with it. :slight_smile:

Yes, that would be the idea. This isn’t much different from normal
elimination of “dead” code post-dominated by unreachable.
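
For comparison, this is roughly what the existing unreachable-based elimination looks like with the GCC/Clang builtin `__builtin_unreachable()`. The function `clamp_index` is a hypothetical example; the analogy to a throw site in a throwing-is-UB context is an assumption of this sketch, not something the compiler does today.

```cpp
// The branch ends in __builtin_unreachable(), so the optimizer may assume
// the condition is never true and drop both the check and the branch.
// A throw site in a throwing-is-UB context could be reasoned about the
// same way.
int clamp_index(int i, int size) {
    if (i < 0 || i >= size) {
        __builtin_unreachable();
    }
    return i;
}
```

Calling `clamp_index` with an out-of-range index is undefined behavior, which is exactly the debugging hazard mentioned earlier in the thread.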