[RFC] Suppress C++ static destructor registration

Hi,

C++ static destructors can be problematic in multi-threaded
environments. Some of the issues users often complain about include:
1. Teardown ordering: crashes when one thread is exiting the process
and calling destructors while another thread is still running and
accessing the variables being destroyed.
2. Shared code that is compiled both as an application and as a
library. When library mode is chosen, goto (1).
3. Some projects currently override __cxa_atexit to avoid the behavior
in question.

To get around that, I propose we add a compiler option (e.g.
-fno-cxx-static-destructors) to allow clang to suppress destructor
registration (currently done via __cxa_atexit, atexit):
https://reviews.llvm.org/D22474
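
As a rough illustration of what "registration" means here: after constructing an object with static storage duration, the compiler arranges for its destructor to run at exit (via __cxa_atexit on Itanium-ABI targets). The sketch below imitates that by hand with std::atexit. The names g_name, destroy_g_name and init_g_name are hypothetical, and the register_destructor parameter only mimics the effect the proposed flag would have; it is not how the patch is implemented.

```cpp
#include <cstdlib>
#include <new>
#include <string>

using Name = std::string;

// Raw storage for the "global"; nothing here has a destructor of its own.
alignas(Name) static unsigned char g_name_storage[sizeof(Name)];
static Name* g_name = nullptr;

// Hand-written stand-in for the destructor call the compiler would register.
static void destroy_g_name() { g_name->~Name(); }

// Constructs the object and optionally registers its destructor for exit().
// Returns true if everything that was requested succeeded.
bool init_g_name(bool register_destructor) {
    g_name = new (g_name_storage) Name("foo");
    if (!register_destructor) {
        return true;  // assumption: roughly what suppressing registration amounts to
    }
    return std::atexit(destroy_g_name) == 0;
}
```

Calling init_g_name(false) leaves the string alive for the whole process lifetime, so a thread that is still running during exit() never sees it destroyed.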

I'm opening this discussion here on cfe-dev to get some feedback on the matter.

One can argue that dealing with C++ static destructors in a
multi-threaded environment is solely the responsibility of the
developer. However, since (AFAIK) we don't have any standard-guaranteed
semantics for "global destruction vs. threads", it seems fair to me
that we could give developers some option.

Cheers,

They already have options. They can use std::quick_exit, which was added
specifically to address this problem, if they don't want destructors to be
run at all. There are standard techniques to avoid destructors being run
for specific objects:

  template<typename T> union not_destroyed {
    T value;
    template<typename ...U> constexpr not_destroyed(U &&...u) :
        value(std::forward<U>(u)...) {}
    ~not_destroyed() {}
  };
  // technically has object lifetime issues
  not_destroyed<std::string> my_str("foo");

  ... or ...

  std::string &&s = *new std::string("foo");

Are these options not good enough? Is per-TU control (with no source
changes) a goal here, or is it more of an incidental property of the
solution? It seems to me that we should prefer to either push people
towards the standard std::quick_exit solution or propose an alternative
standard mechanism rather than invent our own proprietary way to work
around this problem.
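
To make the effect of the union wrapper above concrete, here is a small sketch that counts destructor calls. DtorProbe, g_destroyed and count_destructions are hypothetical names introduced only for this illustration; the not_destroyed template is the one from the message.

```cpp
#include <utility>

// Hypothetical counter bumped by every DtorProbe destructor call.
static int g_destroyed = 0;

// Hypothetical probe type with a non-trivial destructor.
struct DtorProbe {
    int id;
    explicit DtorProbe(int i) : id(i) {}
    ~DtorProbe() { ++g_destroyed; }
};

template <typename T>
union not_destroyed {
    T value;
    template <typename... U>
    constexpr not_destroyed(U&&... u) : value(std::forward<U>(u)...) {}
    ~not_destroyed() {}  // empty on purpose: `value` is never destroyed
};

// Creates one plain and one wrapped object in a scope and reports how many
// DtorProbe destructors ran when the scope ended.
int count_destructions() {
    g_destroyed = 0;
    {
        DtorProbe plain(1);                // destroyed at scope exit
        not_destroyed<DtorProbe> kept(2);  // wrapper's destructor does nothing
        (void)plain;
        (void)kept;
    }
    return g_destroyed;
}
```

Only the unwrapped object is destroyed. The same holds for objects with static storage duration, although the wrapper's own (empty) destructor is still a user-provided one.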

There are guaranteed semantics for global destruction for well-behaved programs. First, thread locals are destroyed in LIFO order. Then function-level statics are destroyed in LIFO order. Then global and class statics are destroyed.

Here are some stackoverflow responses (with links to standardese) that can be useful:

One area that I'm not clear on (and maybe this is what prompted your post) is what happens with regard to detached threads and global destruction. To be honest, I think those programs (and detached threads in general) are broken for clean program termination use cases.

I'm not opposed to adding -fno-cxx-static-destructors... but I disagree with your current justifications. Provide some concrete examples.

This doesn't sound like Bruno's use case, but at least for games, we know
they never exit (or when they do they don't care about destructors).
Avoiding emitting destructors can give a size benefit in such scenarios.
Also, it can theoretically (I have not measured this, but it doesn't sound
far-fetched) eliminate some references to global variables which in an LTO
context can allow for more aggressive optimizations.

-- Sean Silva

std::quick_exit() does not help. The destructor is in a library. The library author has no control over how other code in the process calls exit(). The authors of the app and other libraries are unaware that exit() is dangerous.

The not_destroyed wrapper is fragile. It’s easy to accidentally define a static variable that does not use this template, thereby breaking exit() again.

-Werror=exit-time-destructors complains about ~not_destroyed(), so it can’t help. Adding #pragma diagnostic around every use of not_destroyed would fix that and not be fragile, but it would be awfully ugly.

A post-link test for references to symbol __cxa_atexit might help, but only if there are no intentional static destructors anywhere and only for optimized builds.

The std::string && example didn’t compile:

test.cxx:12:26: error: rvalue reference to type ‘basic_string<[3 * …]>’ cannot bind to lvalue of type ‘basic_string<[3 * …]>’
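
For reference, one form of the "leak it on purpose" idiom that does compile binds an lvalue reference instead. This is a sketch under the assumption that this was the intent of the earlier snippet; leaked_name is a hypothetical function name.

```cpp
#include <string>

// The string lives on the heap and is never deleted, so no exit-time
// destructor is registered for it. The reference itself needs no destructor.
std::string& leaked_name() {
    static std::string& s = *new std::string("foo");
    return s;
}
```

Every call returns the same object, and the fragility concern still applies: nothing stops someone from declaring an ordinary static std::string elsewhere.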

Concrete example:

The Objective-C runtime has a global table that stores retain counts. Pretty much all Objective-C code in the process uses this table. With global destructors in place this table is destroyed during exit(). If any other thread is still running Objective-C code then it will crash.

Currently the Objective-C runtime avoids the destructor by initializing this table using placement new into an aligned static char buffer.
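
A minimal sketch of that placement-new technique follows. RetainTable is a hypothetical stand-in (the real runtime's table type is not shown in this thread), and table_storage and retain_table are likewise names invented for the example.

```cpp
#include <map>
#include <new>

// Hypothetical stand-in for the runtime's retain-count table.
struct RetainTable {
    std::map<const void*, unsigned> counts;
};

// Aligned raw storage: a plain byte array has no destructor to register.
alignas(RetainTable) static unsigned char table_storage[sizeof(RetainTable)];

// Constructs the table in the buffer on first use. The function-local static
// is only a pointer, so it is trivially destructible as well.
RetainTable& retain_table() {
    static RetainTable* table = new (table_storage) RetainTable();
    return *table;
}
```

Because ~RetainTable() is never called, threads that are still running during exit() keep seeing a live table.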

–Greg Parker gparker@apple.com Runtime Wrangler

That sounds like a nice idea. The standards-required behavior of tearing down globals is insane, and it'd sure be nice to be able to easily opt out of the insanity. It'd be nicest if it was supported in both GCC and Clang with the same flag, of course. :)

std::quick_exit is just about worthless, because there's just about no practical way that you can ever get all the code linked into your binary to stop calling exit().

IMO, the standard should just entirely remove the utterly broken global destructors "feature".

In the "interim", a compiler flag is nice because you can set it in your build system, and all your existing code just starts working better (where better = fewer random unreproducible crashes on shutdown when someone accidentally used a global with a destructor). And you don't ever have to worry about this problem again...

That sounds like a nice idea. The standards-required behavior of tearing
down globals is insane, and it'd sure be nice to be able to easily opt out
of the insanity. It'd be nicest if it was supported in both GCC and Clang
with the same flag, of course. :)

std::quick_exit is just about worthless, because there's just about no
practical way that you can ever get all the code linked into your binary to
stop calling exit().

Recompiling all the code linked into your binary with a custom compiler
flag is also not the most practical thing to ask of people. Maybe a more
practical approach would be to register an atexit handler that calls
quick_exit? :)

IMO, the standard should just entirely remove the utterly broken global
destructors "feature".

Perhaps so, but creating a non-standard dialect of C++ with different
semantics has historically proven to not be a viable way to get that
effect. This change will create portability headaches down the line, as
other flags to turn off parts of standard C++ semantics have done
(-fno-rtti, -fno-exceptions). And evidence from those suggests that the
standard never does actually get fixed, and each vendor ends up with a
slightly different mode.

If someone's prepared to write up a paper for the C++ committee exploring
what it would mean to standardize this behaviour and actually fix the
problem for everyone, it would seem extremely reasonable for clang trunk to
carry a flag to enable that behaviour (and it certainly doesn't have to
wait for the committee to respond).

In the "interim", a compiler flag is nice because you can set it in your
build system, and all your existing code just starts working better (where
better = fewer random unreproducible crashes on shutdown when someone
accidentally used a global with a destructor). And you don't ever have to
worry about this problem again...

... until you try to use a library that depends on a global destructor
running on program shutdown :) For previous similar flags, there was a
small but perpetual cost paid by every compiler vendor and by users, for
each of these features that got into mainstream use -- and I expect this
*will* enter mainstream use (it would be a very convenient solution to a
selection of important problems).

To be clear, even on PS4, we would not default this flag to on, as there
are use cases that do exit. It would just be something that we suggest
users to try if it makes sense for them.

-- Sean Silva

I also don’t like subsetting the language even more (-fno-exceptions, -fno-rtti, -fno-sized-allocation, etc), but this is another one of those C++ features that really isn’t “pay for what you use”.

The simplest way to avoid paying for static destruction is to turn it off completely.

This seems exactly like pay for what you use. If you have no static destructors, you don’t pay anything. It’s just easy to use this feature accidentally, and when you probably shouldn’t.

With RTTI and exceptions, you get object code increases even if your code can’t possibly throw, and never uses dynamic_cast or typeid. To the best of my knowledge, there is no code size increase if you don’t have a static destructor.

I can see this as a usability improvement. Say I have a class that is generally not a global, and it needs a ctor and dtor for those cases. Now I want the ctor to run at program / dynamic library initialization. This flag can make that easier to do, without dragging along the dtor as well. It has maintenance costs though.

I will agree that this has a similar feeling to the issues behind std::thread’s destructor problems (terminate, join, or detach?). In a single-threaded world, running static destructors seems like a reasonable thing to do, and generally doesn’t introduce stability problems. In a multi-threaded world, there doesn’t seem to be a safe thing to do in the general case. Destroying the objects causes problems. We can’t wait for all the other threads to finish, we can’t terminate those threads, and we can’t freeze them. The standard doesn’t talk about shared libraries either, so that makes things worse.

-Werror=exit-time-destructors complains about ~not_destroyed(), so it can’t help. Adding #pragma diagnostic around every use of not_destroyed would fix that and not be fragile, but it would be awfully ugly.

I think not_destroyed could be written differently so as not to have a non-trivial dtor. Make it more like std::optional (wrapping an object in a pointer-like API), with just a byte buffer member and no dtor declared. Then it’d be trivially destructible, have no global dtor, and be -Wexit-time-destructors clean.
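
A possible shape for such a wrapper is sketched below. The name never_destroyed and its members are hypothetical, and this is one reading of the suggestion rather than code from the patch. It stores the pointer returned by placement new so that access does not rely on casting the buffer.

```cpp
#include <new>
#include <string>
#include <type_traits>
#include <utility>

template <typename T>
class never_destroyed {
    alignas(T) unsigned char storage_[sizeof(T)];  // raw bytes, no destructor
    T* ptr_;                                       // points into storage_

public:
    // Constructs the T in place; the T's destructor is never called.
    template <typename... U>
    explicit never_destroyed(U&&... u)
        : ptr_(::new (static_cast<void*>(storage_)) T(std::forward<U>(u)...)) {}

    // Copying would alias another wrapper's buffer, so it is disabled.
    never_destroyed(const never_destroyed&) = delete;
    never_destroyed& operator=(const never_destroyed&) = delete;

    // Pointer-like access, in the style of std::optional.
    T& operator*() { return *ptr_; }
    T* operator->() { return ptr_; }
};

// No destructor is declared, so the wrapper is trivially destructible even
// when T is not.
static_assert(std::is_trivially_destructible<never_destroyed<std::string>>::value,
              "wrapper must not need an exit-time destructor");
```

The trade-off of this sketch is that the constructor is not constexpr, so a global of this type is initialized dynamically.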

Clang seems to have no problem removing the call to the empty dtor at -O1.
Using a byte buffer can have other problems, like losing the possibility of
constant initialization.

Making correctness dependent on optimizations is a bit tricky (though since the dtor does nothing, it’s not actually a correctness issue - it’s a global dtor that can’t really race with anything, etc). But that’s hard to detect in general - so we settle for a fairly blunt/defensible position of “not actually executing any code”.

Sure enough.