libc++ and large stack frames

This is a proposal to add a configuration option to disable _LIBCPP_INLINE_VISIBILITY, which causes large stack frames in non-optimized builds because it uses the always_inline attribute to force significant amounts of inlining of libc++ code.

The new configuration option could be called _LIBCPP_DISABLE_INLINE_VISIBILITY.

Note that _LIBCPP_INLINE_VISIBILITY and _LIBCPP_ALWAYS_INLINE have identical definitions. One could be renamed to the other or both could be renamed to _LIBCPP_HIDE_FROM_ABI as part of this work.

Alternatives considered and rejected:

  1. Remove attribute((always_inline)) from all include/__config macros.
  2. Add _LIBCPP_DISABLE_ALWAYS_INLINE to control whether or not always_inline is included in any __config macro.
  3. Use the existing _LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS to control both visibility and always_inline.
  4. Use the existing _LIBCPP_ABI_UNSTABLE to control _LIBCPP_INLINE_VISIBILITY.

Further background reading:

The two macros were made identical in the following commit, an effort to reduce the number of exported symbols (Dec 17, 2010):

https://github.com/llvm-mirror/libcxx/commit/2d72b1e393e35d61917d6d0ce069482ab11e96d1

Brief documentation about these macros was added later (Sep 15, 2016):

https://github.com/llvm-mirror/libcxx/commit/833d644ed1c04346d665193d1b3eac5e3ae68193

The macros were discussed here before:

http://lists.llvm.org/pipermail/cfe-dev/2015-December/046436.html

5. Start using __attribute__((internal_linkage)) instead, so a configuration option isn't necessary. See http://lists.llvm.org/pipermail/cfe-dev/2016-October/051151.html .

-Eli

Yes, that'd solve the problem, but it would also result in duplicate
definitions for all the INLINE_VISIBILITY member functions that don't get
inlined. As a consequence, __attribute__((internal_linkage)) would be a
worse solution than the one being proposed here for those who don't care
about the ABI stability guarantees that libc++ is trying to provide.
Replacing the current use of __always_inline__ with __internal_linkage__
seems like it would probably be an improvement to me (conditional on
whether the effects on code size are acceptable), but I think the change
Erik is proposing and that one are largely orthogonal.

I think it’s useful to have different names for these, since they have different intent. It would be reasonable to change the definition of one without changing the other.

Renaming _LIBCPP_INLINE_VISIBILITY to _LIBCPP_HIDE_FROM_ABI makes sense to me though.

One option we’d like to investigate (but haven’t yet) is changing _LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI to drop the “always_inline” but keep the “hidden”. Have you considered _LIBCPP_DISABLE_ALWAYS_INLINE_IN_INLINE_VISIBILITY (or equivalent)? If not, why not?

I agree.

I think it’s useful to have different names for these, since they have different intent. It would be reasonable to change the definition of one without changing the other.

Renaming _LIBCPP_INLINE_VISIBILITY to _LIBCPP_HIDE_FROM_ABI makes sense to me though.

One option we’d like to investigate (but haven’t yet) is changing _LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI to drop the “always_inline” but keep the “hidden”. Have you considered _LIBCPP_DISABLE_ALWAYS_INLINE_IN_INLINE_VISIBILITY (or equivalent)? If not, why not?

IIRC that results in missing definitions for all of the explicitly instantiated template members, because the versions in the DSO are not visible to their consumers.

It seems wrong to have any explicitly instantiated template members with _LIBCPP_INLINE_VISIBILITY. Do we really do that?

Removing always_inline from _LIBCPP_INLINE_VISIBILITY was actually the first thing I tried. I got errors like these (from ld):

hidden symbol ‘_ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEpLEPKc’ is not defined locally

Okay, I see; we’re opting specific symbols out of living in the dylib. Once you remove the always_inline, the extern template promises that these symbols will be in the libc++.dylib, but they’re not actually available there…

I don’t think we should give up on hiding these symbols from the ABI though, and adding weak external symbols is not a great outcome. I’d rather we add an attribute (straw man: “no_extern_template”) that opts these symbols out of any extern template declaration.

Then, depending on a macro setting, we pick between:

  • attribute((hidden,always_inline))
  • attribute((hidden,no_extern_template))

Also, I think we can use semantic names for the macros.

  • _LIBCPP_HIDE_FROM_ABI to apply the attributes (replacing _LIBCPP_INLINE_VISIBILITY, as you suggested).
  • _LIBCPP_DISABLE_HIDE_FROM_ABI_FROM_OBJECTS (to switch from always_inline (maybe, eventually, internal_linkage) to no_extern_template).

I wonder if the correct approach here might be to add yet another new visibility macro specifically for externally instantiated symbols
which are currently hidden but might want to be used with _LIBCPP_LARGE_CODEBASE.

  1. Introduce a new visibility macro _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY (Technically this macro already exists, but I think we should repurpose it and change existing users to something else).

  2. Mark the symbols in std::string which _LIBCPP_LARGE_CODEBASE doesn’t want inlined using _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY.

  3. When building the dylib define _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY to attribute((always_inline, visibility(“default”))).

Then, by default _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY can be defined to _LIBCPP_INLINE_VISIBILITY, which should ensure the external definitions are not used.
If a user wants to use the external definitions and prevent inlining they could define _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY to simply “attribute((visibility(“default”)))”

I think this should work without doing anything too drastic.

I think it’s useful to have different names for these, since they have different intent. It would be reasonable to change the definition of one without changing the other.

Renaming _LIBCPP_INLINE_VISIBILITY to _LIBCPP_HIDE_FROM_ABI makes sense to me though.

One option we’d like to investigate (but haven’t yet) is changing _LIBCPP_INLINE_VISIBILITY/_LIBCPP_HIDE_FROM_ABI to drop the “always_inline” but keep the “hidden”. Have you considered _LIBCPP_DISABLE_ALWAYS_INLINE_IN_INLINE_VISIBILITY (or equivalent)? If not, why not?

IIRC that results in missing definitions for all of the explicitly instantiated template members, because the versions in the DSO are not visible to their consumers.

It seems wrong to have any explicitly instantiated template members with _LIBCPP_INLINE_VISIBILITY. Do we really do that?

Removing always_inline from _LIBCPP_INLINE_VISIBILITY was actually the first thing I tried. I got errors like these (from ld):

hidden symbol ‘_ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEpLEPKc’ is not defined locally

Okay, I see; we’re opting specific symbols out of living in the dylib. Once you remove the always_inline, the extern template promises that these symbols will be in the libc++.dylib, but they’re not actually available there…

I don’t think we should give up on hiding these symbols from the ABI though, and adding weak external symbols is not a great outcome. I’d rather we add an attribute (straw man: “no_extern_template”) that opts these symbols out of any extern template declaration.

Then, depending on a macro setting, we pick between:

  • attribute((hidden,always_inline))
  • attribute((hidden,no_extern_template))

This surely affects only a tiny minority of the symbols currently marked always_inline, right? Could we start by removing it from all the functions which aren’t affected by the _LIBCPP_EXTERN_TEMPLATE or _LIBCPP_EXTERN_TEMPLATE_TYPE_VIS, first, to get the largest volume of stuff fixed, before trying to deal with these problematic edge cases?

Also – is there really an actual benefit of hiding these symbols from the export list? Once you’re exporting any significant functions from a nontrivial class like basic_string, I’m not clear as to how it’s of real benefit to hide other functions like operator+=. I’d think that typically everything generated by the explicit template instantiation should just be exported in future versions.

Of course there is the Apple-platform issue of requiring compatibility of apps compiled with new libc++ headers to be able to run against an old libc++ shared library. When adding new functions to an existing template class, those functions won’t be present in the older shared lib version. ISTM that for that case, it’d be entirely reasonable to mark them internal_linkage (or whatever) only when targeting old platforms.