[libc++] A proposal for getting rid of __always_inline__ in _LIBCPP_INLINE_VISIBILITY

Hi,

There's been a lot of discussion about the use of __always_inline__ in libc++, both recently but also going back a long time. I would really like to solve this problem once and for all, so I’ve put together what I hope is a good summary of the situation, along with a proposal which crystallizes prior discussions on this list.

If I understood correctly, the use cases we are trying to support are:

1. An executable can link against DSOs that were built with different libc++ headers than those the executable is built with. This requires something to appear at the ABI boundary of libc++ only if it is ABI stable across versions (symbols, calling conventions, structure sizes, semantics, etc).

2. Two object files that were built using different libc++ headers can be linked into the same final linked image (executable/DSO). This requires anything on the ABI boundary of libc++ to be ABI stable (like (1)), but in addition requires the ABI unstable parts to be private to each object file (to avoid the linker de-duplicating incompatible functions that happen to have the same symbol).

The current solution is to:
- mark anything on the ABI boundary as being externally visible from the dylib (using one of the visibility macros like `_LIBCPP_TYPE_VIS` and `_LIBCPP_FUNC_VIS`), and
- mark most internal functions with `_LIBCPP_INLINE_VISIBILITY`, which forces inlining the function and gives the symbol hidden visibility for dylib visibility purposes, and
- mark some methods of extern templates with `_LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY`, which forces inlining the method, but still includes a copy of the method in the dylib and makes that symbol visible for dylib visibility purposes. This is required for methods that used to be exported in the dylib, but are now defined in the headers — those are still exported from the dylib for backwards compatibility purposes, but they are marked as __always_inline__ so they’re inlined into the caller if the caller is using recent libc++ headers. When the method can’t be inlined, the one from the dylib is used instead.

The problem here is that we're relying on inlining, which leads to debuggability and (potentially) code size problems. I believe we should continue to provide solutions for both (1) and (2), but make (2) opt-out for users that don't care about it.

Concretely, here's a proposal that (I think) achieves all of these goals:

1. Everything that is ABI stable is marked with a visibility macro like today (no change here).
2. Everything that is marked with `_LIBCPP_INLINE_VISIBILITY` today is marked with a new `_LIBCPP_HIDE_FROM_ABI` macro instead. This macro can be defined to `__attribute__((__visibility__("hidden")))` or to `__attribute__((__visibility__("hidden"),internal_linkage))` depending on whether we want to support only use case (1), or both use cases (1) and (2), respectively. This would look something like:

    #ifdef _LIBCPP_NO_INTERNAL_LINKAGE_FOR_ABI_UNSTABLE_SYMBOLS
    # define _LIBCPP_HIDE_FROM_ABI __attribute__((__visibility__("hidden")))
    #else
    # define _LIBCPP_HIDE_FROM_ABI __attribute__((__visibility__("hidden"),internal_linkage))
    #endif

In the future, we can decide which default behavior we want, but for now, I suggest we stick to what we have right now, which is support for both (1) and (2). It would be fine to change this in the future if we make that decision.

3. For the time being, we do NOT touch `_LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY`. This macro only decorates a few (40) small functions, so I don't think that leaving the `__always_inline__` there is a big deal. I’ve considered many potential alternatives to this based on what has been discussed on this list previously, but I think this should be tackled as part of a follow-up proposal because this problem has more subtleties to it. I will followup with a proposal in the future.

I’ve implemented the proposal outlined here and successfully ran all of libc++’s tests on my machine. I've also built libc++.dylib before and after applying my patch, and I verified that the symbols exported by the dylib are the same. For reference, I `diff`ed the output of the following command:

    nm -g libc++.dylib | c++filt | cut -c 18-

Please let me know what you think about this and whether I’ve missed anything important. If we agree on doing this, I’ll put a patch up for review soon.

Cheers,
Louis

P.S.: For reference, here's all the prior discussions I could find on this topic (so you don't have to search like I did):

May/June 2018 (proposal to allow disabling _LIBCPP_INLINE_VISIBILITY):
    http://lists.llvm.org/pipermail/cfe-dev/2018-June/058157.html
    http://lists.llvm.org/pipermail/cfe-dev/2018-May/058143.html

April 2017 (follow-up on Duncan's plan):
    http://lists.llvm.org/pipermail/cfe-dev/2017-April/053388.html

July 2016 (pretty concrete plan by Duncan):
    http://lists.llvm.org/pipermail/cfe-dev/2016-July/049985.html

July 2016 (discussion about internal_linkage, taking addresses and deduplicating functions with internal linkage):
    http://lists.llvm.org/pipermail/llvm-dev/2016-July/102223.html

October 2015 (initial proposal to add the `internal_linkage` attribute to Clang):
    http://lists.llvm.org/pipermail/cfe-dev/2015-October/045580.html

I like the idea of _LIBCPP_HIDE_FROM_ABI, and I think the first step you proposed sounds reasonable.

I like the idea of _LIBCPP_HIDE_FROM_ABI, and I think the first step you proposed sounds reasonable.

I mean exported from the libc++ dylib.

For other readers:
Regarding linkage failures, Eric and I talked privately and he explained that what he was worried about was a situation such as this one, where a symbol with internal linkage is included in an extern template declaration:


// RUN: %cxx -c %s -o %t.first.o %flags %compile_flags
// RUN: %cxx -c %s -o %t.second.o -DWITH_MAIN %flags %compile_flags
// RUN: %cxx -o %t.exe %t.first.o %t.second.o %flags %link_flags
// RUN: %run

template <class T>
class Foo {
public:
__attribute__((internal_linkage)) void bar();
};

template <class T>
void Foo<T>::bar() {}

extern template class Foo<int>;

#ifdef WITH_MAIN
int main() {
Foo<int> f;
f.bar();
}
#else
template class Foo<int>;
#endif

This indeed fails to link because the extern template declaration promises that an instantiation will be available in t.first.o, but that instantiation ends up having internal linkage and hence not being visible to t.second.o. However, if bar is marked inline instead, the example compiles because the C++ Standard requires the instantiation to still happen in t.second.o even though an extern instantiation is promised, per http://eel.is/c++draft/temp.explicit#12. I don’t think it is ever the case that we use _LIBCPP_INLINE_VISIBILITY on a function that is not marked inline in libc++, which is why my proposed solution works in practice. Indeed, if I remove the inline from a function marked as _LIBCPP_INLINE_VISIBILITY in a class that is externally instantiated, I do get a linker error when running libc++’s tests.

So the gist of it is that Eric and I think Eric’s concern is not an issue for libc++, but by not much at all.

The idea here is that some people don’t care about interoperability of static libraries (or, equivalently, object files) built with different libc++ headers. And since they don’t care about this, they’d rather not pay the cost of inlining or internal linkage, which is code bloat.

Louis

I like the idea of `_LIBCPP_HIDE_FROM_ABI`, and I think the first step you
proposed sounds reasonable.

1. Everything that is ABI stable is marked with a visibility macro like
today (no change here).
2. Everything that is marked with `_LIBCPP_INLINE_VISIBILITY` today is
marked with a new `_LIBCPP_HIDE_FROM_ABI` macro instead. This macro can be
defined to `__attribute__((__visibility__("hidden")))` or to
`__attribute__((__visibility__("hidden"),internal_linkage))` depending
on whether we want to support only use case (1), or both use cases (1) and
(2), respectively.

By ABI stable do you mean exported from the libc++ dylib, or all
non-reserved names provided by the library? (2) suggests the former, since
entities like std::sort are marked with _LIBCPP_INLINE_VISIBILITY.
Any entity which a user can reasonably expect to externally instantiate
cannot be part of the “hidden ABI”, since marking it with internal linkage
will cause linkage failures.

I think we'll be limited to applying `_LIBCPP_HIDE_FROM_ABI` to names
which are (A) reserved and (B) cannot be externally instantiated as a
member of a class.

I mean exported from the libc++ dylib.

For other readers:
Regarding linkage failures, Eric and I talked privately and he explained
that what he was worried about was a situation such as this one, where a
symbol with internal linkage is included in an extern template declaration:

// RUN: %cxx -c %s -o %t.first.o %flags %compile_flags
// RUN: %cxx -c %s -o %t.second.o -DWITH_MAIN %flags %compile_flags
// RUN: %cxx -o %t.exe %t.first.o %t.second.o %flags %link_flags
// RUN: %run

template <class T>
class Foo {
public:
   __attribute__((internal_linkage)) void bar();
};

template <class T>
void Foo<T>::bar() {}

extern template class Foo<int>;

#ifdef WITH_MAIN
int main() {
  Foo<int> f;
  f.bar();
}
#else
template class Foo<int>;
#endif

This indeed fails to link because the extern template declaration promises
that an instantiation will be available in `t.first.o`, but that
instantiation ends up having internal linkage and hence not being visible
to `t.second.o`. However, if `bar` is marked inline instead, the example
compiles because the C++ Standard requires the instantiation to still
happen in `t.second.o` even though an extern instantiation is promised, per
[temp.explicit]. I don’t think it is ever the
case that we use `_LIBCPP_INLINE_VISIBILITY` on a function that is not
marked inline in libc++, which is why my proposed solution works in
practice. Indeed, if I remove the inline from a function marked as
`_LIBCPP_INLINE_VISIBILITY` in a class that is externally instantiated, I
do get a linker error when running libc++’s tests.

So the gist of it is that Eric and I think Eric’s concern is not an issue
for libc++, but by not much at all.

This would look something like:

    #ifdef _LIBCPP_NO_INTERNAL_LINKAGE_FOR_ABI_UNSTABLE_SYMBOLS
    # define _LIBCPP_HIDE_FROM_ABI __attribute__((__visibility__(
"hidden")))
    #else
    # define _LIBCPP_HIDE_FROM_ABI __attribute__((__visibility__(
"hidden"),internal_linkage))
    #endif

In the future, we can decide which default behavior we want, but for now,
I suggest we stick to what we have right now, which is support for both (1)
and (2). It would be fine to change this in the future if we make that
decision.

When we don’t have internal linkage, I suspect we'll want to keep
`__always_inline__` as to prevent static libraries from providing each
other with incompatible function definitions (I think this could occur?).

The idea here is that some people don’t care about interoperability of
static libraries (or, equivalently, object files) built with different
libc++ headers. And since they don’t care about this, they’d rather not pay
the cost of inlining or internal linkage, which is code bloat.

The `__always_inline__` is always needed, at least when mixing ABI stable
and ABI hidden symbols in a explicitly instantiated template. Here's a
reproducer that should fail to link on GCC and Clang:

I don’t believe this example is conforming. Per http://eel.is/c++draft/temp.explicit#12, __init should still be instantiated in main.o, because __init is marked inline. I must be wrong because otherwise both GCC and Clang are compiling this wrong.

Louis

I’ve kicked off a thread on llvm-dev here to try to clarify the situation: http://lists.llvm.org/pipermail/llvm-dev/2018-July/124549.html

I want to avoid at all costs getting stuck like all the previous times this topic was brought up, so let me scale down the proposal to something that will still relieve much of the pain, while letting space for improvement in the future. The new proposal would be:

  1. Everything that is ABI stable is marked with a visibility macro like today (no change here).
  2. Everything that is marked with _LIBCPP_INLINE_VISIBILITY today is marked with a new _LIBCPP_HIDE_FROM_ABI macro instead. This macro is defined to __attribute__((__visibility__("hidden"), internal_linkage)), or to __attribute__((__visibility__("hidden"), __always_inline__)) if the internal_linkage attribute is not supported (e.g. GCC). There is no way to turn off both internal_linkage and __always_inline__, not until the issue brought up above is resolved. This would look something like this:

#if __has_attribute(internal_linkage)

define _LIBCPP_INTERNAL_LINKAGE attribute ((internal_linkage))

#else

define _LIBCPP_INTERNAL_LINKAGE attribute ((always_inline))

#endif

#ifndef _LIBCPP_HIDE_FROM_ABI

define _LIBCPP_HIDE_FROM_ABI attribute((visibility(“hidden”))) _LIBCPP_INTERNAL_LINKAGE

#endif

In the future, as a pure improvement, we can add an alternative definition of _LIBCPP_HIDE_FROM_ABI which uses neither internal_linkage nor __always_inline__, for those that don’t care about static archive interoperability but care about code size.

  1. We still don’t touch _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY for now.

Concretely, this will solve the problem of abusive inlining of the vast majority of functions whenever libc++ is compiled on Clang, which is already a big improvement. And then I can go back to the design board to solve the remaining issues, having at least made a some progress.

What do you think?

Louis

I’ve put up a patch implementing this scaled down proposal for review: https://reviews.llvm.org/D49240. Once/If we go forward with this, I’m committed to explore ways of solving the remaining problems, which are (please LMK if you see something else):

  1. _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY still uses __always_inline__ and we’d like to get rid of this.
  2. We’d like the ability to only specify hidden visibility, but let ODR-equivalent functions be deduplicated across TUs. This makes it unsafe to mix TUs built with different versions of libc++, but has the potential of saving on code size.

Louis

IIRC, parts of std::vector (like __push_back_slow_path) are also ABI stable.

Thank you for doing this, Louis! Our large stack frame problem has gone away.

Nice! Just to confirm, if you are using the candidate Clang-7, you are building with -D_LIBCPP_HIDE_FROM_ABI_PER_TU=1, right?

Louis

Yes, we are. Sorry about the delayed response.