libc++ is not using always_inline anymore!

Folks,

Today, I pushed a change that disables the use of always_inline for visibility purposes in libc++. The change was reviewed as https://reviews.llvm.org/D52405. This patch is the culmination of several discussions about libc++’s visibility annotations.

This change should have a positive impact on code size for almost any application using the libc++ headers. For example, Eric Fiselier reported a 7% size improvement in the Chrome binary after applying this patch. We are still in the process of gathering more data like this, but we expect similar gains to be possible in other applications. The debugging experience should also be much improved, since we no longer force every function to be inlined into its caller.

However, one word of caution: even though we were very careful not to break libc++’s ABI or the ABI of any of its users, this patch changes things that have been in place in libc++ for a long time. It is possible that we’ll uncover problems with it, although we hope that won’t be the case. If you start seeing problems you think are caused by this change, please notify me right away so we can take a look.

I'll post updates here as we're able to publish more data and observations about the results of this patch.

Thanks,
Louis


This is amazing work. Thanks!

vedant

Thanks, Louis!

John.

Thanks for all the hard work Louis!

I know a lot of our users are going to appreciate this.

/Eric

Awesome!

What are the new semantics? That this ABI stability guarantee is provided by hiding the functions in each user so they can’t be deduplicated with another user’s copy? (what about other copies that are from the same build? I guess even those won’t get coalesced/collapsed together? Would that be useful to support?)

I assume this doesn’t change the defaults, but does it make it any easier for users who don’t need the ABI stability guarantee? (or was it already easy/no change here?)

Awesome!

What are the new semantics? That this ABI stability guarantee is provided by hiding the functions in each user so they can’t be deduplicated with another user’s copy? (what about other copies that are from the same build? I guess even those won’t get coalesced/collapsed together? Would that be useful to support?)

There are currently two modes (in LLVM trunk, and that is the plan for LLVM 8 too):

  1. (the default) All TUs linked together inside the same final linked image need to use the same libc++ version. Inline functions are ODR-merged across TUs like they normally are. In this mode, we don’t use any funny attribute to control linkage (neither always_inline nor internal_linkage).

  2. (can be opted into) TUs can be linked together even if they were built against different libc++ headers. This is achieved by using internal_linkage on libc++’s implementation detail functions. Those functions are local to a TU and they are NOT ODR-merged across TUs. This results in more code duplication than option (1).

I assume this doesn’t change the defaults, but does it make it any easier for users who don’t need the ABI stability guarantee? (or was it already easy/no change here?)

It actually does change the default. However, it depends on what ABI guarantee you’re talking about.

  1. The ABI stability of the shared objects is (and has always been, and will most likely always be) guaranteed. The only way to change that is to explicitly use the _LIBCPP_ABI_UNSTABLE macro, which says “use all the ABI breaking features”. This obviously only works if you’re also linking against a library that was built to provide that ABI. This ability to use the unstable ABI has been provided for a long time, it wasn’t the default, and it still isn’t the default – my change is completely orthogonal to that.

  2. The “ABI stability” of static archives is a different matter. The question here is whether you can link programs against static archives built with different versions of libc++. The answer used to be YES by default, now it is NO by default. If you want to retain that ability, you need to use the _LIBCPP_HIDE_FROM_ABI_PER_TU macro. And please also give us a heads up so we know someone is using it.

Does that answer your questions?

Louis

Yeah, that helps - answers some questions I didn’t know I had either! :)

In general I’m worried about “undefined behavior” that isn’t caught by a tool, ideally at build time, otherwise at runtime. I would really encourage not introducing any default behavior for which you can’t provide an easy detection mechanism to the user.

Can you please expand on what you mean here? Are you referring to the potential for ODR violations if someone links TUs built against different versions of the libc++ headers? If so, that situation exists for every single C++ library out in the wild.

Louis

More specifically, what I mean here is that anyone relying on this guarantee is walking an incredibly thin line, and so I think it is reasonable for such users to explicitly opt into the guarantee.

Louis


Can you please expand on what you mean here? Are you referring to the potential for ODR violations if someone links TUs built against different versions of the libc++ headers? If so, that situation exists for every single C++ library out in the wild.

When you provide a system library / core components, you can (and should) hold it to a higher standard than a convenience library the user opts into, in my opinion.
Also, “the others aren’t better” seems to me like a fairly lame excuse.

More specifically, what I mean here is that anyone relying on this guarantee is walking an incredibly thin line

I don’t know what you mean here.

, and so I think it is reasonable for such users to explicitly opt into the guarantee.

You’re missing the point: as a user building my application / libraries using Xcode for instance, I am not aware of what guarantee the compiler provides in this kind of area. This is incredibly subtle. How do I know what I need to know to “opt into the guarantee”?
This is not much different from undefined behavior in the language (except less documented and more arcane), and we have learned over time that dealing with undefined behavior is incredibly difficult for the average user.

So back to the point: safety and usability should come first when you supply libraries and tools to users. This is along the same lines as “an API should be easy to use and hard to misuse”. If you don’t have an easy way to find/detect the problem that you can distribute as part of your environment/tools, then you have created an “API that is too easy to misuse”.
(I know this is not really an API discussion, but it’s not much different: this is a contract with a client.)

I argue that whenever we’re introducing such behavior, we’re not delivering a good user experience.

I think Chandler presented good foundations for approaching this at CppCon 2016 in his talk “Garbage In, Garbage Out: Arguing about Undefined Behavior…”: https://www.youtube.com/watch?v=yG1OZ69H_-o

Especially since we’re going from a “wider” to a “narrower” contract here, this slide is particularly relevant and represents the concern I raised: https://youtu.be/yG1OZ69H_-o?t=1279

Best,


When you provide a system library / core components, you can (and should) hold it to a higher standard than a convenience library the user opts into, in my opinion.
Also, “the others aren’t better” seems to me like a fairly lame excuse.

That’s not what I mean. What I mean is that anyone requiring this guarantee has to be very serious about it and either

  • audit all third-party libraries they use in their code for such ODR-violation inducing changes whenever they update said libraries, or
  • not have any dependencies (beyond the standard library)

In both cases, anyone requiring this guarantee has to actively know about that requirement and take steps to make sure nobody breaks it (even unintentionally). My claim is that it is reasonable (and actually desirable) for such users to be explicit about the fact that they rely on that guarantee.

More specifically, what I mean here is that anyone relying on this guarantee is walking an incredibly thin line

I don’t know what you mean here.

Thin line = they can be broken by anyone who is not careful enough.

, and so I think it is reasonable for such users to explicitly opt into the guarantee.

You’re missing the point: as a user building my application / libraries using Xcode for instance, I am not aware of what guarantee the compiler provides in this kind of area. This is incredibly subtle. How do I know what I need to know to “opt into the guarantee”?

You read the documentation of the library you’re using. The C++ programming language does not give you any such guarantees by default. If you have non-trivial ABI requirements, you need to be responsible about it. This is true with or without my change — I’m not changing anything here. People can’t (and usually don’t) expect the following use cases to work:

  • linking TUs built with different compilers
  • linking TUs built with different compiler flags
  • linking TUs built against different sets of headers
  • etc.

This is not much different from undefined behavior in the language (except less documented and more arcane), and we have learned over time that dealing with undefined behavior is incredibly difficult for the average user.

So back to the point: safety and usability should come first when you supply libraries and tools to users. This is along the same lines as “an API should be easy to use and hard to misuse”. If you don’t have an easy way to find/detect the problem that you can distribute as part of your environment/tools, then you have created an “API that is too easy to misuse”.
(I know this is not really an API discussion, but it’s not much different: this is a contract with a client.)

I think it may be possible to catch invalid uses of libc++. We could for example inject a version number in each TU that uses libc++ without the per-TU guarantee, and then maybe the linker could check that the version number is the same for all TUs. I haven’t thought a lot about that, but I think it may be toolable.

I argue that whenever we’re introducing such behavior, we’re not delivering a good user experience.

The choice is not that easy. What we’re contemplating here is on the one hand an obscure guarantee that very few people likely need, and on the other hand the potential for very significant code size savings. To me, not bloating my users’ applications is also part of the user experience. I think the minority of people that have this flaky use case should be the ones to “pay” by having to turn the guarantee on explicitly.

Louis


That’s not what I mean. What I mean is that anyone requiring this guarantee has to be very serious about it and either

  • audit all third-party libraries they use in their code for such ODR-violation inducing changes whenever they update said libraries, or
  • not have any dependencies (beyond the standard library)

Great, aren’t we talking about the standard library here? Considering your last parenthetical, I don’t understand why your mention of situations involving random user libraries is anything other than a digression.
I’m not talking about a developer using a library found on GitHub, but about someone relying only on the standard implementation of C++ (i.e. a clang toolchain).
This is exclusively about the guarantee we choose to provide with the toolchain.

In both cases, anyone requiring this guarantee has to actively know about that requirement and take steps to make sure nobody breaks it (even unintentionally). My claim is that it is reasonable (and actually desirable) for such users to be explicit about the fact that they rely on that guarantee.

I really can’t understand your angle of approach here. You seem to come from a world where users require/expect something from their libraries and are very conscious about it.
I’m coming from a compiler supplier / tool vendor point of view, thinking about the guarantees and usability we should provide to our users.
Adding gotchas that aren’t well documented and/or appear in fine print seems to me to be a subpar user experience, not something I’d be happy to see from any tool.

More specifically, what I mean here is that anyone relying on this guarantee is walking an incredibly thin line

I don’t know what you mean here.

Thin line = they can be broken by anyone who is not careful enough.

I know what a thin line is, I don’t see how there is a thin line here.

, and so I think it is reasonable for such users to explicitly opt into the guarantee.

You’re missing the point: as a user building my application / libraries using Xcode for instance, I am not aware of what guarantee the compiler provides in this kind of area. This is incredibly subtle. How do I know what I need to know to “opt into the guarantee”?

You read the documentation of the library you’re using.

We’re talking about the standard library here…

The C++ programming language does not give you any such guarantees by default.

C++ does not even know what a shared library is, AFAIK, and I doubt (though I’ll be happy to be proven wrong) that it mentions anything about toolchains.
But I don’t see your point here: “we are allowed to inflict pain on the user and still be standard compliant”, OK, but that does not mean we should.

If you have non-trivial ABI requirements, you need to be responsible about it. This is true with or without my change — I’m not changing anything here.

Well, maybe I am misunderstanding the situation then, but your previous quote led me to believe you’re changing the situation:

The question here is whether you can link programs against static archives built with different versions of libc++. The answer used to be YES by default, now it is NO by default.

People can’t (and usually don’t) expect the following use cases to work:

  • linking TUs built with different compilers

Uh? I rely on compatibility between clang and gcc, and I don’t believe I’m alone; otherwise, using clang on a system not built entirely with clang would be impossible.

  • linking TUs built with different compiler flags

I’m fairly sure a majority of people rely on this.

  • linking TUs built against different sets of headers

Not sure what this means: if TU foo.cpp includes foo.hpp and bar.cpp includes bar.hpp, then you have two TUs built against different sets of headers, and I don’t see why I couldn’t link them together.

  • etc.


I think it may be possible to catch invalid uses of libc++. We could for example inject a version number in each TU that uses libc++ without the per-TU guarantee, and then maybe the linker could check that the version number is the same for all TUs. I haven’t thought a lot about that, but I think it may be toolable.

Now we talk :)
This is really the only part of your answers so far that I feel really targets my concern: we should not introduce gotchas without thinking about usability/safety and how to catch issues.

I argue that whenever we’re introducing such behavior, we’re not delivering a good user experience.

The choice is not that easy.

You seem to present it as if the only choice is between “doing this change and introducing a gotcha” and “not doing this change”.
It may be a situation we end up in, and then we should honestly consider the tradeoffs, but I believe we should get there only after spending enough time thinking about how we can “do this change while making sure we preserve the user experience” (cf. Chandler’s talk I linked, for the concept).

What we’re contemplating here is on the one hand an obscure guarantee that very few people likely need, and on the other hand the potential for very significant code size savings. To me, not bloating my users’ applications is also part of the user experience. I think the minority of people that have this flaky use case should be the ones to “pay” by having to turn the guarantee on explicitly.

(Side note: to be honest, most of the adjectives/terms you’re using, like “obscure”, “few people”, “very significant”, “bloat”, “flaky”, are, by their lack of neutrality/objectivity, likely going to have the opposite effect on me from what you seem to expect. This gives me the feeling that you need to be “on the extreme” to reinforce the sense that your change is “the right thing to do”, and that you can just discard whatever does not fit without much consideration.)

However “good” this change is, I don’t really see how it exempts us from what I’m asking: we should think hard about tools and frameworks to catch mistakes before introducing narrow contracts.

Best,

FWIW, MSVC implements this as a hard break:

“For the STL, mixing-and-matching has been explicitly disallowed via use of #pragma detect_mismatch in the STL headers since Visual Studio 2010. If you try to link incompatible object files, or if you try to link with the wrong STL library, the link will fail.” - https://msdn.microsoft.com/en-us/library/mt743084.aspx

This means, I think, that it doesn’t allow two libraries to be linked against two different versions of the standard library even when they don’t use the standard library in their interface - which could in theory be made to work (and it sounds like libc++ supports this).

So the tradeoff is whether you enable/allow that situation, or break it so that you can also reject the really unsafe case where those two different standard library versions are used on the interface between the libraries.

I should apologize, there was a lot of discussion that I didn’t read all of, but I should point out that there are many people working hard on ODR violation detection tools. See Richard Trieu’s AST hashing stuff. In the long run, this kind of version header skew should be detectable with a tool, and users will still have this macro to get back the old guarantee so they can link anyway with that skew and have things mostly work.

In the meantime, yes, people with this problem (so far as I’m aware, nobody has indicated that they actually have this use case) may end up debugging some nasty crashes caused by header version skew. But, the cost of the always_inline “solution” for everyone else was too great, and in the last RFC we agreed that it was best to stop providing this guarantee by default. It’s not ideal for now, but we can expect to get better tooling here in the future.

Hey Reid,