Visibility/inlining issue in clang/libc++

We’re hitting an issue on Fuchsia which seems to be the result of two problems in clang. The first one should be easy to reproduce: simply build libcxx with the latest clang on/for any non-mac system. For some reason I had to use LIBCXX_ABI_VERSION=2 to make the symbol table show this symbol at all, but the issue exists regardless. If you now run readelf -Ws | grep _ZNSt3__213basic_ostreamIcNS_11char_traitsIcEEElsEPFRS3_S4_E you can see that std::ostream::operator<<(std::ostream& (*)(std::ostream&)) is local/hidden. This appears to result from -fvisibility-inlines-hidden being used, which causes that operator to be marked as hidden. However, there is an extern explicit template instantiation at the bottom of (which is actually used on non-mac systems), so the compiler is allowed to emit a call to the operator instead of inlining it if it so desires. Since libc++ does not provide a global definition of this symbol, the link fails whenever the compiler chooses not to inline.
The second issue is why the compiler ever chooses not to inline in the first place. This is harder to reproduce and I don’t have as much information on it. When you build Fuchsia in debug mode (-Og as the optimization level), it inlines the call to this operator in all translation units, as we would expect from any sane compiler at any optimization level above -O0. However, on non-debug builds (-O3 as the optimization level) it does not inline this call. This seems absurd, because one would think it’s just adding an extra call, passing more arguments, and increasing code size for no good reason. I’m not an optimization person though, so I can only speculate. It would appear that -Og is somehow making a better inlining choice than -O3. This just looks like a regression in how well clang makes inlining decisions.
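To make the interaction concrete, here is a minimal single-file sketch of the pattern as I understand it. It is not the libc++ source: Stream, newline, render and SKETCH_HIDDEN are hypothetical names, and the explicit visibility attribute is only a stand-in for what -fvisibility-inlines-hidden does to inline members.

```cpp
#include <string>

// Hypothetical stand-in for the effect of -fvisibility-inlines-hidden:
// the inline member below gets hidden visibility.
#define SKETCH_HIDDEN __attribute__((__visibility__("hidden")))

template <class CharT>
class Stream {
public:
    // Plays the role of operator<<(basic_ostream& (*)(basic_ostream&)).
    SKETCH_HIDDEN
    Stream& operator<<(Stream& (*pf)(Stream&)) { return pf(*this); }

    Stream& put(CharT c) { buf_.push_back(c); return *this; }
    const std::basic_string<CharT>& str() const { return buf_; }

private:
    std::basic_string<CharT> buf_;
};

// Explicit instantiation declaration, as a library header would have it.
// It tells the compiler that some other translation unit provides the
// members of Stream<char>, so it may emit a call instead of inlining.
extern template class Stream<char>;

// A manipulator in the style of std::endl (hypothetical).
inline Stream<char>& newline(Stream<char>& s) { return s.put('\n'); }

// User code: whether the operator<< call below is inlined is up to the
// optimizer. If it is not, the object file references the symbol.
std::string render() {
    Stream<char> s;
    s.put('a') << newline;
    return s.str();
}

// Explicit instantiation definition. In a real library this would live in
// a separate source file built into the shared object; it is placed here
// only so that the sketch links as a single file.
template class Stream<char>;
```

In this single file everything links, because the instantiation definition sits next to the caller. The failure described above would need two link units: the definition is emitted with hidden visibility inside the library, so a caller in another shared object or executable that references the symbol rather than inlining it has nothing to resolve against.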
Does anyone more familiar with the compiler know where the code that handles this is? It’s currently blocking us from rolling a new toolchain on Fuchsia.


We’ve bisected this issue and pinpointed D50652 as the change that uncovered it. The underlying issue is different though: Clang seems to be ignoring the always_inline attribute in some cases at -O3. Specifically, in our case it’s basic_ostream& operator<<(basic_ostream& (*__pf)(basic_ostream&)), which doesn’t get inlined even though it’s marked with always_inline, resulting in an undefined symbol. I’ve filed a bug, PR39053, to track this and I’ll try to get a better reproducer.

Why is nobody else hitting this? I’m not sure how many mainstream configurations use the latest Clang from trunk and libc++ with -O3, but I don’t think it’s that many, and this issue only started manifesting recently. I’ve landed D52402 as a temporary workaround that reverts to the pre-D50652 behavior for our toolchain.

FWIW, D50652 was basically a revert of It reverted libc++ to the original behavior it had for several years. It is possible that this behavior is wrong somehow, but it would be a bit surprising (not impossible though!). Are you disabling/enabling any other visibility-related macros?

The long term fix for this is to adopt in libc++ and drop always_inline, which I plan to do for LLVM 8 (and in trunk as soon as is approved).