Explicit template instantiations in libc++

Most of libc++ doesn't have explicit template instantiations, which
leads to a pretty significant build time and code size cost when using
libc++, since a large number of common templates will be emitted by the
compiler and coalesced by the linker. Notably, in include/__config, we
have:

    #ifndef _LIBCPP_EXTERN_TEMPLATE
    #define _LIBCPP_EXTERN_TEMPLATE(...)
    #endif

whereas before r189601 this was:

    #define _LIBCPP_EXTERN_TEMPLATE(...) extern template __VA_ARGS__;

This was apparently done to fix http://llvm.org/bugs/show_bug.cgi?id=17027,
but disabling explicit instantiations seems like a pretty big hammer
considering the drawbacks.

I'd like to restore these instantiations. Any thoughts on how to handle
things like pr17027 in a less heavy handed way?
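
For readers unfamiliar with the mechanism, here is a minimal sketch of what the pre-r189610 macro expands to. `DEMO_EXTERN_TEMPLATE`, `Box`, `doubled_int` and `doubled_text` are hypothetical names made up for illustration, not libc++ identifiers, and the "header" and "library" halves are squeezed into one file so the example compiles on its own.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for _LIBCPP_EXTERN_TEMPLATE as quoted above;
// DEMO_EXTERN_TEMPLATE is not a real libc++ macro.
#define DEMO_EXTERN_TEMPLATE(...) extern template __VA_ARGS__;

// Hypothetical class template playing the role of a library template.
template <class T>
struct Box {
    T value;
    T doubled() const;
};

// Defined out of class so it is not implicitly inline; extern template
// does not suppress instantiation of inline functions.
template <class T>
T Box<T>::doubled() const { return value + value; }

// "Header" side: explicit instantiation declaration. Translation units
// that see this do not implicitly instantiate Box<int>'s members and
// expect the definitions to be provided elsewhere.
DEMO_EXTERN_TEMPLATE(struct Box<int>)

// "Library" side: explicit instantiation definition. In a real library
// this would live in exactly one source file built into the dylib.
template struct Box<int>;

// Uses the explicitly instantiated Box<int>.
int doubled_int(int x) { return Box<int>{x}.doubled(); }

// Box<std::string> has no extern declaration, so it is implicitly
// instantiated in every translation unit that uses it.
std::string doubled_text(const std::string& s) {
    return Box<std::string>{s}.doubled();
}

void demo_check() {
    assert(doubled_int(21) == 42);
    assert(doubled_text("ab") == "abab");
}
```

With the empty definition of the macro, the `Box<int>` case would behave like the `Box<std::string>` case: every user emits its own copy and the linker coalesces them.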

> whereas before r189601 this was:

I think you mean r189610

> I'd like to restore these instantiations. Any thoughts on how to handle
> things like pr17027 in a less heavy handed way?

As far as I can tell, the issue in pr17027 was that we had a bug in libc++ and the libc++ dylib on some versions of OS X shipped with that bug. Disabling extern templates “fixed” that by causing the newer code from the headers to be used instead of the older code in the dylib on the system. The real solution here is to fix the bug in the dylib and ship a newer version. There may still be a lag in the time it takes us to release fixes in the dylib, but that is just the way it goes.
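
A rough sketch of why an empty macro makes the header code win, assuming the `#ifndef` guard shown in the quoted `__config` snippet. All names here (`DEMO_EXTERN_TEMPLATE`, `Counter`, `next_count`) are hypothetical, and the "header" and "user" parts share one file only to keep the example self-contained.

```cpp
#include <cassert>

// A translation unit that defines the macro as empty before the "header"
// is seen. DEMO_EXTERN_TEMPLATE is a hypothetical stand-in, not libc++'s.
#define DEMO_EXTERN_TEMPLATE(...)

// ---- "header" part ----
// Same guard shape as the quoted __config: only supply the default
// definition if nobody defined the macro already.
#ifndef DEMO_EXTERN_TEMPLATE
#define DEMO_EXTERN_TEMPLATE(...) extern template __VA_ARGS__;
#endif

template <class T>
struct Counter {
    T count;
    T next() const;
};

template <class T>
T Counter<T>::next() const { return count + 1; }

// Expands to nothing here, so no explicit instantiation declaration is
// seen and the compiler instantiates Counter<long> from the code above.
DEMO_EXTERN_TEMPLATE(struct Counter<long>)

// ---- "user" part ----
// Implicitly instantiates Counter<long>::next from the header definition
// instead of relying on a copy compiled into a shared library.
long next_count(long n) { return Counter<long>{n}.next(); }

void demo_check() {
    assert(next_count(41) == 42);
}
```

The trade-off is the one discussed in this thread: the user gets whatever the headers say, at the price of instantiating the template in every translation unit.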

Given the dates, this seems unrelated, but there have been compile-time issues with libc++ for some time now: http://llvm.org/bugs/show_bug.cgi?id=14587

– Sean Silva

Sean Silva <chisophugis@gmail.com> writes:

> Given the dates, this seems unrelated, but there have been
> compile-time issues with libc++ for some time now:
> http://llvm.org/bugs/show_bug.cgi?id=14587

Restoring these explicit instantiations will certainly help, but it
sounds like that PR is a bit more involved.

I ran some LNT numbers for posterity, and there are *many* compile time
improvements of 5-10% in our test suite, and no compile regressions.

Performance Improvements - Compile Time                    Δ  Previous  Current       σ
SingleSource/Benchmarks/Shootout-C++/strcat          -11.50%    0.6436   0.5696  0.0005
SingleSource/Benchmarks/Misc-C++/bigfib              -10.70%    1.1776   1.0516  0.0021
SingleSource/UnitTests/Vectorizer/gcc-loops           -8.79%    0.9554   0.8714  0.0032
SingleSource/Benchmarks/Shootout-C++/matrix           -8.08%    0.6670   0.6131  0.0010
SingleSource/Benchmarks/Shootout-C++/EH/except        -7.94%    0.6474   0.5960  0.0027
SingleSource/Benchmarks/Shootout-C++/ackermann        -7.26%    0.6377   0.5914  0.0012
MultiSource/Benchmarks/Prolangs-C++/employ/employ     -7.07%    0.7394   0.6871  0.0026
SingleSource/Benchmarks/Shootout-C++/methcall         -6.97%    0.6444   0.5995  0.0040
SingleSource/Benchmarks/Shootout-C++/lists1           -6.94%    0.7980   0.7426  0.0032
SingleSource/Benchmarks/Shootout-C++/ary3             -6.93%    0.7299   0.6793  0.0018
SingleSource/Benchmarks/Shootout-C++/objinst          -6.91%    0.6517   0.6067  0.0021
SingleSource/Benchmarks/Misc-C++/Large/sphereflake    -6.56%    0.7693   0.7188  0.0045
MultiSource/Benchmarks/Prolangs-C++/city/city         -6.18%    4.5336   4.2535  0.0053
SingleSource/Benchmarks/Shootout-C++/sieve            -6.06%    0.8268   0.7767  0.0014
SingleSource/Benchmarks/Adobe-C++/functionobjects     -5.90%    0.4964   0.4671  0.0005
SingleSource/Benchmarks/Shootout-C++/ary2             -5.86%    0.6929   0.6523  0.0013
SingleSource/Benchmarks/Shootout-C++/nestedloop       -5.81%    0.5985   0.5637  0.0009
SingleSource/Benchmarks/Shootout-C++/fibo             -5.75%    0.5913   0.5573  0.0017
SingleSource/Benchmarks/Shootout-C++/random           -5.63%    0.6007   0.5669  0.0005
SingleSource/Benchmarks/Shootout-C++/hash2            -5.52%    0.9050   0.8550  0.0013
SingleSource/Benchmarks/Misc-C++/Large/ray            -5.40%    0.8162   0.7721  0.0007
SingleSource/Benchmarks/CoyoteBench/fftbench          -5.28%    0.8547   0.8096  0.0008
External/SPEC/CFP2006/450_soplex/450_soplex           -4.91%   48.1568  45.7941  0.0032
SingleSource/Benchmarks/Shootout-C++/lists            -4.64%    0.7248   0.6912  0.0024
SingleSource/Benchmarks/Shootout-C++/ary              -4.63%    0.6829   0.6513  0.0004
MultiSource/Applications/hexxagon/hexxagon            -3.80%    2.1581   2.0760  0.0049
SingleSource/Benchmarks/Shootout-C++/hash             -3.71%    0.8454   0.8140  0.0012
MultiSource/Applications/kimwitu++/kc                 -3.03%   18.1397  17.5900  0.0186
SingleSource/Benchmarks/Misc-C++/stepanov_container   -2.49%    1.4438   1.4079  0.0010
MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4          -1.67%   24.2177  23.8125  0.0248
External/SPEC/CINT2006/471_omnetpp/471_omnetpp        -1.33%   54.8197  54.0908  0.0226

Bob Wilson <bob.wilson@apple.com> writes:

>> whereas before r189601 this was:
>
> I think you mean r189610

Yep, typo :)

> As far as I can tell, the issue in pr17027 was that we had a bug in
> libc++ and the libc++ dylib on some versions of OS X shipped with that
> bug. Disabling extern templates “fixed” that by causing the newer code
> from the headers to be used instead of the older code in the dylib on
> the system. The real solution here is to fix the bug in the dylib and
> ship a newer version. There may still be a lag in the time it takes us
> to release fixes in the dylib, but that is just the way it goes.

Alright, given this and the obvious performance difference, I think I'll
go ahead and revert r189610. I'll keep an eye out for any pr17027-like
fallout and fix them up if they come up.

Justin Bogner <mail@justinbogner.com> writes:

> Alright, given this and the obvious performance difference, I think I'll
> go ahead and revert r189610. I'll keep an eye out for any pr17027-like
> fallout and fix them up if they come up.

I've done this in r215740.