As mentioned in past discussions, the introduction of the AVX512 intrinsics had a severe impact on compile time.
The extreme case happens on Windows, where the mere inclusion of certain system headers increases the compile time of applications that don’t even make use of any intrinsics.
During these discussions, Chandler Carruth indicated that using Clang modules is probably the best way to mitigate this problem.
Based on this feedback, Intel has started working on this proposal. Initial internal measurements indicate that this direction is promising.
For example, the compile time of a single translation unit that includes all of the Intel intrinsic header files drops by 90% starting from the 2nd compilation (by using cached pre-compiled modules). In general, we saw that projects in which more than one translation unit includes x86intrin.h/immintrin.h get big compile-time reductions starting from the 1st compilation, because the module is built once and then reused by the remaining translation units.
The biggest advantage of this approach is that it scales: it also helps applications that do rely on intrinsics and hence cannot avoid the compile-time overhead.
I (Intel) would like to add a Clang option that enables the modules feature exclusively for the Intel intrinsics header files, and to turn this option on by default. This allows Clang to cache a pre-compiled (actually pre-parsed) module of all the headers included by x86intrin.h/immintrin.h, so that any later compilation of an #include <x86intrin.h> or #include <immintrin.h> directive skips the parsing phase and saves precious compile time. Since ‘clang modules’ and the regular ‘pre-processor include mechanism’ are not 100% compatible, I started by fixing some small bugs in the Intel intrinsics module that this work exposed:
https://reviews.llvm.org/D23871, https://reviews.llvm.org/D24825, https://reviews.llvm.org/D24752 (the last one is still in review).
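To make the caching idea a bit more concrete, here is a minimal sketch of a translation unit that would benefit from it. The file name tu.c, the function add4, and the cache directory name are made up for illustration. The new option proposed in this mail does not have a name yet, so the commands in the comment use the existing -fmodules and -fmodules-cache-path flags instead, which enable modules for all headers that have a module map, not only for the intrinsics headers. Treat this as a rough way to observe the effect, not as the final workflow.

```c
/* tu.c: hypothetical translation unit that pulls in the Intel intrinsics.
 *
 * Assumed way to observe the caching with flags that exist today:
 *   clang -fmodules -fmodules-cache-path=./mcache -c tu.c   (1st run: builds and caches the module)
 *   clang -fmodules -fmodules-cache-path=./mcache -c tu.c   (2nd run: reuses the cached module)
 * Without -fmodules, immintrin.h is re-parsed textually on every compilation.
 */
#if defined(__SSE__)
#include <immintrin.h> /* the expensive include this proposal targets */
#endif

/* Adds four floats lane by lane: SSE when available, scalar loop otherwise. */
void add4(const float *a, const float *b, float *out) {
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a);            /* unaligned load of 4 floats */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb)); /* out[i] = a[i] + b[i] */
#else
    for (int i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

The source itself is the same with or without modules; only the way the compiler processes the #include changes, which is why making the option the default should be transparent to existing code.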
I plan to upload the actual patch that adds the option mentioned above sometime next week. Once the patch is uploaded, you can experiment with it, see whether it actually mitigates the problem on your end, and give us feedback on this direction.