I'm testing on older OS X 10.8 with older SSE4 hardware from about
2010. I've got updated gear from MacPorts and it includes GCC and
Clang. GCC is the compiler, and Clang is the assembler.
We perform a compile/link on a test file to ensure an ISA is supported
by the toolchain. If an ISA is available then we compile a source file
to the ISA as needed. Then, we guard the higher ISAs at runtime to
avoid SIGILLs. It worked well until we added AVX2.
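For readers who have not seen this layout, the runtime guard might look roughly like the sketch below. It assumes a GCC- or Clang-style `__builtin_cpu_supports` builtin, which is newer than some of the compilers we support, so a real probe would use CPUID directly. The kernel names are hypothetical. Each kernel would normally live in its own source file built with its own `-m` flags; they are plain functions here so the sketch builds as one file.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical base kernel: built with the baseline ISA only.
static void xor_block_base(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] ^= src[i];
}

// Hypothetical stand-in for the kernel that would live in the -mavx2 source file.
static void xor_block_avx2(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] ^= src[i];
}

// Runtime probe. Assumption: the compiler provides __builtin_cpu_supports on x86.
static bool has_avx2() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;  // unknown compiler or non-x86 target: stay on the base path
#endif
}

typedef void (*xor_fn)(std::uint8_t*, const std::uint8_t*, std::size_t);

// The guard: the AVX2 kernel is only reachable when the CPU reports AVX2.
static xor_fn select_xor() {
    return has_avx2() ? xor_block_avx2 : xor_block_base;
}
```

The guard only covers calls that go through `select_xor()`. Anything else the compiler emits into the `-mavx2` object file, such as static initializers, runs no matter what the probe says.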
For AVX2 we see this as expected:
$ CXX=/opt/local/bin/clang++-mp-5.0 make
/opt/local/bin/clang++-mp-5.0 ... -c chacha.cpp
/opt/local/bin/clang++-mp-5.0 ... -mavx2 -c chacha_avx.cpp
/opt/local/bin/clang++-mp-5.0 ... -msse2 -c chacha_simd.cpp
At runtime we catch a SIGILL due to chacha_avx.cpp as shown below. It
looks like global constructors are using instructions from AVX
(vxorps), which is beyond what the machine supports.
How do we tell Clang to use the base ISA for global constructors?
Thanks in advance.
There isn't any way to specifically restrict the ISA for global constructors/inline functions/etc. The inverse works, though: you can specify the base ISA for the whole file, then mark specific functions using __attribute__((target("avx2"))).
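A minimal sketch of that inverse approach, assuming a reasonably recent GCC or Clang on x86 where `<immintrin.h>` can be included without `-mavx2`. The function names are made up for illustration, and the file is meant to be compiled with only the base ISA flags.

```cpp
#include <cstddef>
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
#include <immintrin.h>
#define HAVE_TARGET_AVX2 1

// Only this function is compiled for AVX2. Everything else in the file,
// including any static initializers, uses the base ISA from the command line.
__attribute__((target("avx2")))
static void add_u32_avx2(std::uint32_t* dst, const std::uint32_t* src, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i a = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(dst + i));
        __m256i b = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(src + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(dst + i), _mm256_add_epi32(a, b));
    }
    for (; i < n; ++i) dst[i] += src[i];  // leftover elements
}
#endif

// Base ISA fallback.
static void add_u32_base(std::uint32_t* dst, const std::uint32_t* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] += src[i];
}

// Public entry point: picks the AVX2 path only when the CPU reports support.
void add_u32(std::uint32_t* dst, const std::uint32_t* src, std::size_t n) {
#ifdef HAVE_TARGET_AVX2
    if (__builtin_cpu_supports("avx2")) {
        add_u32_avx2(dst, src, n);
        return;
    }
#endif
    add_u32_base(dst, src, n);
}
```

Callers only ever see `add_u32()`, so no AVX2 instruction is reachable without passing the check first.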
It looks like this is becoming more of a problem as CPUs advance and
folks try to add multiple implementations.
See "Proper way to enable SSE4 on a per-function / per-block of code basis?" on Stack Overflow.
The problem with attributes is that they are too new. They did not appear
until GCC 5 for x86_64, and GCC 6 for ARM. They also seem to be
missing for some platforms, like MIPS and PowerPC. We support back to
GCC 3 and Visual Studio 2002 for our sources so we need something more
Besides the target attribute, there are basically only two possibilities:
1. Modify the source code so the file in question doesn't have any global constructors/functions with weak linkage/etc. (The simplest way to ensure you aren't using any problematic constructs is to use C instead of C++.)
2. Put the code into a separate library and dynamically load it with dlopen().
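The second option could be sketched along these lines. The library path and the exported symbol name `chacha_avx2_process` are hypothetical, and the CPU check is passed in as a flag so the sketch stays independent of any particular probe. The point is that the AVX2 library, and therefore any global constructors inside it, is never loaded on a machine that fails the check. On some systems this needs `-ldl` at link time.

```cpp
#include <dlfcn.h>
#include <cstddef>
#include <cstdint>

typedef void (*process_fn)(std::uint8_t*, std::size_t);

// Base ISA fallback, linked into the main program (illustrative body only).
static void process_base(std::uint8_t* buf, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) buf[i] ^= 0x5a;
}

// Returns the AVX2 entry point from the separately built library when the CPU
// supports it and the library loads; otherwise returns the base fallback.
static process_fn load_process(bool cpu_has_avx2, const char* lib_path) {
    if (!cpu_has_avx2) return process_base;  // never touch the AVX2 library

    void* handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) return process_base;

    // Hypothetical symbol name; it would need C linkage in the library.
    void* sym = dlsym(handle, "chacha_avx2_process");
    if (!sym) {
        dlclose(handle);
        return process_base;
    }
    return reinterpret_cast<process_fn>(sym);
}
```

The handle is deliberately left open on success, since the returned function pointer is only valid while the library stays loaded.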