Our language makes heavy use of vectors, and I’m looking to make better use of the SSE instructions on my chips. Based on the experiments I did last week and the help you gave me with generating the right IR to get sound SSE code, I’m ready to begin a major overhaul of our system. One big question remains: if I’m running on an x86 system that is, say, SSE1-only, and we’re working with vectors of doubles, what happens when we compile? Does the compiler know to scalarize those vectors and use different instructions?
I spent some time today getting to know the X86Subtarget code better and
artificially lowered the processor capabilities to check that
things would still go smoothly. They do, so please ignore this
The answer is a resounding "maybe". For simple vector operations, the code generator does a good job of scalarizing them. It can even split an <8 x float> into two <4 x float> values.
There are some caveats, though: this code is not heavily tested, so you may run into bugs or unimplemented features. Further, if you're using SSE intrinsics to get at comparisons or other SSE features that can't be expressed with the native LLVM instructions, those intrinsics will not be scalarized automatically.
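To make the two strategies mentioned above a bit more concrete, here is a rough sketch in plain C of what "splitting" and "scalarizing" amount to for a simple add. This is only an illustration of the idea, not what the code generator actually emits: the struct types (float4, float8, double2) and the function names are hypothetical stand-ins made up for this example, and nothing here uses real LLVM or SSE APIs.

```c
/* Hypothetical stand-ins for vector types; these are NOT LLVM or SSE types. */
typedef struct { float lane[4]; } float4;   /* models <4 x float>  */
typedef struct { float lane[8]; } float8;   /* models <8 x float>  */
typedef struct { double lane[2]; } double2; /* models <2 x double> */

/* A 4-wide float add: the width a single SSE1 register can hold. */
float4 add_float4(float4 a, float4 b) {
    float4 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Splitting: an 8-wide add is done as two independent 4-wide adds. */
float8 add_float8_split(float8 a, float8 b) {
    float4 a_lo, a_hi, b_lo, b_hi;
    for (int i = 0; i < 4; ++i) {
        a_lo.lane[i] = a.lane[i];
        a_hi.lane[i] = a.lane[i + 4];
        b_lo.lane[i] = b.lane[i];
        b_hi.lane[i] = b.lane[i + 4];
    }
    float4 r_lo = add_float4(a_lo, b_lo);
    float4 r_hi = add_float4(a_hi, b_hi);

    float8 r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = r_lo.lane[i];
        r.lane[i + 4] = r_hi.lane[i];
    }
    return r;
}

/* Scalarizing: with no packed-double support (the SSE1-only case),
   each lane is handled as an ordinary scalar operation. */
double2 add_double2_scalarized(double2 a, double2 b) {
    double2 r;
    r.lane[0] = a.lane[0] + b.lane[0];
    r.lane[1] = a.lane[1] + b.lane[1];
    return r;
}
```

The point of the sketch is that both transformations are mechanical for plain element-wise operations like add, which is why they can be done for you. A target-specific intrinsic has no such generic per-lane definition available, which fits with the caveat that intrinsics are left alone.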