[libc++] Implementing P0214R7

Hi all,

A few weeks ago I chatted with Marshall about adding a P0214 implementation to libc++, and it seemed possible.

We (Google) do have an implementation (Dimsum, https://github.com/google/dimsum), but it’s heavily extended, incomplete, and non-conforming. To implement P0214 in libc++, it might be easier to start from scratch.

Here is the plan:
(1) C++11 compatible scalar implementation for simd<> and related traits/operations.
(2) Add an ABI that uses vector registers. Basically, this means storing the data in __m128i/__m128 on x86.
(3) C++11 compatible simplistic implementation for simd_mask<T, ABI> and where expressions in terms of simd<U, ABI>, where U is an unsigned integer type with sizeof(U) == sizeof(T).
(4) (optional and low priority) Implementation of [simd.math]. Start with a scalar + for loop implementation.
(5) Optimize by specializing for different platforms and using native intrinsics. FWIW, modern compilers are good at auto-vectorizing for-loops and scalar operations, so we might not need that many specializations.
(6) (unrelated to libc++) Rebase Dimsum onto libc++ and keep Dimsum evolving. We’ll propose some of Dimsum’s extensions to the TS.
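
To make steps (1) and (3) a bit more concrete, here is a rough C++11 sketch of what I have in mind. All names (sketch, fixed_simd, fixed_mask, uint_of_size, masked_assign) are made up for illustration and are neither the P0214 interface nor Dimsum's; the all-ones/zero encoding for true/false lanes is also just an assumption here. The real thing would be simd<T, Abi>, simd_mask<T, Abi> and where().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is not the P0214 or libc++ API.
namespace sketch {

// Step (3): map sizeof(T) to an unsigned integer type U of the same size.
template <std::size_t Bytes> struct uint_of_size;
template <> struct uint_of_size<1> { typedef std::uint8_t type; };
template <> struct uint_of_size<2> { typedef std::uint16_t type; };
template <> struct uint_of_size<4> { typedef std::uint32_t type; };
template <> struct uint_of_size<8> { typedef std::uint64_t type; };

// Step (1): scalar storage, a plain array operated on by for loops.
template <class T, std::size_t N>
struct fixed_simd {
  T data[N];
  static constexpr std::size_t size() { return N; }
  T operator[](std::size_t i) const { assert(i < N); return data[i]; }
};

// Element-wise addition; the loop is left to the auto-vectorizer.
template <class T, std::size_t N>
fixed_simd<T, N> operator+(const fixed_simd<T, N>& a,
                           const fixed_simd<T, N>& b) {
  fixed_simd<T, N> r;
  for (std::size_t i = 0; i < N; ++i) r.data[i] = a.data[i] + b.data[i];
  return r;
}

// Step (3): the mask is stored as a simd of U.
// Assumed encoding: all bits set means true, zero means false.
template <class T, std::size_t N>
struct fixed_mask {
  typedef typename uint_of_size<sizeof(T)>::type U;
  static_assert(sizeof(U) == sizeof(T), "U must match the size of T");
  fixed_simd<U, N> bits;
  bool operator[](std::size_t i) const { return bits[i] != 0; }
};

// Element-wise comparison producing a mask.
template <class T, std::size_t N>
fixed_mask<T, N> operator<(const fixed_simd<T, N>& a,
                           const fixed_simd<T, N>& b) {
  typedef typename fixed_mask<T, N>::U U;
  fixed_mask<T, N> m;
  for (std::size_t i = 0; i < N; ++i)
    m.bits.data[i] = a.data[i] < b.data[i] ? static_cast<U>(~U(0)) : U(0);
  return m;
}

// Stand-in for a where expression: copy src into dst on the true lanes only.
template <class T, std::size_t N>
void masked_assign(const fixed_mask<T, N>& m, fixed_simd<T, N>& dst,
                   const fixed_simd<T, N>& src) {
  for (std::size_t i = 0; i < N; ++i)
    if (m[i]) dst.data[i] = src.data[i];
}

}  // namespace sketch
```

With something like this, `masked_assign(a < b, a, b)` leaves the element-wise maximum of a and b in a. Step (2) would then swap the array member for __m128/__m128i behind a different ABI tag without changing the interface.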

Does this sound like a reasonable plan?