I would like to restart the conversation regarding adding SVE support to LLVM. This time I am framing things from the code generation point of view, because our immediate priority is llvm-mc support rather than auto-vectorisation. Can you please review the following text outlining the MVT changes we would like to make so that SVE instructions can be added to the AArch64 target.
My overriding question is whether you think the new MVTs are acceptable and, in addition, whether you agree it makes sense to replicate the change in the IR's type system so that all vector MVTs are representable within the IR. I have two patches, which need updating, for the IR type changes (D27101 [Type] Extend VectorType to support scalable vectors [IR support for SVE scalable vectors 1/4], D27102 [Constants] Add scalable vector support to ConstantVector::getSplat [IR support for SVE scalable vectors 2/4]) to help visualise the effect on the IR.
SVE data vectors have a length of (VL x 128) bits, where VL is a runtime property. From an instruction selection point of view the value of VL is unimportant beyond the fact that it exists. I say this because, taking addition as an example, there is only a single SVE instruction to unconditionally add two vectors of int64_ts, namely:
(1) ADD <Zd>.d, <Zn>.d, <Zm>.d
Given the runtime size of a Z (SVE data) register is (VL x 128) bits, such an instruction is defined as operating on (VL x 2) int64_ts. Likewise unconditionally adding two vectors of int32_ts:
(2) ADD <Zd>.s, <Zn>.s, <Zm>.s
operates on (VL x 4) int32_ts. In both cases the bit length of the data processed is the same (i.e. VL x 128 bits).
Given the equivalent instructions for Neon make use of MVTs of the form:
v <#Elements> <ElementType>
it seems logical to use the same scheme for SVE but also incorporate the implicit (VL x) multiple to distinguish them from existing vector types. Hence we are proposing that each vector MVT gains a scalable vector MVT equivalent.
MVT::v2i32 -> MVT::nxv2i32
MVT::v2i64 -> MVT::nxv2i64
MVT::v4i32 -> MVT::nxv4i32
MVT::v4i64 -> MVT::nxv4i64
....likewise for all <#Elements> and <ElementType> combinations
The resulting SVE instruction selection patterns are:
(1) Pat<(nxv2i64 (add (nxv2i64 $zn), (nxv2i64 $zm))), (ADD_D_ZZZ $zn, $zm)>  // New SVE pattern
(2) Pat<(nxv4i32 (add (nxv4i32 $zn), (nxv4i32 $zm))), (ADD_S_ZZZ $zn, $zm)>  // New SVE pattern
    Pat<(v2i64 (add (v2i64 $vn), (v2i64 $vm))), (ADD_D_VVV $vn, $vm)>        // Existing pattern
Using these MVTs we treat those where (#Elements * sizeof(ElementType) == 128 bits) as legal, with the others promoted or split accordingly. Floating point and boolean vector MVTs are the exception, whereby types smaller than the usually legal size are also legal and are considered to contain unpacked data within a larger container. The type legalisation of:
nxv2i8 ADD(nxv2i8, nxv2i8)
becomes:
nxv2i8 TRUNC(nxv2i64 ADD(nxv2i64 ZERO_EXTEND(nxv2i8), nxv2i64 ZERO_EXTEND(nxv2i8)))
Much of the legalisation code is common to all targets and, by introducing scalable vector MVTs, it also applies to SVE, so long as the "scalable" flag is preserved when transforming MVTs.
To achieve this we want to popularise the use of functions like EVT::getHalfSizedIntegerVT, as well as replace some common-code uses of getVectorNumElements with another function that treats #Elements and the "scalable" flag as opaque data, using operator overloading when extending/shrinking #Elements proportionately.
In the worst case, any common code that can never work for scalable vectors would be guarded by the "scalable" flag.