[RFC][SVE] Extend vector types to support SVE registers.

Hi,

I would like to restart the conversation regarding adding SVE support to LLVM. This time I am framing things from the code generation point of view, because our immediate priority is llvm-mc support rather than auto-vectorisation. Can you please review the following text outlining the MVT changes we would like to make so that SVE instructions can be added to the AArch64 target?

My overriding question is whether you think the new MVTs are acceptable and in addition if you agree it makes sense to replicate the change to the IR's type system so that all vector MVTs are representable within the IR. I have two needs-to-be-updated patches for the IR type changes (https://reviews.llvm.org/D27101, https://reviews.llvm.org/D27102) to help visualise the effect on the IR.

Many Thanks

  Paul!!!

SVE data vectors have a length of (VL x 128) bits, where VL is a runtime property. From an instruction selection point of view the value of VL is unimportant beyond the fact that it exists. Taking addition as an example, there is only a single SVE instruction to unconditionally add two vectors of int64_ts, namely:

(1) ADD <Zd>.d, <Zn>.d, <Zm>.d

Given the runtime size of a Z (SVE data) register is (VL x 128) bits, such an instruction is defined as operating on (VL x 2) int64_ts. Likewise unconditionally adding two vectors of int32_ts:

(2) ADD <Zd>.s, <Zn>.s, <Zm>.s

operates on (VL x 4) int32_ts. In both cases the bit length of the data processed is the same (i.e. VL * 128).

Given that the equivalent Neon instructions make use of MVTs of the form:

  v <#Elements> <ElementType>

it seems logical to use the same scheme for SVE, but also to incorporate the implicit (VL x) to distinguish these from the existing vector types. Hence we are proposing that each vector MVT have a scalable vector MVT equivalent:

  MVT::v2i32 -> MVT::nxv2i32
  MVT::v2i64 -> MVT::nxv2i64
  MVT::v4i32 -> MVT::nxv4i32
  MVT::v4i64 -> MVT::nxv4i64
  ....likewise for all <#Elements> and <ElementType> combinations

The resulting SVE instruction selection patterns are:

(1) Pat<(nxv2i64 (add (nxv2i64 $zn), (nxv2i64 $zm))), (ADD_D_ZZZ $zn, $zm)>; // New SVE pattern
(2) Pat<(nxv4i32 (add (nxv4i32 $zn), (nxv4i32 $zm))), (ADD_S_ZZZ $zn, $zm)>; // New SVE pattern
    Pat<(v2i64 (add (v2i64 $vn), (v2i64 $vm))), (ADD_D_VVV $vn, $vm)>; // Existing pattern

Using these MVTs we treat those where (#Elements * sizeof(ElementType) == 128 bits) as legal, with the others promoted or split accordingly. Floating point and boolean vector MVTs are the exception, whereby types smaller than the usual legal size are also legal and are considered to contain unpacked data within a larger container. The type legalisation of:

  nxv2i8 ADD(nxv2i8, nxv2i8)

results in:

  nxv2i8 TRUNC(nxv2i64 ADD(nxv2i64 ZERO_EXTEND(nxv2i8), nxv2i64 ZERO_EXTEND(nxv2i8)))

Much of the legalisation code is common to all targets, and by introducing scalable vector MVTs that common code also applies to SVE, so long as the "scalable" flag is preserved when transforming MVTs.

To achieve this we want to popularise the use of functions like EVT::getHalfSizedIntegerVT, and to replace some common-code uses of getVectorNumElements with a function that passes #Elements and the "scalable" flag as opaque data, using operator overloading when extending/shrinking #Elements proportionately.

In the worst case, any common code that can never work for scalable vectors would be guarded by the "scalable" flag.

Hi Paul,

Nice to see new efforts in this area! :)

        MVT::v2i32 -> MVT::nxv2i32
        MVT::v2i64 -> MVT::nxv2i64
        MVT::v4i32 -> MVT::nxv4i32
        MVT::v4i64 -> MVT::nxv4i64
        ....likewise for all <#Elements> and <ElementType> combinations

I'm ok with this notation, which can be easily represented by an
additional scalable flag.

        nxv2i8 TRUNC(nxv2i64 ADD(nxv2i64 ZERO_EXTEND(nxv2i8), nxv2i64 ZERO_EXTEND(nxv2i8)))

        Much of the legalisation code is common to all targets, and by introducing scalable vector MVTs that common code also applies to SVE, so long as the "scalable" flag is preserved when transforming MVTs.

This seems to work well with the new flag.

        To achieve this we want to popularise the use of functions like EVT::getHalfSizedIntegerVT, and to replace some common-code uses of getVectorNumElements with a function that passes #Elements and the "scalable" flag as opaque data, using operator overloading when extending/shrinking #Elements proportionately.

This makes sense for SVE, which has the concept of "multiple
multiples" of the scalar type instead of just larger vectors of
unknown size. I don't know how RISC-V does it, but it would be good to
know if it's at least similar.

        In the worst case, any common code that can never work for scalable vectors would be guarded by the "scalable" flag.

In the case of SIMD vs. SVE, I imagine the EVT functions above would
be "simple if/else on the scalable flag" wrappers. In the case where
RISC-V is different, or if Intel wants to avoid larger vector sizes
for AVX1024, it would be slightly more complicated, but not by much.

Overall the idea looks sane and simple to me. Of course, we need to
wait for other people to offer their arguments, but having some code
to look at would help to gauge the amount of changes necessary (I'm
not expecting much).

A simple split for this series would be: new helpers working on SIMD,
new types supported, then expanded helpers to support scalable types.

cheers,
--renato

The RISC-V vector ISA remains a draft (see
https://github.com/riscv/riscv-isa-manual/blob/master/src/v.tex or
https://riscv.org/wp-content/uploads/2016/12/Wed0930-RISC-V-Vectors-Asanovic-UC-Berkeley-SiFive.pdf).
However, in its current incarnation there is no restriction that a
vector must be a given multiple of 128 bits. For any valid vector
configuration (e.g. a vector of F64 elements), you can depend on the
fact that requesting a vector length of 4 will always succeed.

Best,

Alex