[MLIR] Multi-dimensional lowering of scalable vectors in MLIR

@stevenvar and I are taking over @giuseros's work on SVE handling in MLIR (Linalg and masking). His work targeted Axpy, but we would now like to target multi-rank operations.

We came across this RFC: [RFC] Add built-in support for scalable vector types - #5 by javiersetoain, which suggests that types like those ‘would be supported’.


I tried to lower a few examples without much luck: one with a fixed leading dimension and a scalable trailing one, and one with all dimensions scalable, as shown here.

module {
  func.func @lower_fixed_scalable(%arg0 : vector<4x[4]xf32>) -> vector<4x[4]xf32> {
    return %arg0 : vector<4x[4]xf32>
  }

  func.func @lower_multi_dim_scalable(%arg0: vector<[3x4]xf32>) -> vector<[3x4]xf32> {
    return %arg0 : vector<[3x4]xf32>
  }
}

I managed to generate the LLVM dialect for the first example by running mlir-opt with -lower-vector-mask -test-vector-transferop-opt -convert-vector-to-llvm="enable-arm-sve" -convert-func-to-llvm. I get this output, which looks good to me.

  llvm.func @lower_scalable(%arg0: i64, %arg1: !llvm.array<4 x vector<[4]xf32>>) -> !llvm.array<4 x vector<[4]xf32>> {
    llvm.return %arg1 : !llvm.array<4 x vector<[4]xf32>>
  }

But mlir-translate --mlir-to-llvmir then fails, saying those vector types are not suitable. Indeed, LLVM has this check:

bool ArrayType::isValidElementType(Type *ElemTy) {
  // Scalable vectors are rejected as array element types.
  return […] && !isa<ScalableVectorType>(ElemTy);
}

For the second example, mlir-opt fails while trying to generate the LLVM arrays. Did I miss something in the lowering of those cases?

Is there an unrolling/flattening technique that would make it possible to drop the first n-1 dimensions?

Hugo Trachino.


IIRC, SVE in LLVM had the concept of going through memory. IDK if they still do this.

Dream: you could store your n-dimensional scalable thing in memory and read many (n-1)-dimensional things back.
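As a rough illustration of the store-then-reload idea, here is a toy C++ model, not MLIR or LLVM code. It assumes a vector<4x[4]xf32> value occupies 4 rows of 4 * vscale contiguous floats in memory. The names Scalable2D, storeRows and loadRow are hypothetical and exist only for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Toy model: a 2-D value with a fixed leading dimension (rows) and a
// scalable trailing dimension (baseLanes * vscale lanes per row).
// vscale is only known at run time, so it is an ordinary field here.
struct Scalable2D {
  std::size_t rows;       // fixed leading dimension, e.g. 4
  std::size_t baseLanes;  // the "[4]" in vector<4x[4]xf32>
  std::size_t vscale;     // hardware-dependent multiplier (assumed known here)
  std::size_t lanes() const { return baseLanes * vscale; }
};

// "Store" the whole 2-D value as one flat, row-major buffer.
std::vector<float> storeRows(const Scalable2D &shape,
                             const std::vector<std::vector<float>> &value) {
  std::vector<float> memory;
  memory.reserve(shape.rows * shape.lanes());
  for (const auto &row : value)
    memory.insert(memory.end(), row.begin(), row.end());
  return memory;
}

// "Load" one 1-D scalable row back: a contiguous slice of lanes() elements.
std::vector<float> loadRow(const Scalable2D &shape,
                           const std::vector<float> &memory, std::size_t row) {
  auto begin =
      memory.begin() + static_cast<std::ptrdiff_t>(row * shape.lanes());
  auto end = begin + static_cast<std::ptrdiff_t>(shape.lanes());
  return std::vector<float>(begin, end);
}
```

The point of the sketch is only that each reloaded row is a plain 1-D scalable vector, which is the shape LLVM can represent directly.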

We thought about implementing such a feature, inspired by the VectorFMAOpNDRewritePattern in ConvertVectorToLLVM.cpp, but it does not completely get rid of the first dimension; it just extracts 1-D vectors out of it. You’d still have a multi-dimensional scalable vector somewhere. It also raises another problem for types where several ranks are scalable, such as vector<[2x8]xi8>. Would you unroll it twice and then have a “mask on the execution”?
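The leading-dimension unrolling mentioned here can be pictured with a toy C++ model (not the actual MLIR pattern): an FMA over a 4x[4] value becomes one 1-D FMA per row. The names Row, fma1D and fmaUnrolled are hypothetical, and each Row stands in for a 1-D scalable vector whose length is only fixed at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<float>;  // stand-in for one 1-D scalable vector

// 1-D case: the only shape that maps directly onto a hardware vector.
Row fma1D(const Row &a, const Row &b, const Row &c) {
  assert(a.size() == b.size() && b.size() == c.size());
  Row result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    result[i] = a[i] * b[i] + c[i];
  return result;
}

// 2-D case: peel the fixed leading dimension, one 1-D FMA per row.
// The 2-D container still exists as the carrier of the rows, which mirrors
// the limitation described in the post above.
std::vector<Row> fmaUnrolled(const std::vector<Row> &a,
                             const std::vector<Row> &b,
                             const std::vector<Row> &c) {
  assert(a.size() == b.size() && b.size() == c.size());
  std::vector<Row> result;
  result.reserve(a.size());
  for (std::size_t r = 0; r < a.size(); ++r)
    result.push_back(fma1D(a[r], b[r], c[r]));  // "extract", compute, "insert"
  return result;
}
```

This only covers a fixed leading dimension. A type with several scalable ranks has no compile-time row count to loop over, which is where the masking question comes in.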

I don’t know the details yet. At a higher level of abstraction you probably need an extract op to reduce dimensionality. There is an extract op for the interplay between scalable and fixed-length vectors.

To target LLVM, you have to find a way to get down to 1-D scalable vectors with predicates. Write the whole thing to memory and read 1-D scalable vectors back from it?

Hello, I have a follow-up question about the lowering to LLVM: are LLVM arrays containing multiple scalable vectors really conceptually impossible?

Indeed, I see there’s a check in LLVM that rejects an element type that is a scalable vector (as @nujaa has shown), but I also see in D94142 [IR] Allow scalable vectors in structs to support intrinsics returning multiple values. (llvm.org) that the same check existed for LLVM structs before this patch made it possible for structs to contain scalable vectors.

I am far from an expert in the specifics of the LLVM backend, so I wonder whether this limitation could be removed easily, and whether allowing arrays of scalable vectors is feasible. Surely if structs can have them, then arrays can too?

EDIT: From what I understand, the decision to reject scalable vectors inside structs and arrays was mainly motivated by performance issues (cf. D64079 Scalable Vector IR Type (Try 3) (llvm.org)) rather than by an actual conceptual reason. Is that correct?

There’s no conceptual reason not to allow arrays and structs of scalable vectors, but there is a significant body of code that assumes the offset required to walk such data structures (especially when considering structs that contain both scalable and fixed-size types) can be represented as a single integer. Whilst this is fixable, the cost exceeded the benefit, especially given we felt we’d be unlikely to get a similar change into C/C++.
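To make the "single integer" point concrete, here is a toy C++ model (not LLVM code). It assumes a packed struct with no alignment padding, and describes each field size as a fixed byte count plus a byte count multiplied by vscale. The names Offset, Field, offsetOfField and resolve are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A size or offset of the form: fixedBytes + scalableBytes * vscale.
struct Offset {
  std::uint64_t fixedBytes = 0;
  std::uint64_t scalableBytes = 0;  // multiplied by vscale at run time
  // True only when the offset is known without knowing vscale.
  bool isCompileTimeConstant() const { return scalableBytes == 0; }
};

// Field sizes use the same two-part form.
using Field = Offset;

// Offset of field `index` in a packed struct (padding ignored: an assumption).
Offset offsetOfField(const std::vector<Field> &fields, std::size_t index) {
  Offset offset;
  for (std::size_t i = 0; i < index && i < fields.size(); ++i) {
    offset.fixedBytes += fields[i].fixedBytes;
    offset.scalableBytes += fields[i].scalableBytes;
  }
  return offset;
}

// The concrete byte offset only exists once vscale is known.
std::uint64_t resolve(const Offset &offset, std::uint64_t vscale) {
  return offset.fixedBytes + offset.scalableBytes * vscale;
}
```

For a struct laid out as { i32, <vscale x 4 x float>, i32 }, the last field sits at 4 + 16 * vscale bytes in this model, so code that stores the offset as one plain integer has nowhere to put the vscale-dependent part.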

D94142 is really a soft enablement of structs of scalable vectors, because there’s a requirement that such structs are effectively in-register only. This was done to allow multi-vector returns without having to resort to silly vector types that cause unnecessary complication during code generation. The expectation is that the only supported operations for such types are getelement and setelement.