Can't convert strided memref to LLVM

Hello all!
I want to use strided memrefs to guarantee 256-bit memory alignment of the beginning of each matrix row (even when the input row size is not divisible by 256 bits), so that I can use fast aligned 256-bit vector loads/stores.
Following the MLIR docs, I create an affine map that explicitly defines the strides of a 2D memref:

#my_map = affine_map<(d0, d1) -> (d0 * 2048 + d1)>

Then I allocate strided memref:

%B = alloc() {alignment = 32} : memref<2048x2048xf32, #my_map>

Then I try to fill it with ones:

%cf1 = constant 1.00000e+00 : f32
linalg.fill(%B, %cf1) : memref<2088x2048xf32, #my_map>, f32

Then I process this MLIR code with the following command:

mlir-opt sgemm-tiled-benchmark-my-maps.mlir -convert-linalg-to-loops -lower-affine -convert-scf-to-std -convert-std-to-llvm -canonicalize > a1.llvm

Running this command produces the error:

sgemm-tiled-benchmark-my-maps.mlir:12:3: error: 'std.store' op operand #2 must be index, but got '!llvm.i64'
linalg.fill(%A, %cf1) : memref<2088x2048xf32, #my_map>, f32
^
sgemm-tiled-benchmark-my-maps.mlir:12:3: note: see current operation: "std.store"(%0, %6, %9, %11) : (!llvm.float, memref<2088x2048xf32, affine_map<(d0, d1) -> (d0 * 2048 + d1)>>, !llvm.i64, !llvm.i64) -> ()

The problem occurs in the -convert-std-to-llvm pass. If I remove -convert-std-to-llvm from the pipeline, there are no errors, i.e. the other passes 'think' everything is fine.
What am I doing wrong? Why does this error happen?
If I don't use #my_map with strides, everything works and there are no errors.

Thank you and BR, Oleg


I am also interested in a follow-up to this. I stumbled on a similar issue when exploring affine_map<> with memref types and alloc ops.

Furthermore, in the past, the test below checked both an identity and a non-identity map during convert-std-to-llvm, but now it only checks the identity layout with alloc:

Conversion to the LLVM dialect only supports alloc for memrefs with an identity layout; that is why the test was removed in [mlir] Require std.alloc() ops to have canonical layout during LLVM l… · llvm/llvm-project@04481f2 · GitHub. In the general case, it is impossible to compute the number of contiguous elements to allocate for a memref with an arbitrary affine layout. Strided memrefs can be obtained by allocating a contiguous memref with a sufficient number of elements and taking a view of it.
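A minimal sketch of that approach, for an aligned-rows use case like the one above (the dimensions here are hypothetical, chosen so a 1000-element f32 row is padded to 1008 elements, making each row start a multiple of 32 bytes; op names and exact subview syntax follow the std-dialect era of this thread and may differ in your MLIR version):

// Allocate a contiguous, aligned buffer whose rows are padded to a
// multiple of 8 f32s (256 bits), then take a strided subview of the
// logical 2088x1000 matrix inside it.
%big = alloc() {alignment = 32} : memref<2088x1008xf32>
%B = subview %big[0, 0][2088, 1000][1, 1]
    : memref<2088x1008xf32> to
      memref<2088x1000xf32, affine_map<(d0, d1) -> (d0 * 1008 + d1)>>

The alloc itself keeps the identity layout, so it converts to LLVM; only the subview result carries the strided layout, which the conversion represents in the memref descriptor.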

The conversion fails because it doesn’t convert the alloc but does convert the uses of the allocated memref. Arguably, it should inject casts instead.


Thank you for the reply!

This makes sense.
Are there any other targets taking advantage of it yet?
Could you point to a test or code of a different conversion target that leverages a non-identity layout? I am trying to understand how one skips LLVM and maps directly to an accelerator that uses it.

Thank you for the answer, Alex.
I will try to implement this in the manner you have suggested.

There's no code in-tree that uses it, as far as I know. Folks downstream seem to use it in some affine passes; see previous discussions:
