Several memref.subviews in an affine.for

I have an nD memref that I am taking rank-reducing subviews of.
For example, memref<2x5xi64> -> memref<5xi64>: I want two subviews, one per row, each covering that row's 5 elements.

E.g., in NumPy style syntax:

x = np.zeros((2, 5))
view_1 = x[0,:]
view_2 = x[1,:]
# do stuff with view_1
# do stuff with view_2

I can do this by writing out each subview manually, with an offset of 5 on the 2nd one (see first code listing below).

However, I would like to do this a bit more compactly, for example in an affine.for loop.
In NumPy style syntax this might be:

x = np.zeros((2, 5))
for i in range(x.shape[0]):
    view = x[i,:]
    # do stuff
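To make the intent concrete, here is a small runnable NumPy sketch (the data values are hypothetical, just to have something to print): each per-row view corresponds to a flat element offset of i * 5 into the underlying buffer, which is exactly the offset the MLIR loop below tries to compute with arith.muli.

```python
import numpy as np

# Hypothetical data so the example is runnable.
x = np.arange(10, dtype=np.int64).reshape(2, 5)

for i in range(x.shape[0]):
    # x[i, :] is a view into x's buffer at flat element offset i * 5.
    offset = i * x.shape[1]
    view = x.reshape(-1)[offset:offset + x.shape[1]]
    # The view aliases x's storage; no data is copied.
    assert np.shares_memory(view, x)
    print(view.tolist())
```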

The way I’m doing it right now doesn’t work (see the 2nd code listing): I get `error: unexpected ssa identifier` because the offset comes from an SSA value rather than being a hardcoded constant.

Is there a way to achieve what I need?
I’ve been looking at how to use affine_map, but it’s not entirely clear to me what I would need to do.

Code listing 1: manual approach

func @main() {
  %i0 = arith.constant 0 : index

  %c0 = arith.constant 0 : i64
  %c1 = arith.constant 1 : i64
  %c2 = arith.constant 2 : i64
  %c3 = arith.constant 3 : i64

  %0 = memref.alloc() : memref<2x5xi64>

  affine.store %c1, %0[0, 0] : memref<2x5xi64>
  affine.store %c2, %0[1, 0] : memref<2x5xi64>
  affine.store %c3, %0[1, 1] : memref<2x5xi64>

  %1 = memref.subview %0[0,0][1,5][1,1] : memref<2x5xi64> to memref<5xi64, affine_map<(d0) -> (d0 + 0)>>
  %2 = memref.subview %0[1,0][1,5][1,1] : memref<2x5xi64> to memref<5xi64, affine_map<(d0) -> (d0 + 5)>>


  %v1 = vector.load %1[%i0] : memref<5xi64>, vector<5xi64>

  %v2 = vector.load %2[%i0] : memref<5xi64,affine_map<(d0) -> (d0 + 5)>>, vector<5xi64>

  vector.print %v1 : vector<5xi64>
  vector.print %v2 : vector<5xi64>

  return
}

Code listing 2: affine.for approach (invalid)

func @main() {
  %i0 = arith.constant 0 : index

  %c0 = arith.constant 0 : i64
  %c1 = arith.constant 1 : i64
  %c2 = arith.constant 2 : i64
  %c3 = arith.constant 3 : i64

  %0 = memref.alloc() : memref<2x5xi64>

  affine.store %c1, %0[0, 0] : memref<2x5xi64>
  affine.store %c2, %0[1, 0] : memref<2x5xi64>
  affine.store %c3, %0[1, 1] : memref<2x5xi64>

  %c5 = arith.constant 5 : i64

  affine.for %arg0 = 0 to 2 {
      %i = arith.index_cast %arg0 : index to i64
      %offset = arith.muli %i, %c5 : i64
      %1 = memref.subview %0[0,0][1,5][1,1] : memref<2x5xi64> to memref<5xi64, affine_map<(d0) -> (d0 + %offset)>>
      %v1 = vector.load %1[%i0] : memref<5xi64, affine_map<(d0) -> (d0 + %offset)>>, vector<5xi64>
      vector.print %v1 : vector<5xi64>
  }

  return
}

Seems you’d want to replace:

affine.for %arg0 = 0 to 2 {
  %i = arith.index_cast %arg0 : index to i64
  %offset = arith.muli %i, %c5 : i64
  %1 = memref.subview %0[0,0][1,5][1,1] : memref<2x5xi64> to memref<5xi64, affine_map<(d0) -> (d0 + %offset)>>

with something resembling

affine.for %arg0 = 0 to 2 {
  %1 = memref.subview %0[%arg0,0][1,5][1,1] : memref<2x5xi64> to memref<5xi64, affine_map<(d0)[s0] -> (d0 + s0)>>
      

Also, instead of digging deeper into memrefs with affine-map layout specs, I recommend updating MLIR and using the strided layout form; i.e., instead of

memref<5xi64, affine_map<(d0)[s0] -> (d0 + s0)>>

you’d have

memref<5xi64, strided<[1], offset: ?>>

which scales much better.
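To see what strided<[1], offset: ?> means concretely, here is a hedged NumPy sketch (as_strided and the int64 itemsize are illustration choices on my part, not part of the MLIR answer): each row view is the same underlying buffer with a stride of 1 element and a dynamic element offset.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(10, dtype=np.int64).reshape(2, 5)

for i in range(x.shape[0]):
    # Analogue of memref<5xi64, strided<[1], offset: ?>>:
    # stride of 1 element, element offset i * 5 into the buffer.
    # NumPy strides are in bytes, hence itemsize (8 for int64).
    view = as_strided(x.reshape(-1)[i * 5:], shape=(5,), strides=(x.itemsize,))
    print(view.tolist())
```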

Many thanks, this has clarified things a lot, and does work.

I will look at updating MLIR at some point, so I can use that more concise syntax.