Linalg.generic and reading elements two-by-two

Hi all,

I am still quite new to all things MLIR-related, and I was wondering whether there is a way for linalg.generic to read two elements at a time from a tensor/memref (perhaps with a stride?).

I am thinking of a way to write something like:

linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel"]} 
   ins(%in : tensor<32xf32, strides:[2]>) outs(%out : tensor<16xf32, strides:[1]>) {
        ^bb0(%x: f32, %y: f32, %z: f32):
          %res = arith.add %x, %y : f32
          linalg.yield %res : f32

where x and y would both come from the input (%in), and be two consecutive elements.

Thanks in advance!

This looks like a reduction on %in, and you can represent it that way in linalg. Use tensor.expand_shape to reshape the 32-element input into 16x2, then reduce over the inner dimension of size 2. It would look like this:

// Identity value of the reduction (0.0 for an addition).
%cst = arith.constant 0.0 : f32
%0 = tensor.expand_shape %in [[0, 1]] : tensor<32xf32> into tensor<16x2xf32>
// The output tensor must be initialized to the identity value (this usually gets optimized away after vectorization).
%1 = linalg.init_tensor [16] : tensor<16xf32>
%init = linalg.fill ins(%cst : f32) outs(%1 : tensor<16xf32>) -> tensor<16xf32>
%out = linalg.generic {indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>, affine_map<(d0, d1) -> (d0)>], iterator_types = ["parallel", "reduction"]}
   ins(%0 : tensor<16x2xf32>) outs(%init : tensor<16xf32>) {
        // %x is the current input element, %z is the running sum for this output element.
        ^bb0(%x: f32, %z: f32):
          %res = arith.addf %x, %z : f32
          linalg.yield %res : f32
   } -> tensor<16xf32>
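
To make the semantics concrete, here is a rough plain-Python sketch of what the snippet above is meant to compute. It only models the behaviour and is not how linalg is lowered; the function name pairwise_sums and the sample data are made up for illustration.

```python
def pairwise_sums(values):
    """Sum consecutive pairs of a flat list: out[i] = values[2*i] + values[2*i + 1]."""
    if len(values) % 2 != 0:
        raise ValueError("input length must be even")

    # Models tensor.expand_shape: view the flat input as rows of 2 elements (32 -> 16x2).
    rows = [values[i:i + 2] for i in range(0, len(values), 2)]

    # Models linalg.fill: initialize the output to the identity of addition.
    out = [0.0] * len(rows)

    # Models linalg.generic: d0 is the parallel dimension, d1 the reduction dimension.
    for d0, row in enumerate(rows):
        for d1 in range(2):
            # Region body: x is the input element, z is the running output value.
            x, z = row[d1], out[d0]
            out[d0] = x + z
    return out


# Hypothetical sample input: 0.0, 1.0, ..., 31.0
data = [float(i) for i in range(32)]
result = pairwise_sums(data)
```

Each output element ends up holding the sum of two consecutive input elements, which is what the original question asks for, without needing a strided read.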