Linalg.generic with projection maps!

Hi Everyone,
Is it possible to make linalg.generic operate on a “projection of a tensor” instead of on individual elements? Something like this:

#projection_map = affine_map<(i, j) -> (j)>
#attrs = {
  indexing_maps = [#projection_map, #projection_map, #projection_map],
  iterator_types = ["parallel"]
}

%0 = linalg.generic #attrs ins(%A, %B : tensor<5x?xf64>, tensor<5x?xf64>) outs(%C : tensor<5x?xf64>) {
  ^bb0(%a: tensor<5xf64>, %b: tensor<5xf64>, %c: tensor<5xf64>):
    %r = call @some_func(%a, %b) : (tensor<5xf64>, tensor<5xf64>) -> tensor<5xf64>
    linalg.yield %r : tensor<5xf64>
} -> tensor<5x?xf64>

@ftynse @albertcohen @nicolasvasilache

What you are trying to do is not what Linalg ops are intended for. Effectively, the payload (the region of the op) operates on scalars. Off and on we have talked about generalizing it to support vector types. A tensor type in the payload seems like a stretch to me. At the very least, the payload should operate on statically shaped objects. Tensors can have dynamic shapes, which probably falls outside of what Linalg is trying to model.
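For illustration, here is a minimal sketch of what a scalar payload looks like, written for recent upstream MLIR syntax (an assumption; older versions spell some ops differently). The function name @elementwise_add and the alias #identity_map are hypothetical. The iteration space covers both tensor dimensions, and each block argument is a single f64 element rather than a slice:

```mlir
// Hypothetical example: element-wise addition with a scalar payload.
#identity_map = affine_map<(i, j) -> (i, j)>

func.func @elementwise_add(%A: tensor<5x?xf64>, %B: tensor<5x?xf64>,
                           %C: tensor<5x?xf64>) -> tensor<5x?xf64> {
  %0 = linalg.generic {
         indexing_maps = [#identity_map, #identity_map, #identity_map],
         iterator_types = ["parallel", "parallel"]
       }
       ins(%A, %B : tensor<5x?xf64>, tensor<5x?xf64>)
       outs(%C : tensor<5x?xf64>) {
    // The region sees one scalar per operand for each (i, j) point.
    ^bb0(%a: f64, %b: f64, %c: f64):
      %sum = arith.addf %a, %b : f64
      linalg.yield %sum : f64
  } -> tensor<5x?xf64>
  return %0 : tensor<5x?xf64>
}
```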

Maybe a few more details about the overall goal would help us suggest a better way forward?

Thank you @MaheshRavishankar for your reply.

> Off and on we have talked about generalizing it to support vector types. A tensor type in the payload seems like a stretch to me.

Yes, it is not necessary to have a tensor type in the payload; it can be any statically shaped object.

> Maybe a few more details about the overall goal would help us suggest a better way forward?

Yes. To give more background about my application: the %A and %B tensors represent multiple fields over an n-dimensional space (in the given example we have 5 fields in 1D space). I would like to iterate only over the cells (the space dimension) and then perform some complex operations on the field values at each cell.

There are probably two solutions for that, but neither is convenient in my case. The first is to change the data structure to

tensor<?xvector<5xf64>>

But for performance reasons, I cannot use this one.

The other one is to extract each field using ExtractSliceOp. This should work, but in my real application I have 4 input tensors and 4 output tensors, and the number of fields can be 5 or even more. So with field extraction I will have at least 2*20 tensors to handle. It will work, but it is “ugly”. That is why I am looking for a more elegant solution :smiley:

You could add a loop and do the extract slices within the loop to iterate over the different fields (IIUC). Essentially, think of a Linalg op as a perfectly nested loop with a scalar operation in the innermost body (as you would write it in standard C code). What you are looking for seems to be something like imperfectly nested loop code. The way to achieve that is to use scf.for and have Linalg ops in the body.
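One possible reading of this suggestion, sketched for recent upstream MLIR syntax (an assumption; older versions spell some ops differently). This version loops over the cells, as in the original question, rather than over the fields. An scf.for walks the dynamic dimension, a rank-reducing tensor.extract_slice pulls out the 5 field values of the current cell, and tensor.insert_slice writes the result back. The name @apply_per_cell is hypothetical, and @some_func is the placeholder from the question; its body is where Linalg ops or any other per-cell computation would go.

```mlir
// Placeholder from the question; its body would hold the per-cell computation.
func.func private @some_func(tensor<5xf64>, tensor<5xf64>) -> tensor<5xf64>

// Hypothetical driver: iterate over cells, operate on all 5 fields at once.
func.func @apply_per_cell(%A: tensor<5x?xf64>, %B: tensor<5x?xf64>,
                          %C: tensor<5x?xf64>) -> tensor<5x?xf64> {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  // Number of cells (the dynamic dimension).
  %n = tensor.dim %A, %c1 : tensor<5x?xf64>
  %res = scf.for %j = %c0 to %n step %c1
      iter_args(%acc = %C) -> (tensor<5x?xf64>) {
    // Rank-reducing slices: column %j of each input as a 1-D tensor of 5 fields.
    %a = tensor.extract_slice %A[0, %j] [5, 1] [1, 1]
        : tensor<5x?xf64> to tensor<5xf64>
    %b = tensor.extract_slice %B[0, %j] [5, 1] [1, 1]
        : tensor<5x?xf64> to tensor<5xf64>
    %r = func.call @some_func(%a, %b)
        : (tensor<5xf64>, tensor<5xf64>) -> tensor<5xf64>
    // Write the per-cell result back into column %j of the accumulator.
    %next = tensor.insert_slice %r into %acc[0, %j] [5, 1] [1, 1]
        : tensor<5xf64> into tensor<5x?xf64>
    scf.yield %next : tensor<5x?xf64>
  }
  return %res : tensor<5x?xf64>
}
```

With 4 inputs and 4 outputs, the loop body would hold 4 extract_slice ops and 4 insert_slice ops (with 4 iter_args), independent of the number of fields, instead of one extracted tensor per field.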