Plans to extend linalg optimization passes to memref operands

I’m playing around with the linalg passes (fusion in particular). Right now, I have a sequence of two linalg.generic operations:

#map = affine_map<(d0, d1) -> (d0, d1)>
func.func @body1(%arg0: tensor<100x100xf64>, %arg1: tensor<100x100xf64>, %arg2: tensor<100x100xf64>, %arg3: tensor<100x100xf64>, %arg4: tensor<100x100xf64>) -> tensor<100x100xf64> attributes {llvm.emit_c_interface} {
  %1 = linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0, %arg1 : tensor<100x100xf64>, tensor<100x100xf64>) outs(%arg2 : tensor<100x100xf64>) {
  ^bb0(%in: f64, %in_0: f64, %out: f64):
    %0 = arith.addf %in, %in_0 : f64
    linalg.yield %0 : f64
  } -> (tensor<100x100xf64>)
  %2 = linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel", "parallel"]} ins(%1, %arg3 : tensor<100x100xf64>, tensor<100x100xf64>) outs(%arg4 : tensor<100x100xf64>) {
  ^bb0(%in: f64, %in_0: f64, %out: f64):
    %0 = arith.mulf %in, %in_0 : f64
    linalg.yield %0 : f64
  } -> (tensor<100x100xf64>)
  return %2 : tensor<100x100xf64>
}

The generic operations have a producer-consumer relationship, and the linalg fusion passes are able to fuse them. However, I’m planning on using this dialect in a setting where tensor value semantics are not applicable. The function I want to write looks something like:

func.func @body2(%arg0: memref<100x100xf64>, %arg1: memref<100x100xf64>, %arg2: memref<100x100xf64>, %arg3: memref<100x100xf64>) attributes {llvm.emit_c_interface} {
  linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0, %arg1 : memref<100x100xf64>, memref<100x100xf64>) outs(%arg2 : memref<100x100xf64>) {
  ^bb0(%in: f64, %in_0: f64, %out: f64):
    %0 = arith.addf %in, %in_0 : f64
    linalg.yield %0 : f64
  }
  linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg2, %arg3 : memref<100x100xf64>, memref<100x100xf64>) outs(%arg2: memref<100x100xf64>) {
  ^bb0(%in: f64, %in_0: f64, %out: f64):
    %0 = arith.mulf %in, %in_0 : f64
    linalg.yield %0 : f64
  }
  return
}

where there is still a producer-consumer relationship between the generic operations, but I do need the operations to accept memrefs and eventually write into the output %arg2 (imagine that these allocations are performed outside of my control).

So, are there plans to extend the linalg passes to do this kind of analysis on “bufferized” arguments, rather than tensors? Thanks

Fusion on operations with memref semantics is really involved, since it is hard to track dependencies without explicit SSA use-def chains. One way here might be to run fusion on tensors and then use One-Shot Bufferize to convert the linalg operations on tensors into linalg operations on memrefs.
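Concretely, on the tensor version above, the elementwise fusion pass (`-linalg-fuse-elementwise-ops` in `mlir-opt`; exact pass names can vary between MLIR versions) should produce a single fused generic roughly like the following sketch:

```mlir
%fused = linalg.generic {indexing_maps = [#map, #map, #map, #map],
                         iterator_types = ["parallel", "parallel"]}
    ins(%arg0, %arg1, %arg3 : tensor<100x100xf64>, tensor<100x100xf64>, tensor<100x100xf64>)
    outs(%arg4 : tensor<100x100xf64>) {
^bb0(%in: f64, %in_0: f64, %in_1: f64, %out: f64):
  // add and multiply now happen in one loop nest
  %0 = arith.addf %in, %in_0 : f64
  %1 = arith.mulf %0, %in_1 : f64
  linalg.yield %1 : f64
} -> tensor<100x100xf64>
```

Running `-one-shot-bufferize` afterwards then lowers this single generic to one operating on memrefs, which is close to the @body2 shape you want, without the fusion pass ever having to reason about buffers.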

This is subtle, so I’m sure I’m not getting the full details here. It seems like use-def chains are visible from the linalg generic ops even on memrefs, via the ins and outs arguments to the operation, right?

That’s not a use-def chain. Those are all just uses. Compare the tensor version and the memref version: in the tensor version, one operation returns a result (i.e. a def) that the next operation reads (i.e. a use). That relationship is explicit in the IR. For memrefs, you instead need to walk all uses of the buffer and account for every possible use and side effect when doing the fusion.


And even so: there is also the question of aliasing (for example, two different SSA values can be memref subviews of the same allocation).
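As a minimal illustration of the aliasing point (hypothetical values, and the exact strided-layout syntax may differ slightly between MLIR versions): %a and %b below are distinct SSA values, yet their last 25 rows overlap in memory, so a memref-level fusion pass cannot treat them as independent buffers:

```mlir
%buf = memref.alloc() : memref<100x100xf64>
// Two views of the same allocation, overlapping in rows 25..49:
%a = memref.subview %buf[0, 0] [50, 100] [1, 1]
       : memref<100x100xf64> to memref<50x100xf64, strided<[100, 1]>>
%b = memref.subview %buf[25, 0] [50, 100] [1, 1]
       : memref<100x100xf64> to memref<50x100xf64, strided<[100, 1], offset: 2500>>
```

Any transformation that reorders or merges loops reading %a and writing %b (or vice versa) would need alias analysis to prove the accesses don’t conflict, which is exactly the information SSA use-def chains give you for free on tensors.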


Yes, these concerns make sense. It does seem to me that if I had aliasing information (i.e. restrict) on each of the input memrefs, it wouldn’t be that much harder than the tensor case. I’ll look into developing the transformation for my own use case.