Hello everyone.
The Linalg dialect supports elementwise fusion on tensor semantics, and even mixed tensor/memref semantics, as demonstrated in the following example (adapted from mlir/test/Dialect/Linalg/fusion-elementwise-ops.mlir):
// CHECK-DAG: [[$MAP0:#[a-zA-Z0-9_]*]] = affine_map<(d0, d1) -> (d0, d1)>
#map0 = affine_map<(d0, d1) -> (d0, d1)>
// CHECK-LABEL: @mixed_fusion
func.func @mixed_fusion(%arg0: tensor<?x?xf32>, %arg1 : tensor<?x?xf32>, %arg2 : tensor<?x?xf32>, %arg8 : memref<?x?xf32>)
{
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %0 = tensor.dim %arg0, %c0 : tensor<?x?xf32>
  %1 = tensor.dim %arg0, %c1 : tensor<?x?xf32>
  %2 = tensor.empty(%0, %1) : tensor<?x?xf32>
  %3 = linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]}
      ins(%arg0, %arg1 : tensor<?x?xf32>, tensor<?x?xf32>)
      outs(%2 : tensor<?x?xf32>) {
    ^bb0(%arg3: f32, %arg4: f32, %arg5: f32):
      %4 = arith.addf %arg3, %arg4 : f32
      linalg.yield %4 : f32
  } -> tensor<?x?xf32>
  // CHECK: linalg.generic {
  // CHECK-SAME: indexing_maps = {{\[}}[[$MAP0]], [[$MAP0]], [[$MAP0]], [[$MAP0]]{{\]}}
  linalg.generic {indexing_maps = [#map0, #map0, #map0], iterator_types = ["parallel", "parallel"]}
      ins(%3, %arg2 : tensor<?x?xf32>, tensor<?x?xf32>)
      outs(%arg8 : memref<?x?xf32>) {
    // CHECK: ^{{[a-zA-Z0-9_]*}}
    // CHECK-SAME: [[ARG0:%[a-zA-Z0-9_]*]]
    // CHECK-SAME: [[ARG1:%[a-zA-Z0-9_]*]]
    // CHECK-SAME: [[ARG2:%[a-zA-Z0-9_]*]]
    ^bb0(%arg5: f32, %arg6: f32, %arg7: f32):
      // CHECK: [[T1:%[a-zA-Z0-9_]*]] = arith.addf [[ARG0]], [[ARG1]]
      // CHECK-NOT: linalg.yield
      // CHECK: arith.mulf [[T1]], [[ARG2]]
      // CHECK: linalg.yield
      %5 = arith.mulf %arg5, %arg6 : f32
      linalg.yield %5 : f32
  }
  return
}
Does Linalg support fusion for memref-only semantics? I saw a post mentioning that fusion for memref semantics was removed. I am implementing an MLIR-based DSL that needs a container to store elements of custom types; since tensor does not support custom element types, I use memref instead. The DSL's IR is lowered to Linalg to take advantage of optimizations such as fusion and vectorization, but memref-based fusion does not seem to work. What is the best practice in this scenario?
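For concreteness, this is the kind of memref-only pattern I would like to see fused into a single linalg.generic. It is a minimal sketch using f32 for simplicity (my real code uses a custom element type); the function and value names are made up for illustration:

```mlir
#map = affine_map<(d0, d1) -> (d0, d1)>
// Two elementwise generics on memrefs: the first writes a temporary
// buffer %tmp, which the second immediately reads. Ideally the addf
// producer would fuse into the mulf consumer, eliminating %tmp.
func.func @memref_fusion(%a: memref<?x?xf32>, %b: memref<?x?xf32>,
                         %c: memref<?x?xf32>, %tmp: memref<?x?xf32>,
                         %out: memref<?x?xf32>) {
  linalg.generic {indexing_maps = [#map, #map, #map],
                  iterator_types = ["parallel", "parallel"]}
      ins(%a, %b : memref<?x?xf32>, memref<?x?xf32>)
      outs(%tmp : memref<?x?xf32>) {
    ^bb0(%x: f32, %y: f32, %o: f32):
      %0 = arith.addf %x, %y : f32
      linalg.yield %0 : f32
  }
  linalg.generic {indexing_maps = [#map, #map, #map],
                  iterator_types = ["parallel", "parallel"]}
      ins(%tmp, %c : memref<?x?xf32>, memref<?x?xf32>)
      outs(%out : memref<?x?xf32>) {
    ^bb0(%x: f32, %y: f32, %o: f32):
      %0 = arith.mulf %x, %y : f32
      linalg.yield %0 : f32
  }
  return
}
```

Running -linalg-fuse-elementwise-ops on this leaves both generics untouched, which matches my reading that the pass only handles tensor semantics.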