Facing issues with bufferization cloneOp

rayaq · April 9, 2024, 7:28pm

Hello, I am running into some issues with the bufferization cloneOp, specifically for dealloc optimization.

The following is the initial module:

module {
  memref.global "private" constant @shape : memref<2xindex> = dense<[1, 2]>
  func.func @forward(%arg0: memref<2x2xf32>, %arg1: memref<2x2xf32>) -> memref<2xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %alloc = memref.alloc() : memref<2x2xf32>
    linalg.fill ins(%cst : f32) outs(%alloc : memref<2x2xf32>)
    linalg.matmul ins(%arg0, %arg1 : memref<2x2xf32>, memref<2x2xf32>) outs(%alloc : memref<2x2xf32>)
    %alloc_0 = memref.alloc() : memref<2xf32>
    linalg.fill ins(%cst : f32) outs(%alloc_0 : memref<2xf32>)
    linalg.reduce ins(%alloc : memref<2x2xf32>) outs(%alloc_0 : memref<2xf32>) dimensions = [0] 
      (%in: f32, %init: f32) {
        %1 = arith.addf %in, %init : f32
        linalg.yield %1 : f32
      }
    %0 = memref.get_global @shape : memref<2xindex>
    %reshape = memref.reshape %alloc_0(%0) : (memref<2xf32>, memref<2xindex>) -> memref<1x2xf32>
    %collapse_shape = memref.collapse_shape %reshape [[0, 1]] : memref<1x2xf32> into memref<2xf32>
    return %collapse_shape : memref<2xf32>
  }
}

I run the following passes, in this order, on the module above:

mlir::bufferization::createOwnershipBasedBufferDeallocationPass
mlir::createCanonicalizerPass
mlir::bufferization::createBufferDeallocationSimplificationPass
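
For anyone who wants to reproduce this from the command line, the same three passes can be run with mlir-opt. This is only a sketch: it assumes an mlir-opt build that registers these passes is on the PATH, and the file names input.mlir and output.mlir are placeholders rather than anything from my actual setup.

```shell
# Assumes the initial module above was saved as input.mlir (placeholder name).
# The flags correspond, in order, to the three pass constructors listed above.
mlir-opt input.mlir \
  --ownership-based-buffer-deallocation \
  --canonicalize \
  --buffer-deallocation-simplification \
  -o output.mlir
```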

This is the final module that these passes produce:

module {
  memref.global "private" constant @shape : memref<2xindex> = dense<[1, 2]>
  func.func @forward(%arg0: memref<2x2xf32>, %arg1: memref<2x2xf32>) -> memref<2xf32> {
    %true = arith.constant true
    %cst = arith.constant 0.000000e+00 : f32
    %alloc = memref.alloc() : memref<2x2xf32>
    linalg.fill ins(%cst : f32) outs(%alloc : memref<2x2xf32>)
    linalg.matmul ins(%arg0, %arg1 : memref<2x2xf32>, memref<2x2xf32>) outs(%alloc : memref<2x2xf32>)
    %alloc_0 = memref.alloc() : memref<2xf32>
    linalg.fill ins(%cst : f32) outs(%alloc_0 : memref<2xf32>)
    linalg.reduce ins(%alloc : memref<2x2xf32>) outs(%alloc_0 : memref<2xf32>) dimensions = [0] 
      (%in: f32, %init: f32) {
        %2 = arith.addf %in, %init : f32
        linalg.yield %2 : f32
      }
    %0 = memref.get_global @shape : memref<2xindex>
    %reshape = memref.reshape %alloc_0(%0) : (memref<2xf32>, memref<2xindex>) -> memref<1x2xf32>
    %collapse_shape = memref.collapse_shape %reshape [[0, 1]] : memref<1x2xf32> into memref<2xf32>
    %1 = bufferization.clone %collapse_shape : memref<2xf32> to memref<2xf32>
    bufferization.dealloc (%alloc : memref<2x2xf32>) if (%true)
    bufferization.dealloc (%alloc_0 : memref<2xf32>) if (%true)
    return %1 : memref<2xf32>
  }
}

The issue I am facing is with the bufferization.clone operation. After running these passes, I want to optimize the deallocs to reduce memory usage for larger IR (i.e., move each dealloc earlier in the program instead of leaving them all at the very end). According to the definition of bufferization.clone, a valid implementation may either copy the input or alias it. This ambiguity is causing problems for the dealloc optimization. Is there a way to tell when bufferization.clone performs a copy and when it creates an alias?
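
To make the goal concrete, here is a hand-written sketch of the placement I am aiming for. No pass produced this; it is just the final module above with one dealloc moved by hand. It relies on the observation that %alloc has no uses after linalg.reduce, while %alloc_0 is still reachable through %reshape and %collapse_shape until the clone.

```mlir
module {
  memref.global "private" constant @shape : memref<2xindex> = dense<[1, 2]>
  func.func @forward(%arg0: memref<2x2xf32>, %arg1: memref<2x2xf32>) -> memref<2xf32> {
    %true = arith.constant true
    %cst = arith.constant 0.000000e+00 : f32
    %alloc = memref.alloc() : memref<2x2xf32>
    linalg.fill ins(%cst : f32) outs(%alloc : memref<2x2xf32>)
    linalg.matmul ins(%arg0, %arg1 : memref<2x2xf32>, memref<2x2xf32>) outs(%alloc : memref<2x2xf32>)
    %alloc_0 = memref.alloc() : memref<2xf32>
    linalg.fill ins(%cst : f32) outs(%alloc_0 : memref<2xf32>)
    linalg.reduce ins(%alloc : memref<2x2xf32>) outs(%alloc_0 : memref<2xf32>) dimensions = [0]
      (%in: f32, %init: f32) {
        %2 = arith.addf %in, %init : f32
        linalg.yield %2 : f32
      }
    // Moved by hand: the reduction is the last use of %alloc.
    bufferization.dealloc (%alloc : memref<2x2xf32>) if (%true)
    %0 = memref.get_global @shape : memref<2xindex>
    %reshape = memref.reshape %alloc_0(%0) : (memref<2xf32>, memref<2xindex>) -> memref<1x2xf32>
    %collapse_shape = memref.collapse_shape %reshape [[0, 1]] : memref<1x2xf32> into memref<2xf32>
    %1 = bufferization.clone %collapse_shape : memref<2xf32> to memref<2xf32>
    // Left in place: the clone still reads %alloc_0 through its views.
    bufferization.dealloc (%alloc_0 : memref<2xf32>) if (%true)
    return %1 : memref<2xf32>
  }
}
```

The dealloc of %alloc is the easy case. The dealloc of %alloc_0 is where the copy-versus-alias question matters, since whether it is safe at all depends on what the clone does with its source.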

Thank you!
