[mlir][linalg] `EraseIdentityGenericOp` canonicalization pattern

Hey, one of the `linalg::GenericOp` canonicalization patterns is `EraseIdentityGenericOp`, which removes a `linalg.generic` op if it meets two conditions:

  1. all of its iterator types are parallel.
  2. its body contains only a single `linalg.yield`.

When running the `linalg.generic` canonicalization patterns on an identity `linalg.generic` whose output operand is the result of `bufferization.alloc_tensor()`, the pattern still erases my generic op, even though the `alloc_tensor()` op indicates a newly bufferized/materialized tensor.

Here is an example of my code:

```mlir
func.func @slice_kernel(%arg0: tensor<4x4x4x4xf32>) -> (tensor<1x4x4x4xf32>) {
  %extracted_slice = tensor.extract_slice %arg0[0, 0, 0, 0] [1, 4, 4, 4] [1, 1, 1, 1] : tensor<4x4x4x4xf32> to tensor<1x4x4x4xf32>
  %0 = bufferization.alloc_tensor() : tensor<1x4x4x4xf32>
  %1 = linalg.generic {indexing_maps = [affine_map<(d0,d1,d2,d3) -> (d0,d1,d2,d3)>, affine_map<(d0,d1,d2,d3) -> (d0,d1,d2,d3)>], iterator_types = ["parallel","parallel","parallel","parallel"]} ins(%extracted_slice : tensor<1x4x4x4xf32>) outs(%0 : tensor<1x4x4x4xf32>) {
  ^bb0(%in: f32, %out: f32):
    linalg.yield %in : f32
  } -> tensor<1x4x4x4xf32>
  return %1 : tensor<1x4x4x4xf32>
}
```

I expected this `linalg.generic` not to be removed, since it's not a default/redundant copy. Shouldn't the RewritePattern check for this case and fail to match here?

Thanks,

Maybe your program is a simplified representation of your real problem, but as stated, it does make sense to me that %1 is replaced with %0. They are essentially the same value, because the second operation is just doing a copy, and removing the unnecessary copy makes sense to me.
(Also as written your repro should fail verification since the tensor type of %1 is not consistent, but that might just be a typo).

Hey,
Thanks for your reply, Mahesh. It was indeed a typo (I edited my MLIR code manually to make it simpler; I've fixed it now).
But shouldn't removing this copy change the program's behavior, since the user intended to allocate a new buffer for this tensor?

The concept of « buffer for a tensor » isn't really well defined: tensors are immutable, value-based entities.
It's not clear what observable behavior there is to preserve here.

I don’t know what you mean by “default” copy, but this is value semantics, and that’s an identity operation. This is the simplest form of DCE and perfectly valid.

The confusion is probably coming from the fact that bufferization.alloc_tensor() is an op specific to the bufferization process and has no meaning outside of it. For example, the doc says:

"The result of a bufferization.alloc_tensor is a tensor value that can be used like any other tensor value."

This means if you run cleanups on top of that code, DCE is free to remove identity operations because you are not in buffer semantics (memref) yet.
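Concretely, running `mlir-opt --canonicalize` on the example above folds the identity `linalg.generic` away and forwards its input, so the output looks something like this (a sketch; exact SSA names may differ):

```mlir
func.func @slice_kernel(%arg0: tensor<4x4x4x4xf32>) -> (tensor<1x4x4x4xf32>) {
  // The alloc_tensor and the identity generic are erased as dead code;
  // the slice is returned directly.
  %extracted_slice = tensor.extract_slice %arg0[0, 0, 0, 0] [1, 4, 4, 4] [1, 1, 1, 1] : tensor<4x4x4x4xf32> to tensor<1x4x4x4xf32>
  return %extracted_slice : tensor<1x4x4x4xf32>
}
```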

If you are using that op as a way to “allocate a tensor”, don’t. There is no such thing as “allocating a tensor”, as @mehdi_amini said. Just use tensor.empty and bufferization will (hopefully) know what to do.
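For illustration, a destination-style sketch of the same kernel using `tensor.empty` (note: `tensor.empty` only carries the shape and element type of the destination; it does not promise a distinct allocation, and the identity copy can still be folded away for the same reason as above):

```mlir
func.func @slice_kernel(%arg0: tensor<4x4x4x4xf32>) -> (tensor<1x4x4x4xf32>) {
  %extracted_slice = tensor.extract_slice %arg0[0, 0, 0, 0] [1, 4, 4, 4] [1, 1, 1, 1] : tensor<4x4x4x4xf32> to tensor<1x4x4x4xf32>
  // tensor.empty describes the destination shape; bufferization decides the allocation.
  %empty = tensor.empty() : tensor<1x4x4x4xf32>
  %1 = linalg.copy ins(%extracted_slice : tensor<1x4x4x4xf32>) outs(%empty : tensor<1x4x4x4xf32>) -> tensor<1x4x4x4xf32>
  return %1 : tensor<1x4x4x4xf32>
}
```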


Side note: I feel like we’ve had a few rounds of confusion here recently and it may be good to formalize the answer somewhere (or if it is, make it more discoverable).


Thanks for your responses.

@mehdi_amini @rengolin @MaheshRavishankar @jpienaar Let me try to explain my purpose:

I want to write a kernel/function in MLIR that takes one tensor argument and returns a part/slice of that argument; the kernel is in tensor semantics.
However, I want to guarantee that after running one-shot bufferization I get different memrefs/allocations for the argument and the return value of the kernel.

For this purpose, I've tried inserting operations like tensor.empty/bufferization.alloc_tensor() and copying the slice of the argument into the new SSA value produced by tensor.empty/bufferization.alloc_tensor(). Unfortunately, the copy is being removed as part of redundant-code elimination.

I wanted to handle this issue on tensors, since the passes that tile or fuse the linalg generic run on tensors, and handling it on memrefs would come too late for those passes.

thanks,

I don’t think you can guarantee this, even if you use things like tensor.empty() or memref.alloc(). Ultimately, the compiler is free to optimize memory usage as it sees fit and that’s a good thing.

Maybe you're thinking of this as a user, not as a compiler, and need to adjust your expectations accordingly. Or maybe your operations aren't expressive enough and you're misleading the compiler.

If the buffer has different user chains, then the compiler cannot fuse them together and your expectation holds. But if the buffer has only a single chain of users (and the operation allows in-place semantics), then the compiler is free to remove the allocation. This is similar to register reuse in allocators: `add r0, r0, r1` is perfectly valid if the value previously stored in r0 is dead after the add op.

If your code has some extra semantics that isn't being propagated through the compiler (e.g. volatile), then you need to add side effects to your ops to make sure the compiler can't assume anything about the memory representation, and thus won't try to eliminate buffers.


Thanks, appreciate your detailed answer.