Remove tight coupling of the BufferDeallocation pass to std and linalg operations

Sounds good to me. However, would this also imply that we should revive the discussion about std.copy at this point?

On the one hand, I share your point of view in general. On the other hand, according to my understanding, MLIR should always try to preserve “high-level knowledge” within the IR instead of creating nested patterns that might represent the same meaning but are more difficult to reason about. Taking this into account makes the single-copy-op approach more appealing :smile: What do you think?

Hmm… implementing MLIR interfaces is always a good thing. However, I guess that dialect-specific operations are almost always “slightly magical”, as they express specific dialect/domain knowledge. The question is whether we can live with this small amount of magic that allows us to reason about bufferization in a more meaningful way :nerd_face:

Every operation has its own semantics of course, but all of them fit into a more general, abstract model: an operation can write, read, subview, allocate, or free a memref. But IIRC the operations here want to impose more constraints on the memref that can’t be described in that abstract model and that no other pass/analysis can reason about.
We had a similar discussion recently on the linalg.padded_view revision where I had related concerns as well: ⚙ D93704 [mlir][Linalg] Introduce linalg.pad_tensor op.

I definitely share your concerns about opaque ops that we can’t even reason about in any way in the scope of the MLIR core passes. The right way might be to simply split the operations off from the standard dialect and try to avoid further potential issues regarding (potentially different) copy operations in the first place. Since there is currently a lot of activity regarding splitting/bridging of dialects, it might be a good idea to come back to this discussion after we have decided how to proceed with the memref dialect :slight_smile:

Having a copy with implicit allocation is beneficial when optimizing buffer allocation, as one does not need to track the corresponding alloc for the copy and furthermore has the guarantee that the resulting buffer is otherwise unused. This special variant of copy could have the additional constraint that it is illegal to mutate its result, which is a general assumption in the current bufferization. Bufferization assumes that buffers behave like values to some degree (they do not alias and are not mutated once computed).

For code generation, one would definitely want to lower this to alloc + some implementation of copy. In the case of linalg, likely a linalg.copy.
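A minimal sketch of what this could look like, using a placeholder op name in generic syntax (not an agreed-upon design) and one possible lowering to an explicit alloc plus linalg.copy:

```mlir
// Copy with implicit allocation (placeholder name, generic syntax):
// the result is a fresh buffer that must not be mutated afterwards.
%copy = "bufferize.copy"(%src) : (memref<?xf32>) -> memref<?xf32>

// Possible lowering for code generation: explicit alloc + linalg.copy.
%c0 = constant 0 : index
%d0 = dim %src, %c0 : memref<?xf32>
%dst = alloc(%d0) : memref<?xf32>
linalg.copy(%src, %dst) : memref<?xf32>, memref<?xf32>
```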

Replying to myself here to pick this up again.

So, where do we want to take this? With the recent discussion about casts in dialect conversion, we can either have bufferize use the standard cast operation or define its own. In any case, it should be a specialized operation just for the purpose of dialect conversion and not a general-purpose one (hence its own dialect to make this clear; we could also use memref.bufferize_cast if that sounds better).

I would vote for having a specialized cast, as I would want a range of canonicalization patterns that are useful in partial bufferization to move scalars across the bufferization boundary. Specifying those on the generic cast based on the types involved seems wrong to me.

And, for the reasons I stated above, I’d also like a copy operation with implicit alloc similar to what tensor_to_memref does today. Again, memref.bufferize_copy works for me, too.

Can you give examples of this?

(I’m +1 on everything you said though)

I’d call it memref.clone: it seems like its semantics don’t have to be tied to the bufferization process.

+1 here :slight_smile:

Works for me. That would give us memref.bufferize_cast and memref.clone. If that is what we can agree on, maybe @dfki-mako could get these added and integrated?

As that will replace the tensor_to_memref operation, should we remove that one again?

I think @_sean_silva discovered one of these patterns already, which is memref.load(memref.bufferize_cast). Other interesting cases are dim, rank, shape.shape_of. All of these get created by bufferize patterns on the bufferized form but can be forwarded to the tensor directly.
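For illustration, a rough sketch of the load case (using the proposed memref.bufferize_cast name in generic syntax; reading “forwarded to the tensor directly” as folding the load into a tensor.extract):

```mlir
// Before: the load goes through the materialization cast ...
%m = "memref.bufferize_cast"(%t) : (tensor<?xf32>) -> memref<?xf32>
%v = memref.load %m[%i] : memref<?xf32>

// ... after forwarding, it reads from the tensor directly:
%v = tensor.extract %t[%i] : tensor<?xf32>
```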