Double Buffering with the Transform Dialect

Hello,

I am looking to perform double buffering on a selected subset of allocations using the functionality added by Loop double-buffering/multi-buffering together with the Transform dialect, and was hoping for some guidance on the right way to go about this.

At a high level, I want to inject some directives about what to double buffer while operating on linalg ops on tensors, and then actually instantiate the double buffering some number of passes later, after bufferization has occurred and the IR has moved to a combination of scf loops and memref.copy operations.
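To make the intended end state concrete, here is a minimal Python sketch of what double buffering does to a loop that copies a tile and then computes on it. It is only a model of the idea, not MLIR and not the upstream implementation: `run_single_buffered`, `run_double_buffered`, and `factor` are hypothetical names, the list copy stands in for memref.copy, and `sum` stands in for the compute.

```python
def run_single_buffered(tiles):
    """Baseline: one staging buffer, so copy and compute strictly alternate."""
    out = []
    for tile in tiles:
        buf = list(tile)      # stands in for memref.copy into the allocation
        out.append(sum(buf))  # stands in for the compute that reads the buffer
    return out


def run_double_buffered(tiles, factor=2):
    """Model of multi-buffering: the allocation is expanded to `factor` slots
    and iteration i reads slot i % factor, so the copy for iteration i + 1
    targets a different slot than the one iteration i is reading."""
    if factor < 2:
        raise ValueError("factor must be at least 2 to separate copy and compute")
    out = []
    if not tiles:
        return out
    slots = [None] * factor
    slots[0] = list(tiles[0])  # prologue: fetch the first tile
    for i in range(len(tiles)):
        if i + 1 < len(tiles):
            # Prefetch the next tile into the other slot.
            slots[(i + 1) % factor] = list(tiles[i + 1])
        out.append(sum(slots[i % factor]))
    return out


tiles = [[1, 2], [3, 4], [5, 6]]
print(run_double_buffered(tiles) == run_single_buffered(tiles))
```

Both versions compute the same results; the point of the second is that the copy and the compute no longer touch the same slot, which is what lets a later pipelining step overlap them.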

The Transform dialect seems ideal for this since it doesn’t depend on discardable attributes carrying the info, but I wanted to either get sign-off, as suggested by the Transform dialect disclaimer, or find out if there’s a different way to go about this.

I’m aware of the existing PipelineDataTransfer.cpp, but my scenario doesn’t use affine.dma_start/affine.dma_wait, and that pass offers no way to selectively control which loops/allocations become double buffered.

Please see the usage of the transform op in transform-ops.mlir (llvm/llvm-project, on Sourcegraph).

It may require more customization for your needs.

Great, thank you!

Yes, this will probably require a few changes to the interface, but this is the right direction in my opinion.

IREE is currently using multi-buffering along with scf loop pipelining. We are planning to transition to the transform dialect in the mid-term future, so those changes would be very useful to us and we will contribute to them at some point.

One specific improvement that will be necessary is to make sure transform dialect handles aren’t invalidated by the bufferization. This likely requires some mechanism in the bufferization driver to keep track of which tensor-level operation is bufferized to which buffer-level operation. For Linalg ops, this is straightforward as they bufferize to themselves so it sounds feasible in a small scope.
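As a rough illustration of the kind of bookkeeping meant here, the Python sketch below keeps a map from handles to payload ops and updates it whenever the bufferization driver reports that one op replaced another. Everything in it is hypothetical (`HandleTracker`, `bind`, `notify_replaced`, and the string op names are made up for the example); it is not the transform dialect's or the bufferization driver's actual API.

```python
class HandleTracker:
    """Hypothetical model: maps each handle name to the payload ops it points to."""

    def __init__(self):
        self._payload = {}

    def bind(self, handle, ops):
        """Associate a handle with a list of payload ops."""
        self._payload[handle] = list(ops)

    def notify_replaced(self, old_op, new_op):
        """Called by the (hypothetical) driver when old_op is bufferized to new_op.
        Every handle that pointed to old_op is redirected to new_op."""
        for handle, ops in self._payload.items():
            self._payload[handle] = [new_op if op == old_op else op for op in ops]

    def lookup(self, handle):
        """Return the payload ops a handle currently points to."""
        return list(self._payload[handle])


tracker = HandleTracker()
tracker.bind("%matmul", ["linalg.matmul (tensor)"])
# Bufferization rewrites the tensor-level op into its buffer-level form.
tracker.notify_replaced("linalg.matmul (tensor)", "linalg.matmul (memref)")
print(tracker.lookup("%matmul"))
```

For Linalg ops the replacement is one-to-one, which is why the small-scope version seems tractable; ops that bufferize to several ops, or disappear, would need a richer notification than this sketch models.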

I suppose I am the closest to what we have as some sort of authority on the transform dialect, so here’s my sign-off.

Would you suggest adding double buffering as an option to the existing transform.loop.pipeline op? I’m wondering whether double buffering would work better as a separate operation from pipelining, for flexibility and better control over what is double buffered.

That’s exactly what I’d hoped for, thank you!