[RFC] `MPI` Dialect

Hi, and thank you all for expressing your interest in this dialect. I will now briefly summarise this discussion and propose an agenda to move forward:

This discussion made the need for message-passing abstractions in MLIR very clear, and our proposal of an MPI dialect at the interface level was well received. Our implementation of blocking communication received valuable technical feedback indicating that its design matches the expectations of the community. People also pointed out the importance of non-blocking communication, where preserving the def-use chain of request objects should be explored carefully. @fschlimb brought up potential optimizations at the MPI IR level, while @tschuett, @rengolin and others suggested that an additional higher-level dialect could enable further optimizations. @sogartar kindly pointed out the potential complexity of lowering MPI to multiple ABI-incompatible implementations. As a solution, @wence from Nvidia proposed that we target the common MPI 5.0 ABI, since compatibility shims exist for other library formats. Multiple people also expressed the wish for complementary higher-level message-passing dialects that are more geared towards optimizable operations (ccl, halo operations). Overall, many people in this thread and in numerous 1:1 discussions shared diverse use cases that an MPI dialect would support.

I suggest the following strategy to move ahead. The design of the core MPI dialect and the central blocking communication primitives, as proposed in our first PR, is ready for code-level discussion. @fschlimb has already started the PR review and suggested adding return values right from the start; I have updated the PR accordingly. I invite everyone to check whether other technical details need to be addressed or whether this PR is ready to go. I will explore the benefits of modelling the request objects of non-blocking communication directly in the def-use chain, and I will update you on the progress here. To keep the code review moving in the meantime, I propose as a concrete second step of our upstreaming process an initial lowering to the MPI 5.0 unified ABI, which will allow us to target other MPI ABIs through a thin shim library that already exists. Finally, the idea of higher-level dialects is very promising, and their implementation will be easier once the MPI dialect is established.

Let’s get the technical foundations right! Hence, please review our first PR if you have not yet done so. I am currently working on the lowering PR and will post it here as soon as possible.
