[RFC] Updated MLIR Dialect Overview Diagram

Hi, I’m looking to update the figure from this post showing the relationship and flow of the different dialects, for an upcoming presentation and for general reference. The previous figure doesn’t reflect the split of the standard dialect, nor the new dialects that have been added since then, so I figure an update is warranted. Dialects that are either new, the result of a split, or omitted from the previous diagram are marked in green.

[Figure: WIP MLIR Dialect Overview]

I’m fairly new to MLIR, so I’m looking for help/feedback on refining the placement and flow of the newer dialects (and of older ones whose functionality has changed). I found it difficult to place them based on documentation alone, so feedback from the community would be greatly appreciated. Additionally, if there are any frontends or targets that should be added, that would be great to know as well. I’m not expecting to place all of the current dialects here (e.g. the work-in-progress ones), but I figured I would include them for now.

Thanks for any help!

Credit to @ftynse for the original post.


Arith is not directly under math/complex; it is also generated when lowering from TOSA/MHLO/… to linalg, I believe (the body of a linalg generic will often use these ops, so arith is used in conjunction with linalg).

AMDGPU, Arm-Neon, and Arm-SVE should be at the same level as AVX and NVVM at the bottom, I think.

Thanks for looking into a refresh. As a warning, while I did use this picture in presentations, I only flashed it for a couple of seconds to say MLIR is complex and we can’t go into all details.

Overall, I think the quadrant division from that picture may no longer make sense, hence the difficulty of placing some dialects with respect to those axes. A different, potentially more complex classification is likely necessary. I haven’t given this question sufficient thought to propose specific criteria. However, if you want it to be systematic, consider the available conversions.
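If you do want to derive the edges systematically, a rough sketch along these lines could work. It assumes an llvm-project checkout and relies on the `<Source>To<Target>` naming convention of the subdirectories under mlir/lib/Conversion; directories that don’t follow that pattern are simply skipped, so the result is only an approximation of the actual conversion coverage.

```python
#!/usr/bin/env python3
"""Rough sketch: derive dialect -> dialect edges from the <Source>To<Target>
naming convention of the subdirectories under mlir/lib/Conversion in an
llvm-project checkout. Directories that don't match (multi-dialect or
utility ones) are skipped, so this only approximates the real set of
conversions."""
import re
import sys
from pathlib import Path


def conversion_edges(llvm_root: str):
    conv_dir = Path(llvm_root) / "mlir" / "lib" / "Conversion"
    edges = set()
    for entry in sorted(conv_dir.iterdir()):
        if not entry.is_dir():
            continue
        # Most conversion directories look like "MemRefToSPIRV", "GPUToNVVM", ...
        m = re.fullmatch(r"([A-Za-z0-9]+?)To([A-Z][A-Za-z0-9]*)", entry.name)
        if m:
            edges.add((m.group(1), m.group(2)))
    return edges


if __name__ == "__main__":
    for src, dst in sorted(conversion_edges(sys.argv[1])):
        print(f"{src} -> {dst}")
```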

I also feel really strongly against branding some dialects as “work in progress” without any further categorization. It is a requirement for creating a new dialect to establish the relation between it and the existing ecosystem, so these relations are not really expected to change. Also, all of MLIR is work in progress to some extent.

Specifically for placement, if you want to stick to this structure, there is a hidden distance-from-hardware-instructions metric along the vertical axis, with the lowest dialects being closest and the highest being furthest away. Therefore:

  • complex and math are not above tensor;
  • arith is at a slightly lower level than “math”;
  • the bufferization dialect is more of a utility; I wouldn’t say that tensor lowers to it or that it lowers to memref;
  • memref does not lower to amx;
  • arm-neon, arm-sve, and amx are “instruction-level” dialects, same as nvvm and x86vector;
  • nvgpu and amdgpu are at the same level, and I don’t think they lower to gpu.

Thanks for the comments. It seems that for nvgpu I misunderstood the wording in the documentation. A bridge to NVVM makes a lot more sense to me, and the placement of amdgpu makes sense as well.

I’m not beholden to this four-quadrant structure, so I’ll take your advice and consider the conversion passes before continuing. I’m not looking for an in-depth representation of the dialects, but rather a general indication of the intended purpose of each dialect along with the lowerings between them.

Also, the work-in-progress section was more meant to indicate that I didn’t know where to place those dialects; otherwise I was going to remove them from the figure altogether. I agree that simply marking them as WIP isn’t a good solution.

Thanks for refreshing this image! TOSA has direct pathways from TF, TFLite and Torch-MLIR. There’s a pathway from ONNX-MLIR in development.


And from MHLO to TOSA (but that’s very early 🙂).


Oh and torch-mlir to MHLO is making progress too.

A difficulty with a diagram like this is that it depends on levels and paths. E.g., one needs to decide what relationship one wants to show, or else it ends up with edges all over the place (I have one like that where folks kept wanting me to add runtimes, and it would have made for a royal mess).


Oh, TF to TFLite is the TFLite converter, which is the de facto path to TFLite (and one of the oldest MLIR usages). There is also a path from TFLite to TF. All of these high-level paths have some gaps: with systems of 1000+ ops that unfortunately happens, and it’s almost impossible to keep up with ops being added; supporting important models is a more tractable goal. So I’d draw them all with solid lines, except for something like MHLO to TOSA, which is really early (around 10 ops supported), versus TFLite to TOSA, which has 70% coverage and thousands of models working via that path.


Th…th…thousands?! We’re still counting using dozens as a unit. Glad to see that path in such solid use - it was the very first thing I implemented originally around TOSA!

In principle I like the idea of using different kinds of arrows to express the state of the legalizations on a path, but this diagram tends to be updated very infrequently, so there’s a substantial risk of the information going out of date quickly and of people making decisions based on it.

Yeah, I’m finding that the diagram is already starting to get cluttered, so I’m looking at restructuring it to give a clearer separation between external dialects and MLIR dialects.


I agree that indicating support with different arrows is risky, given that support for each of these paths is constantly evolving. I think it would be good to let this diagram focus more on the relations between MLIR dialects.

You may want to show Torch-MLIR → TOSA and Torch-MLIR → MHLO on there.


NVGPU and AMDGPU should be parallel to the GPU dialect, and there should be NVGPU → NVVM and AMDGPU → ROCDL; GPU goes to both NVVM and ROCDL (as it already does).

That’s what I’ve gathered from sifting through the conversion passes, but a confirmation is helpful.

Quite a few of the conversions to SPIR-V are missing corresponding arrows in the graph, e.g., from CF, from MemRef, and from Vector.


Are there any connections that aren’t represented by the conversion passes? There were a number of missing connections in my first stab, and I’ve been filling them in based on the dialects used in the conversion passes. Also, for SPIR-V there are lowerings coming from Linalg, Math, and Tensor as well; is there any reason not to represent all of these paths?

@qed, thank you for taking the effort to update this diagram. Maybe it should be checked into version control somewhere so that anyone can update it as things evolve.


That sounds like a good idea. I’m open to suggestions on the best way to do that.

My first instinct is that the llvm/mlir-www repo may be a good option for more visual material; that’s also where the diagrams for the website are. The downside is that reviews there don’t alert the same number of folks as Phabricator does, but adding the folks best suited to review a given section there would still work. (Graphviz is an old standard, easy to edit, and small to check in, but the results are only OK; my favorite drawing program is proprietary, so that would not be open; an open format like SVG could work, and there are enough free tools to modify it.)
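As a very rough sketch of what a checked-in text form could look like if one went the Graphviz route, a tiny script emitting DOT would do; the edge list below is only an illustrative subset of the lowerings mentioned in this thread, not the full diagram.

```python
#!/usr/bin/env python3
"""Minimal sketch of keeping the diagram as a checked-in text file: emit a
Graphviz DOT description of a few of the dialect lowerings discussed above.
Render with e.g. `dot -Tsvg dialects.dot -o dialects.svg`."""

# (source dialect, target dialect) edges; an illustrative subset only.
EDGES = [
    ("gpu", "nvvm"),
    ("gpu", "rocdl"),
    ("nvgpu", "nvvm"),
    ("amdgpu", "rocdl"),
    ("linalg", "spirv"),
    ("memref", "spirv"),
]


def to_dot(edges):
    lines = ["digraph mlir_dialects {", "  rankdir=TB;"]
    lines += [f'  "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    with open("dialects.dot", "w") as f:
        f.write(to_dot(EDGES))
```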

We use Excalidraw (github.com/excalidraw/excalidraw, a virtual whiteboard for sketching hand-drawn-style diagrams) for the torch-mlir and IREE diagrams. You can save/load the .excalidraw file and check it into version control, since it is a text file. It can export to SVG/PNG. You don’t need the “Pro” version for all this.
