Suppose we introduce a platform-independent, architecture-independent unified dialect, and a conversion dialect

Suppose we introduce a platform-independent, architecture-independent unified dialect, together with a conversion dialect used for guiding conversion. What benefits would this bring us?

Learner’s perspective
Since there would be a unified dialect to lower through, the relationships between dialects would become clear. Platform-dependent and/or architecture-dependent dialects such as X86Vector, AMX, ARM_SME, ARM_SVE, NVVM, and ROCDL would sit downstream of the unified dialect and be reached through conversion dialects, while platform-independent and/or architecture-independent dialects such as arith and scf would sit upstream of the unified dialect and become easier to design.

In addition, this would allow everything upstream of the unified dialect to be designed according to the following two rules, which let users use an upstream dialect naively, without caring about the downstream dialects, and let developers just as naively discover, create, and lower new abstractions (a small sketch using existing dialects follows the list):

  • The abilities of an upstream dialect are a subset of the abilities of its downstream dialect.
  • Code implemented in an upstream dialect and lowered to a downstream dialect is no less efficient than the same functionality implemented directly in the downstream dialect.
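
To make the second rule concrete with dialects that exist today: the loop below is written entirely in the platform-independent func/scf/arith/memref dialects (the function name and shapes are just for illustration). Lowered through the existing conversion passes into the llvm dialect, it should end up as the same branch-and-add structure one would write in the llvm dialect by hand, so staying upstream costs nothing.

```mlir
// A simple reduction written only in architecture-independent dialects.
func.func @sum(%n: index, %buf: memref<?xf32>) -> f32 {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %zero = arith.constant 0.0 : f32
  // Iterative sum over the buffer; this maps directly to the
  // branch-plus-accumulator loop the llvm dialect would express anyway.
  %sum = scf.for %i = %c0 to %n step %c1 iter_args(%acc = %zero) -> (f32) {
    %v = memref.load %buf[%i] : memref<?xf32>
    %acc2 = arith.addf %acc, %v : f32
    scf.yield %acc2 : f32
  }
  return %sum : f32
}
```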

Optimizer’s perspective

  • For dialects upstream of the unified dialect, MLIR can optimize code through higher-level abstractions (MLIR's original strength), for example by eliminating a transpose of a transpose of a matrix (see the sketch after this list).
  • For dialects downstream of the unified dialect, MLIR can optimize code under the guidance of the conversion dialect; for example, conversion to AVX would automatically analyze and merge scalar calculations into SIMD operations.
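
Here is the transpose-of-transpose case sketched with the existing linalg and tensor dialects (standing in for whatever upstream dialect would actually be used): the two permutations compose to the identity, so a rewrite at this level can remove both operations before any lowering happens.

```mlir
func.func @double_transpose(%arg0: tensor<4x8xf32>) -> tensor<4x8xf32> {
  %init0 = tensor.empty() : tensor<8x4xf32>
  %t0 = linalg.transpose ins(%arg0 : tensor<4x8xf32>)
                         outs(%init0 : tensor<8x4xf32>) permutation = [1, 0]
  %init1 = tensor.empty() : tensor<4x8xf32>
  %t1 = linalg.transpose ins(%t0 : tensor<8x4xf32>)
                         outs(%init1 : tensor<4x8xf32>) permutation = [1, 0]
  // The permutations cancel, so a high-level rewrite can simply
  // replace %t1 with %arg0.
  return %t1 : tensor<4x8xf32>
}
```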

Note:
The unified dialect should be low level yet powerful enough that it can be automatically analyzed and converted according to the guidance of the conversion dialect. With the unified dialect as the dividing line, its upstream dialects provide higher-level abstractions, and its downstream dialects provide machine-specific features. Conversion goes from a higher abstraction down to a lower abstraction, and then from that lower abstraction merges back up into a machine-specific intermediate abstraction.
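
To give a rough feel for what "guiding" could mean, here is a purely hypothetical sketch: the conversion dialect does not exist, and the `conversion.target` name below is invented for this example only. The idea is that the guidance is declarative, and the automatic analysis uses it to decide which machine-specific downstream dialects to merge the unified-dialect code into.

```mlir
// Hypothetical: neither the "conversion" dialect nor the
// #conversion.target attribute exists in MLIR today.
module attributes {
  // Declares the downstream target that the analysis should merge
  // the architecture-independent unified-dialect code into.
  conversion.target = #conversion.target<arch = "x86_64", features = ["avx2"]>
} {
  // ... architecture-independent code in the unified dialect ...
}
```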

The final dialect diagram looks like this (there may be errors due to my limited knowledge):


In fact, LLVM IR already plays the role of such a unified dialect, but it is not an MLIR dialect.

This proposal is rather difficult for me to follow in its details.

How is it similar, and how is it different?

This sounds like the TCP proposal; is it indeed similar?

It can't be: a single unified dialect can't just be about tensors. What about memref? What about raw pointers? Otherwise, how would we get Fortran and C++ mapped there as well? :wink:


I don't have a deep understanding of the transform dialect; the two are only similar in expression, and not even that. The transform dialect is used to guide transformations, while the conversion dialect is used to guide conversion.

I don't have a concrete idea of how to design the conversion dialects; I only know that their purpose is to guide the unified dialect toward a specific platform, architecture, and feature set.

There is nothing new in that sentence; the problem is with my description. It should read "to eliminate the transpose of a transpose of a matrix". That is an original capability of MLIR's design, and it is mentioned as a contrast to the second half of the sentence.

memref and raw pointers may be part of the unified dialect, as long as their semantics are designed to be architecture-independent, and that is exactly what memref should be. Whether a memref lives in main memory or in video memory is determined by the conversion to the gpu dialect or to the llvm dialect.
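
A small sketch of that point with today's dialects (the function names are just for illustration, and address-space details are omitted): the memref type itself says nothing about where the buffer lives; the placement only appears in which downstream dialect the allocation is lowered through.

```mlir
func.func @host_buffer() -> memref<1024xf32> {
  // Lowered through the llvm dialect, this becomes an ordinary
  // heap allocation in main memory.
  %m = memref.alloc() : memref<1024xf32>
  return %m : memref<1024xf32>
}

func.func @device_buffer() -> memref<1024xf32> {
  // Lowered through the gpu dialect, the same memref type is
  // allocated in device (video) memory instead.
  %m = gpu.alloc () : memref<1024xf32>
  return %m : memref<1024xf32>
}
```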

In addition, the unified dialect should be low level yet powerful enough that it can be automatically analyzed and converted according to the guidance of the conversion dialect. With the unified dialect as the dividing line, its upstream dialects provide higher-level abstractions, and its downstream dialects provide machine-specific features.

I suppose you're using the terms upstream and downstream to mean earlier/later or before/after, not in the open-source sense of "in the main repository" versus "in a personal/project repository".

This sounds just like the original proposal of MLIR: to lower the abstraction as you go.

We already have Torch/StableHLO/TCP, we already have Linalg/TOSA, we already have SCF/Arith/Math, we already have all of the hardware abstractions. All of those allow optimizations at their levels to be represented on the levels below.

Do you mean subset in the mathematical sense? If there’s a requirement for every higher-level dialect to be a subset of a lower level one, then composition becomes hard, as we fall into pure OOP inheritance and things get ugly really fast.

This is a nice property that is really hard to keep, due to the imprecise nature of lowering. Not to mention that the notion of measuring "efficiency" yields different results depending on who you ask or what you do.

Wait, is this unified dialect a central point in the design space? Where all higher dialects converge to and all lower dialects spawn from?

Very much so, but with an even wider scope.

It sounds like a single dialect to represent the entire design space of all possible dialects in MLIR: ML, HPC, procedural (C/Fortran), Graph, sparse/dense, and above all, hardware architecture aware.

If that's the proposal, then take a look at XLA's HLO, which tried to do that for a single domain (ML) across only a handful of hardware architectures, and it became an impossible task. MLIR is more or less the answer to that bloat, so going back there is definitely not a reasonable option.

Exactly! It'd have to be a dialect that supports tensors, memrefs, and vectors, and it would have to do its own architecture-aware bufferization, which already breaks the promise of being architecture-agnostic and, well, becomes a mini-MLIR inside MLIR.

We’d end up with sub-dialects inside the meta-dialect, which would be indistinguishable from just plain MLIR.