RAMBLE: How to position new ML dialects in tree

Goal

To define a new, lightweight class of dialects that exists solely to support interop between projects in the ecosystem.

Motivation

I’ve been having discussions with a few folks about introducing an explicit class of “border dialects” (I have sometimes referred to these as “interface dialects” but this is easily confused with “dialect interfaces”) and wanted to get some pre-feedback on the line of thought before spending a ton of time on formalizing it.

As the MLIR ecosystem is growing, we are facing various interop frictions between projects, and so far this has created an asymmetry between in-tree and out-of-tree projects with respect to their ability to foster their own ecosystems without fragmenting the overall space. For things that are aligned with eventually going upstream, there are clear answers here (i.e. upstream your project), but there are legitimate reasons for projects to exist apart from the llvm-project proper yet still seek to interop with the ecosystem.

In my part of the world, this includes projects like IREE, TensorFlow, MHLO, JAX, ONNX, TOSA, etc. We’ve collectively made different decisions or non-decisions about each of these – for the Google-aligned projects, we lean heavily on the “big Google integrate” to produce regular stable commits that are consistent across projects. We do this at a pretty high cost, and even for us, it is ultimately an unmaintainable mechanism for assuring a fairly basic level of interoperability between components. This process is also quite opaque and inaccessible to anyone outside Google. I would like to start taking steps toward unwinding our dependence on this while increasing the level of decoupling that we can support within the MLIR ecosystem.

The lack of solutions in this area has arguably also prompted some other degenerate behavior: since upstream is the only place that projects can effectively intersect, there is some motivation to push bits and pieces of tech upstream even when the utility only emerges within the full assembly of a downstream project (I don’t ascribe ill intent here: I think there is often an intent to build out a more complete upstream component, but reality and other priorities can intervene).

I don’t think that there is any single mechanism or convention that will set all of this onto a better path, but I would like to advocate for one thing that I think we could introduce which would at least allow for downstream projects to design for decoupling and interop: an open registry of border dialects.

The idea is simple: whereas most of the dialects in the MLIR system are accepted as a matter of design (i.e. do they serve a purpose for the upstream community, are they orthogonal or have well-understood overlap with adjacent dialects, etc.), we can introduce a new class of dialects that is specifically about interoperability between components. The justification for accepting such a dialect would be based on whether there exists a community to support it, whether it meets certain standards, and whether it exists to aid interop between projects.

Definition

I would argue that introducing such a class of dialects would give us a rallying point to build out further infra important for interfacing components:

  • Tooling and conventions for dialect versioning
  • Efficient binary serialization (i.e. an equivalent to bitcode)
  • Tools for upgrading/downgrading/etc
  • Namespacing conventions
  • Build tooling for managing/registering dialects

Since these dialects are about interfacing components and not transformation, we could likely enforce some further restrictions on them in order to minimize maintenance costs (see the sketch after this list):

  • Tablegen ODS only (limited use of advanced C++ features)
  • Upgrade/downgrade via DRR (or PDL in the future?)
  • Requires version stability for ops and types
  • Self contained
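
To make these restrictions concrete, here is a minimal sketch (all names hypothetical; one possible shape, not a worked proposal) of what a self-contained, ODS-only border dialect could look like:

```tablegen
// Hypothetical border dialect defined purely in ODS: no custom C++
// builders, verifiers, types, or downstream interfaces.
include "mlir/IR/OpBase.td"

def Border_Dialect : Dialect {
  let name = "border_example";
  let summary = "Hypothetical stable interchange dialect";
  let cppNamespace = "::border_example";
}

class Border_Op<string mnemonic, list<Trait> traits = []>
    : Op<Border_Dialect, mnemonic, traits>;

def Border_AddOp : Border_Op<"add"> {
  let summary = "Elementwise add at the interchange boundary";
  let arguments = (ins AnyTensor:$lhs, AnyTensor:$rhs);
  let results = (outs AnyTensor:$result);
  let assemblyFormat = "operands attr-dict `:` functional-type(operands, results)";
}
```

Version stability would then be a matter of convention layered on top: once published, an op’s mnemonic, operands, and results don’t change; revisions add new ops or ship upgrade patterns instead.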

Examples

To give some concrete examples, with some refactoring, all of the following could be characterized as Border Dialects:

  • TOSA: Stable, versioned representation of mid-level ML operations, intended to be used as an input specification for ML compilers.
  • IREE’s flow dialect: Structural ingress dialect representing the input to the IREE compiler, including types and ops that expose features specific to IREE’s conception of a runtime.
  • IREE’s hal dialect: Plugin dialect for inputs to separately compiled target-specific compiler backends.
  • MHLO: Mid-level MLIR representation of the historic XLA op set.
  • ONNX: Standards-based ML frontend dialect for representing whole ML models.

It seems likely that for non-trivial systems (e.g. IREE, XLA, etc.), a project may choose to expose a set of public Border Dialects directly in the upstream llvm-project while maintaining other private, transformation-oriented implementation dialects internally (possibly even duplicating the border dialects between import and implementation layers in order to facilitate decoupling).

Feedback?

I think that if we started small on this with some basic standards and a namespace, we could unlock some really nice benefits and create room for more stability/ecosystem-focused work in the future. I’m curious what others think about this approach specifically, and also whether there are other ecosystem-stability related things we should be considering.

Thanks.
Stella


Thanks for the RFC Stella!

I haven’t had time to digest the complete proposal, but I do have some preliminary questions (apologies if they are already discussed in the contents of the proposal):

  • Where do these things live?

Honestly this is my main question, because it interleaves with everything else. Where do you envision these dialects living?

  • If the idea here is for the dialects to mostly just be “connectors”, how can we structure the library layout such that we can more strongly enforce this?

Ideally we would prevent (in CMake) the major directories (analysis/transforms/the “normal” dialects) from ever taking a dependency on these things; otherwise, IME they do not remain “on the border” for long.

  • Who is on the hook for maintaining these? Do I need to ensure these are green every time I commit something?

A lot of the dialects you mention do not follow design conventions that we would enforce upstream (for better or worse). This seems like a high additional burden for upstream developers to need to context switch whenever dealing with border dialects that don’t follow convention. Or is the idea that these border dialects need to be redesigned to follow convention? It isn’t clear if what you propose is that we accept these dialects “as-is”, because I would be somewhat -1 if so. IMO these dialects should still go through design review like any other dialect, but if we were to accept this proposal the parts that become lax are the “political” aspects.

  • What tools are these linked into? Do these “border” dialects get linked into every tool? What is the opt-in mechanism if you intend to do something like the “experimental” backends in llvm?

Having been on the hook for maintaining a lot of (all of?) the dialects in your original post, I am particularly wary of the maintenance burden this may potentially place on upstream developers. I am also quite wary of how these will integrate/interact with the current mlir/ project if we just plopped them in mlir/Dialects/. In particular, without a better understanding of how these would be structured, the proposal as-is sounds like there would be a lot of “upstreaming” that only really benefits particular downstream projects and not really upstream. How do you envision preventing an “everyone can upstream anything” situation? I wouldn’t want mlir/ to become a monorepo.

– River

Instinctively, I think that these go “elsewhere” – not in mlir/Dialects, but somewhere that can depend on mlir/Dialects (and the other parts of the public API) and cannot be depended on by them. If we could agree on that layering, we can probably bikeshed the options in a more detailed proposal.

I think that some of these will be “official” (in terms of supported – likely because they interface with parts of supported upstream projects) and others will be in-tree for the ecosystem benefit. There are some parallels here with how targets are categorized, and I think it would be fine to start conservative on this point (erring on the side of their community members bearing the responsibility) and graduate dialects if there is a greater need and track record. Again, fodder for a detailed proposal, but I would bias towards a low bar of entry and a low expectation of support until further time or evidence prompted us to increase that.

I think we would need a cmake config option for the optional dialects to link in. We could default that to those that we officially support to start with, effectively treating the others in a similar category as experimental targets. There will be some build engineering to make whatever scheme we decide on work.

In my mind, there is a redesign/simplification step – we want a genuine lack of creativity in these dialects, imo, and we want them to pass the bar for upstream code health. Speaking for IREE, because I have experience there, it eventually needs to grow an “input to the compiler” dialect that transforms into the lower level implementation dialects: it is rare that dialects not built to be a stable public interface can just be trivially declared to be such, so a redesign step is somewhat built-in from that perspective. We’ve got similar stories on the npcomp side: ops and types at the border that don’t belong in MLIR core but need to be somewhere common in order to interface to other projects in the ecosystem.

Having the opportunity to build that to upstream standards and extend the upstream tools to better support stable versioning and serialization semantics for dialects that opt in to it, combined with the ecosystem reach that would come from landing such a dialect there, would provide a strong motivator to do that work.

So, yes - design and code quality of the dialect are in scope. Whether the dialect can exist or how it fits into the overall picture is left to the dialect owners.

I’ll also point out that solving this for a couple of the de facto border dialects that we’ve accreted is what gives us the stable interface points which provide the freedom to allow more things to drift from head reliably. While this pressure is a quirk of Google’s development process, decoupling these components is a key thing I’m thinking through for reducing your associated workload specifically… Without stable representations accessible between components/projects, it seems like momentum will push us towards bigger and more version-lockstep monorepos.

So, to be clear: is the proposal for dialects that aren’t proper, complete dialects (e.g., missing verification, constant folding, printing, parsing)? Or are they full dialects that just live in a separate directory in the same repo but aren’t considered core dialects? Somewhat akin to experimental backends in LLVM, which after being stable for some time may be accepted into the LLVM repo but need not be tested or built during regular development?

And this probably also means nothing in-tree can depend on border dialects – not even another border dialect; only their out-of-tree consumers may. Otherwise it does seem like they are just dialects in a different folder.

Isn’t this saying “one can only upstream what makes sense on its own”? That seems like a good push for generality of approach, though.

Part of this proposal sort of reminds me of checking in header files from dependent projects into another, or header files generated from protobuf. So pure header-only helpers. The duplication here also feels weird, and I’m not sure how that fits… Have you tried such an organization? I’m perhaps missing the lightweight part and the maintenance aspect here.

It is a bit unclear to me what it is that this intends to solve, exactly. If these are “on the side” and fairly immutable and don’t include passes, transformations, conversions, etc., then how useful are they in-tree? If someone is interested in any such dialect, they would very likely want to pull in all the ecosystem that comes with it: I see the value proposition of any dialect in how it fits in the ecosystem, and that necessarily comes with more than the ODS itself.

For example it isn’t clear to me why it would be useful to have MHLO “as-is” in tree if this is only the dialect itself. Similarly, the IREE HAL dialect without the rest of the path from Linalg-to-HAL isn’t very interesting. But all of this seems very specific to IREE and isn’t very useful or testable upstream without IREE itself.

Conversely, storing all of this out-of-tree should be fine as well: if these are limited as you mentioned (only TableGen, etc.), then they can be versioned and work across a fairly large range of MLIR revisions from upstream. I.e. an “umbrella” repo can be built to synchronize everything.

Let me separate what I’m reaching for from the mechanism I proposed. I’ll elaborate on what I’m trying to do below, as I clearly jumped a bit too quickly to one point in the solution space!

I’m going to forget you said “protobuf”, but yes, there are some similarities here :slight_smile:

What I am reaching for is any ability for us to have projects in tree and out of tree capable of having stable input and output dialects that can be accessed by other producers/consumers without being in lock step with respect to source revision. In my mind, since we don’t have source stability, this implies two things:

  • An entity, somewhere which provides “namespace services” and a home for definitions that everyone can access. The natural place for this is the upstream repo.
  • Some tooling and conventions that let us construct dialects that are designed for representational resilience across versions.

I believe that we very much do not want most dialects to be subject to such stability rules (instability is a feature of the implementation dialects), leaving us with dialects that are transformation oriented and those that are interface oriented (I am referring to these as “border” here). If we were better about nailing down such border dialects in a way that worked across the ecosystem and didn’t just devolve into a constant struggle to up-patch or down-patch unstable sources, then we would get much more stable interfacing across projects. If we could get it to the point that I could just grab a .td file, copy it to my project and be able to interop with other producers/consumers of that dialect, that would be ideal. I’m assuming, however, that there is probably some more code that goes along with it that I need to link in (i.e. upgrade/downgrade transforms, conversions in/out to core dialects, etc). It would be best if all of that lived somewhere that I already knew was compatible with the MLIR sources I was building my project with. In my mind, it is something like an RPC stub in terms of consisting of mostly declarative code, but also including marshalling code, and representing an artifact that must be shared between producers/consumers.
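
As a strawman for the upgrade/downgrade piece, DRR could express version migrations declaratively. The sketch below is hypothetical (the op names and the `BorderOps.td` file are assumptions, not existing code): it rewrites the v1 form of an op into a v2 form that renamed it and reordered operands.

```tablegen
include "mlir/IR/PatternBase.td"
include "BorderOps.td"  // assumed ODS definitions for the hypothetical ops below

// v1 wrote `border.mul_add %a, %b, %acc`; v2 renames it to `border.fma`
// and moves the accumulator first. The migration is a pure rewrite, no C++.
def UpgradeMulAddToFma : Pat<
  (Border_MulAddOp $a, $b, $acc),
  (Border_FmaOp $acc, $a, $b)>;
```

An upgrade pass shipped alongside the dialect would just apply such patterns, so a consumer built at a newer revision could still ingest artifacts produced by an older one.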

Practically, LLVM IR has adopted such conventions for quite some time, and this has fostered the growth of a pretty good ecosystem of downstreams that are able to interop with a common lingua franca. MLIR, however, as a meta-IR system shouldn’t necessarily be providing the equivalent to LLVM IR’s interop story directly – it should be providing the tools and structure for such an ecosystem to be constructed. And I would argue that getting this right for out-of-tree projects (without a constant fire-fight of source incompatibility between peers) is pretty important. From a certain point of view, “Border Dialects” are to MLIR as Target Backends are to LLVM.

So, yes, IREE needs this, and I used it as an example because I am the most familiar with it. But I will also say that outside of Google’s monorepo (which does not experience this pressure because it expends considerable effort to keep everything in lock step), most of the out-of-tree projects I come across (including those we want to build but can’t figure out how to interop) end up gating on this: each can use the MLIR tools to build out its own island, but there is no way to interop with anyone else’s island. Practically, we cut ourselves on this regularly between IREE, NPCOMP and CIRCT, and basically just kick the can every time and try to stay close enough that things will work together (and we currently just punt entirely if custom types are involved – that is basically a non-starter). That is fine while we are all in development but isn’t how you ship products. As a downstream, I am perfectly capable of defining a stable interface to my project and asking everyone to use it; however, as things are factored now, this tends to create unsolvable dependency hurdles, and I’m really reaching for anything we can do upstream to actually make MLIR usable for building such combinations – without each of them just fragmenting and solving for it on their own. To stretch the RPC analogy, I want MLIR to provide me tools and organization for defining and enabling usage of my stubs so that I can implement the level of stability guarantees that are right for my project.

Personally, I don’t know how MLIR moves ahead much further without facilities for introducing some representational stability and dependency management (at the right places), and since that matters most for out-of-tree projects, I want something that solves that dimension straight away without the answer being to just upstream everything (not everything should be upstreamed, and even if it is, there are still process/version/tool boundaries in most of the things we are creating). I may be reaching for the wrong mechanisms, so I’m curious how others would approach these problems. What we are doing now does not scale to the complexity we are building.


If we’re going with the RPC analogy, it seems to me that this isn’t about “tools” but about publishing the actual “schema” for everyone’s RPC in the MLIR repo?

Something still unclear to me is the interaction between these “stubs” or the “schema” that these border dialects represent and the rest of the in-tree ecosystem.
For example: every “high-level” border dialect (like TOSA) would want to come with lowerings into “core dialects”, and possibly custom implementations for interfaces and hooks into core transformations (bufferization, etc.). On the other side, you have the same thing for a “low-level” border dialect: would the code that maps and schedules linalg to “flow” continue to live entirely out-of-tree?

I think you’re touching on something, but I feel like you are underestimating just how cleanly an “RPC stub” for a dialect can be extracted. In reality, dialects, unless explicitly architected for this (like TOSA), tend to incorporate significant transformation-oriented aspects into their definitions (such as implementing downstream transformation-guiding interfaces). The amount of work to architect a dialect like an “RPC stub” is significant, and I don’t think any of the examples on your list (except TOSA) really meets that bar, or is likely to without engineering cost greater than just making a new dialect specifically for interop.

I generally agree that if somebody has a dialect like this (I’ll add “no use of downstream op interfaces” to the list, to be clear), then having a space like you describe for them to put it makes sense. TOSA could be the first example of this and could be used to seed this directory (I think that conversions to/from upstream dialects should be allowed as well, such as TOSA->Linalg, TOSA->SCF). I think the question is how many projects will do the effort to reach this level such that it makes sense to have such a system formalized.

I don’t think this proposal really affects this. If anything, it sets the stage for further incentivization to keep things downstream. This approach specifically lets folks build walls and keep the “full assembly” isolated in their downstream repo.

Hi all,

This turned into a ramble, just a few observations and ideas:

I’m also curious to learn more about the goals here, partially because the examples are all different. TOSA is in-tree in MLIR, for example; ONNX is more of what I think of as an interface/border dialect between systems; and flow/hal seem like implementation details of IREE.

In the CIRCT project, we’re building a “firrtl” dialect to model FIRRTL IR from the FIRRTL project. This has been interesting (to me at least) because it tested the idea that mlir-translate should transform foreign input formats into a dialect that matches the input as closely as possible (to improve testability, etc).

In practice, this fell apart rather quickly, and the FIRRTL specification and the firrtl dialect now have a number of differences. The reason for this is two-fold:

  1. MLIR is different/better than the original FIRRTL implementation in various ways, e.g. by having SSA, ops that support multiple results, etc; and we made different decisions in the type system to define away certain complexity.

  2. It isn’t “worth it” to have something that directly matches the .fir file and then something else that the compiler works on. It isn’t worth it in the “compile time” dimension (it is much faster to have the parser just produce the native-to-MLIR form instead of parsing into the wrong thing and then lowering), and it isn’t worth it in the “complexity and boilerplate” dimension either.

This has led me to be a bit skeptical of the academic idea of keeping translation as isomorphic to the input as possible. Along with that comes the realization that a well-done dialect in MLIR (just like any well-done IR!) is designed with the transformations and analyses it needs to serve in mind.

This is a long way of saying that I am a bit skeptical of dialects that mechanically follow external formats. I think that taste needs to be applied.


Getting off my tangent, I think that one of the broader points you mention is really important: standardization across the ecosystem. Let’s use ONNX as an example: this is a documented standard that many in the MLIR ecosystem are interested in. AFAIK, there isn’t a good place to host work on this, and to encourage collaboration among these folks. The options I see include things like:

  1. Start a new project on github. Always ok, but loosely connected to MLIR mainline. Difficult to get momentum. Doesn’t necessarily benefit from the LLVM collaboration/license model.

  2. Start an LLVM incubator project. This is what CIRCT is doing, and it allows collaboration and things to get off the ground. It doesn’t distract the main MLIR project too much; CIRCT people are responsible for tracking MLIR whenever we want (every week or two in practice).

  3. Put things in MLIR as we do today. This puts a maintenance burden on people doing global changes to MLIR, and MLIR is/should be reluctant to add dependencies on external tools (e.g. EDA tools in the HW space, ML tools in the case of ONNX, etc).

  4. Add a notion of “experimental dialects” to MLIR. LLVM has “experimental code generators” which are disabled by default, but enabled on some buildbots etc. We could have something like this for MLIR.

I get the sense from comments upthread that people are reluctant to go with #3, and I think we should take a hard look at some of the stuff currently in tree and consider splitting out or deleting some of it (how many “Bufferize” passes do we have in tree??). Also, I agree that there is too much creativity and needless divergence from standards in some of them (e.g. how the cmake goop is set up, directory structures, etc).

I’m not sure how this would happen, but I think it would be very interesting to define a new LLVM subproject or incubator project that pulls together an end-to-end flow for code generation of ML graphs. Such a thing would be a natural place to put all the ML-specific integration work, without there being pressure to put it into MLIR. By analogy, clang pulled all the C frontend stuff out of the LLVM repo without preventing Swift, Rust, and other things from existing out of tree. Maybe we need something like this for tensor codegen.

-Chris

I’ve gone ahead and renamed the thread to note that it is a ramble.

I’m not sure how we would do this either, but I’ve been feeling for some time that it would have made more sense if we had started from that position. We seem to suffer from a lot of “where do I mkdir” level problems that I feel are the result of co-mingling the distinct northstar of an e2e ML compilation story with the general infra that enables it. There is a lot of integration work that should exist somewhere but is not clear where – having it scattered into a plethora of satellite projects without any stable interfaces between them is causing (me) a lot of agita when it comes to thinking about how to pull things together into a usable whole. I don’t want to be polluting the “infra” mlir project, but there is no other home for a lot of this stuff. It would be interesting if we could write down a charter for such a project in a way that actually scoped it to be something concrete (vs a collection of loosely affiliated parts).

We need a “CIRCT” for ML :slight_smile:

I wish I’d managed to include adjacent “M” and “L” letters when I named npcomp – it’d be a lot easier to retronym :slight_smile:

I’ve been hoping that npcomp would grow into this, but the ML space seems so much more fractured, with some large entrenched projects, that it’s harder to get traction.

I’m strongly aligned with the motivation of the original RFC here, but I think the decisions here are heavily sensitive to the dialect in question – I’ll thus only comment on the ones I’ve closely worked with. I really think MHLO and LMHLO should be moved upstream for all the reasons @stellaraccident mentions. I’m also willing to maintain or help maintain these, as well as align them with any upstream design conventions. These dialects are also mostly ops and conversions into/out of those dialects, not involving any elaborate analysis or transformation infrastructure, so it’s easy to bring them in.
