[RFC] Restructuring of the MLIR repo

This is an early RFC. I haven’t fleshed out all the details, but I wanted to seek feedback before spending too much time on this (and get some help figuring out the exact layout), so I’ll keep it short for now.

MLIR is both a compiler infrastructure and “batteries”, that is, a collection of reusable blocks for compilers (mostly around CodeGen for Tensors, Dense and Sparse Linear Algebra, and Affine/Polyhedral abstractions). We keep expanding, and are getting to the point where we’d like more “end-to-end” flows packaged in-tree.

I believe we have reached the limit of our project structure, which is rather “flat” at the moment. It is hard to distinguish the “core” infrastructure from the more peripheral reusable components and compiler blocks. We also lack a clear way to land more complete flows (for example, it isn’t easy to land a TOSA- or TACO-like compiler right now).

So I’m proposing that we start reorganizing the repo to account for this. As a first step, here is a draft (it does not build; it’s a quick start) that looks like this:

mlir/
    core/ # Segregate the infrastructure from the myriad of dialects.
        IR/
        Interfaces/
        Parser/
        Pass/
        PDL/
        Rewrite/
        Support/
        TableGen/
        include/ # Public headers for `core/` libraries, installed as `#include "mlir/..."`
    analysis/ # Generic analysis using core interfaces
        AliasAnalysis
        CallGraph
        DataFlowAnalysis
        DataLayout
        include/ # Public headers for analysis, installed as `#include "mlir/Analysis/..."`
    transformations/ # Generic transformations using core interfaces.
        Canonicalizer
        Inliner
        SCCP
        SymbolDCE
        include/ # Public headers for transformations libraries, installed as `#include "mlir/transformations/..."`
    dialects/ # Same content as current dialects.
        Affine/
            IR/
            Transforms/
            Utils/
            include/ # Public headers for Affine dialect, will be installed as `#include "mlir/dialects/Affine/..."`
        AMX/
        Arithmetic/
        ...
    examples/
    runtime/
        # Runtime libraries are "special": they aren't compiled for the host
        # like the rest of the compiler, but for the target.
    targets/
        Cpp
        LLVMIR
        SPIRV
    compilers/ # does not exist in the repo right now, but possible addition!
        PyTaco/
        TOSA/

This both adds a level of nesting at the top-level, and allows us to add new directories more easily (experimental could be a possibility at some point as well).
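
To make the `include/` comments in the draft tree concrete, here is a minimal Python sketch of how a header stored in a per-component `include/` folder might map to the path a user writes in `#include`. The prefix table is copied from the comments in the draft; the function name `installed_include` and the example header locations are hypothetical, chosen only for illustration, and the real install rules would presumably live in CMake rather than in a script like this.

```python
from pathlib import PurePosixPath

# Install prefixes taken from the comments in the draft tree.
# Keys are source directories relative to mlir/; values are the prefix
# a user would see in `#include "..."`.
INSTALL_PREFIXES = {
    "core/include": "mlir",
    "analysis/include": "mlir/Analysis",
    "transformations/include": "mlir/transformations",
    "dialects/Affine/include": "mlir/dialects/Affine",
}


def installed_include(source_path: str) -> str:
    """Return the #include path for a header, given its path under mlir/."""
    path = PurePosixPath(source_path)
    for src_dir, prefix in INSTALL_PREFIXES.items():
        try:
            relative = path.relative_to(src_dir)
        except ValueError:
            continue  # header is not under this include directory
        return str(PurePosixPath(prefix) / relative)
    raise KeyError(f"{source_path} is not under a known include/ directory")


# Hypothetical header locations, for illustration only.
print(installed_include("core/include/IR/Operation.h"))
print(installed_include("dialects/Affine/include/IR/AffineOps.h"))
```

The point of the sketch is only that the on-disk nesting and the installed include path are decoupled: headers can move under `core/include` while users keep including them from `mlir/...`.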

Strong +1 in general. I’m not sure exactly how we should structure the non-core parts, but splitting the ‘generic’ + ‘core’ pieces of MLIR from everything else (which includes many things that are workflow- or “compiler”-specific) has been a desire of mine for a long time. We regularly have problems with people accidentally mixing non-generic things with generic ones, which raises the question of how to layer the non-generic things. By splitting these pieces out, we should ideally be able to slowly converge to a model that is better suited to MLIR’s current size and growth.

– River

Do you mean lib/core, lib/dialects, etc.? Or are you proposing removing the ‘lib’ hierarchy?

I’d tend to agree that some restructuring isn’t necessarily a bad idea, with the goal of perhaps more clearly separating the infrastructure from dialects and transformations layered on top, but I don’t think the current structure prevents us from landing end-to-end flows. We could create a ‘compilers’ directory in short order, I think, or are there particular barriers to doing that that I’m not seeing?

I think the bigger barrier to merging is for things like CIRCT or torch-mlir today, which do more than stitch together a flow and also come with their own dialects. I’m more concerned about how the dialects continue to grow to support new use cases without an easy way to find out what is relevant for particular problems. As River puts it, “the layering of non-generic things” is complicated.

What happened to the lib and include parts?!

I wouldn’t nest everything under a top-level lib. This is actually part of the issue: mlir/lib makes it so that everything is a sub-library.

True, the thing is that it does not fit well in lib, it could fit in tools, but tools is more for testing utilities. That said we could create a top-level container for this (whether we name it “compilers” or something else).

Yeah, I agree. I thought a bit about taxonomy and hierarchy there, but I punted for now (not finding good answers). Talking to Jacques about this, he suggested seeing “dialects” as “packages”: “package managers” usually install “packages” side-by-side without hierarchy (though other metadata can organize packages independently). Also, any hierarchy is somewhat limited by the flat namespace that we have (we can’t register two dialects with the same name), which aligns with the flat directory structure here.
I also don’t expect this first iteration to be “perfect”: it’s a coarse-grained refactoring that splits “core” from the rest and tries to distinguish some high-level pieces from each other.
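
The flat-namespace constraint mentioned above can be sketched with a toy model. This is not MLIR’s actual registration API: the `ToyDialectRegistry` class, its `register` method, and the directory strings (including the `experimental` path) are hypothetical names made up for illustration. It only shows that when the dialect name alone is the key, nesting directories does not create a second namespace.

```python
class ToyDialectRegistry:
    """Toy model of a flat dialect namespace (not MLIR's real API)."""

    def __init__(self) -> None:
        self._dialects: dict[str, str] = {}  # dialect name -> source directory

    def register(self, name: str, directory: str) -> None:
        # The key is the dialect name only: the directory plays no role
        # in disambiguation, so two dialects cannot share a name.
        if name in self._dialects:
            raise ValueError(
                f"dialect '{name}' already registered from {self._dialects[name]}"
            )
        self._dialects[name] = directory

    def names(self) -> list[str]:
        return sorted(self._dialects)


registry = ToyDialectRegistry()
registry.register("affine", "dialects/Affine")
registry.register("amx", "dialects/AMX")

try:
    # A hypothetical nested location does not create a new namespace.
    registry.register("affine", "dialects/experimental/Affine")
except ValueError as err:
    print(err)
```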

lib goes away from the top level. You can still have compilers/TOSA/lib for library components specific or internal to TOSA, for example, but there is no single “top-level library” that models MLIR: core is a collection of libraries, with their public headers in an include folder stored under core/include (but installed and available just as before). I edited the original post to add some of the include/ folders that were missing.
We could also have core/lib/ and core/include/, but since these would be the only folders there, I wasn’t sure the extra lib nesting would be useful. We could also shard more and have core/tests/, core/tools/, and core/unittests/ independently from the other folders, in which case core/lib makes sense to me.
This opens up the question of having dialects/Affine/tests/ there as well, instead of a top-level tests/dialects/Affine/..., but I felt this can be decided later: we probably don’t need to settle on the perfect layout in the first iteration.

Some memories of LLVM from a few years ago: clang used to have to be checked out in llvm/tools/clang, so we had tools/clang/lib/, tools/clang/include/, tools/clang/test/, alongside lib/, tests/, … for LLVM.
When we went to the monorepo, we had to think about whether we wanted to have “things using LLVM” nested under LLVM or side-by-side with it. I see a similarity here: the nesting I’m proposing for core allows us to keep it side-by-side with the other things we want to grow here.

Would it then make more sense for the end-to-end tools to live outside of /mlir entirely (i.e., like clang/flang)? This could have both advantages and disadvantages, I suppose. Would this obviate the need for refactoring within MLIR as much?

Yes, this is a possibility, but we have much less flexibility: adding something at the top level of the monorepo is likely to involve discussions with the rest of LLVM.

I don’t think so: we already have so many dialects and the vast majority of these aren’t really “core” to MLIR.

I’d prefer we be a bit wary on this end. If something is large and separate enough in scope, it isn’t clear that we should just put everything into mlir/ “just because we can” and just because “it uses MLIR”. I personally am wary of mlir/ becoming a large monorepo within a large monorepo.

– River

I agree that something like “flang”, for example, is clearly out of scope, but I think we have many things that are more akin to small end-to-end flows, where it is mostly a “small” amount of “glue” and integration of the dialects and other components that already exist in MLIR. In particular, all the small DSLs that operate on tensor-like or array-like abstractions are likely to get significant reuse out of runtime libraries and other integration components.

I have the same questions and concerns as @stephenneuendorffer. Since it’s not great to keep moving things around multiple times for little functional value, it may be good to be sure of the benefits. What exact issue with the current setup, which you are trying to solve here, prevents setting up end-to-end flows in the tree? You already have tools, and if tools was meant just for testing, its meaning could either be broadened or a new directory could be introduced.

Overall, I’m not sure what problem this new proposed setup is trying to solve that isn’t solved by the existing setup.

River elaborated on some aspects in the first reply: this isn’t “just” about end-to-end flows.

Not really. I’m asking what problems this new setup solves that the existing one doesn’t. First, what’s the issue with landing end-to-end flows that you mention in the OP? The existing setup is already well-equipped for that, and this setup doesn’t make it any better anyway. You are proposing moving some of the “more core” things from lib/ into core, mixing up the casing by going Analysis -> analysis and Transforms -> transformations (which is also a bit inconsistent with how it is under dialects/). Moving directories around this way doesn’t prevent people from mixing non-generic things into “core” AFAICS; it’s reviews and refactoring that help catch these. Did you instead want to add lib/core/ and include/mlir/core/.. and move the more core things inside them, rather than changing and introducing top-level partitions? All of this still adds an unnecessary and completely avoidable cut between what should be in core and what shouldn’t, with zero impact on modularity, dependencies, and layering. (You’ll still have MLIRIR, MLIRPass, MLIRRewrite, etc.; if a Core is added to those CMake target names, that’d be unnecessary and avoidable as well.)

I’m not opposed to refactoring, but a strong -1 on doing this “slowly” as a series of cosmetic moves, many of which add little or no functional value.

Well, I feel it’s the same thing as when you’re facing a very long source file with a single class and you refactor it into 3 classes. What do 3 classes solve that 1 couldn’t? Nothing really, but we do it mostly because it helps our mental model: we can look at things and describe them with fewer sentences/words, and it helps with all the maintenance aspects as well.

Nothing ever really prevents anything from creeping in, but some things become more apparent in review when you have structure: an include from one library to another isn’t the same as an include within the same library.
Also, technically you could enforce this with static checks that validate that nothing from core includes anything from dialects (which is why the PDL dialects stay with core).
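
One possible shape for such a static check, as a minimal Python sketch. It assumes the draft layout from the original post (sources under `core/`, dialect headers included as `mlir/dialects/...`); the function name `find_layering_violations`, the set of file suffixes, and the regular expression are illustrative assumptions, not an existing tool. It flags every dialect include it finds, so anything that must stay reachable from core would need to live under `core/` (as PDL does in the draft) or be put on an explicit allowlist.

```python
import re
from pathlib import Path

# Matches `#include "mlir/dialects/..."`, the install prefix proposed
# for dialect headers in the draft layout.
DIALECT_INCLUDE = re.compile(r'^\s*#\s*include\s+"(mlir/dialects/[^"]+)"')
SOURCE_SUFFIXES = {".h", ".cpp"}


def find_layering_violations(core_dir: Path) -> list[tuple[Path, int, str]]:
    """Return (file, line number, header) for each dialect include in core."""
    violations = []
    for path in sorted(core_dir.rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = DIALECT_INCLUDE.match(line)
            if match:
                violations.append((path, lineno, match.group(1)))
    return violations


def main(core_dir: str = "mlir/core") -> int:
    """Print violations and return a process exit code (non-zero on failure)."""
    violations = find_layering_violations(Path(core_dir))
    for path, lineno, header in violations:
        print(f"{path}:{lineno}: core must not include {header}")
    return 1 if violations else 0
```

Wired into a pre-merge job, a non-zero return from `main()` would fail the build, which turns the layering rule from a review convention into something mechanical.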

No, that’s not what I wanted; see the GitHub repo example I linked.
I’m trying to isolate the core MLIR infrastructure entirely on its own, with a very clear boundary: it is core/lib and not lib/core.

This is a long-standing request, from River in particular: to have the core of MLIR separated from “Linalg”, for example (or any other similar dialect like “GPU”). I’m of the same mind here: this is a valuable thing to do in terms of project organisation.

I love the direction of this proposal - the current MLIR repo is getting too monolithic, and going to a more modular design makes a lot of sense to me.

I agree with this sentiment. MLIR isn’t an obscure little project anymore, I think “those LLVM people” will take it seriously.

Taking a few more top-level directories in the mono-repo would have a number of other benefits, including that #include "mlir/thing" and #include "tensors/thing" would be cleanly separated.

I would not split by “dialects vs analyses vs transforms” though, I’d split by domains: ML compilation is a separate thing from HW compilers, generic CPU IRs, etc.

Unrelated to this post, LLVM itself should go through a similar transformation - splitting Support/ADT into its own top-level thing, splitting codegen+targets out into its own thing, and keeping LLVM IR as one thing.

-Chris

My personal opinion is that Linalg, and the things that depend on it, should be split out of the MLIR repo. It should either go into its own top-level thing, or perhaps into an incubator. It seems like its design is still evolving very rapidly, and I’ve heard a number of people be surprised by this - they thought integration into the MLIR repo implied a bit more stability…

-Chris

This conversation seems related to an RFC I just posted: [RFC] Moving TOSA construction lib into MLIR core. Frontends → TOSA → Linalg is a functional path, but also one in development. Certain artifacts in TOSA also drove Linalg-side development, e.g. TOSA to Linalg lowering (tosa.scatter) - #3 by hanchung.

In the quoted RFC I mentioned the problem with testing op constructors for a dialect like TOSA that sits at the interface to multiple frontends. A proposed way to address it is a dummy frontend dialect, but that brings its own share of concerns. It would be nice to somehow integrate legalization test harnesses sitting within the real frameworks too.

Splitting things out into separate projects (like Affine, Async, Linalg, Shape, Sparse, Tensor, …) would indeed also be a possible path. I think there is a lot of value in the “batteries” that come with MLIR, though.
The “batteries” may be a bit too oriented towards “ML Compiler”, but I think that only reflects the community.

I’m looking forward to getting more end-to-end flows for ML compilation in the repo rather than fewer :)
(And Linalg is a pillar of this: it is being adopted as the cornerstone of CodeGen in both IREE and XLA CPU/GPU right now, and it is also powering the Sparse compiler, which will very soon give us end-to-end PyTACO examples in tree!)

I’ve felt for some time that all of this is far too monolithic as part of MLIR. I’m going to avoid sweeping categorization of what goes where right now, but from my vantage point, I do see clearly the following layers:

  • Core Infra: Love this part of the RFC. I think this should be cordoned off and very visibly its own thing
  • Mid and low-level tensor programming dialects: I think these make sense “in MLIR” directly but segmented into their own directories similar to proposed. In my mind, things like vector, affine, tensor, memref and a concept or two which need to be graduated out of linalg and exit cleanly to those lower levels belong here.
  • ML Frontends: I would make this its own top-level project in the monorepo. I think “ML opset” centric things go here, as well as the infra to get in and transform out to the tensor programming dialects. Since it was brought up, I think that Linalg is doing too many things and that many of them actually belong at this layer (OpDsl, the op registry/named ops, etc).

I think I agree with you from a roundabout perspective, in that I don’t think Linalg should exist in its present, monolithic form. However, I would like to avoid throwing the baby out with the bathwater, and take the time to sieve out the low-level concepts that have proven worthwhile into the core project (this transition has been happening for some time; there are a couple of things that really should be graduated out of the monolith) while clearly pushing the frontend-oriented bits up-stack (or out of LLVM entirely). I am aware of a consensus emerging in this direction among some folks, but I know a number of them are in other time zones with respect to this post, and we probably need to leave some room here to have an inclusive conversation. Given the time zones, I’d be shocked if we didn’t need f2f time to parse it as well.

Thanks for bringing this up. We’ve got quite a few things to do to recast some of these layers that are doing too much, and are therefore experiencing a degree of churn that should face a higher bar the further core-ward we go. It will take some time to really piece through it all, though.

I like a lot of things in this proposal!

For instance, this new organization feels right, because I always felt that our lib and include were too far apart!

mlir/A/*
mlir/A/include/*

Also,

mlir/analysis/
mlir/transformations/

provide a much easier way to define (or find) infrastructure that can be used independently of dialects.
Lastly, this

mlir/compiler/PyTACO/

is simply awesome, since we were looking for the right place to put “end-to-end” solutions like this!

Missing from this overview is where to put our tests and integration tests, though.