Numpy/scipy op set

Do you already have a more fleshed out description of what would be inside a “numpy” dialect and how that would differ from the work to be done on TCP/Linalg?

From a policy perspective, I think there are different questions here:

  1. How and when do we introduce dependencies on external projects?
  2. What sort of criteria do we have for new dialects?

The first is complicated - there is precedent in LLVM for accepting conditional dependencies, and MLIR already has some aspects of that (for CUDA support etc). In the case of Julia dialect, you have a further concern which is that Julia already depends on LLVM - LLVM depending on Julia would be a cyclic dependency. In any case, I’m pretty sure that Stella is not suggesting that MLIR should depend on Numpy, so this is a question we can continue to struggle with, but isn’t core to this thread, so we can set it aside for now.

The second is complicated in other ways. We have experimental work in-tree (including the shape dialect) and there are other things like the Affine dialect that are evolving. IMO we should have some sort of policy like LLVM does for targets: it can/should be reasonably easy to get something in, but there needs to be a motivated use case, the community needs to (at least) not be strongly opposed to it, and there should be someone signed up to support it.

An aspect of this that is different than LLVM core is that MLIR has a lot of opportunity to grease the wheels on interchange between different systems, and we have a community built around collaboration (which is not always strong in every open source project). I don’t think that pushing everything out to alternate projects is really the right answer.

That said, the reason for conservatism here is that we don’t want every science project to be in-tree (to be clear, I’m not accusing Stella of that!) and don’t want things like course projects etc. to be in the tree. OTOH, I think it is pretty clear that an ONNX dialect is something that would be useful, and I also think that Numpy is a pretty big standard as well.


Not yet but have been ruminating on it for a while. I was literally at the point where I was deciding a) do I start a fork of LLVM and get something fleshed out there, b) pork-barrel more experimental stuff into my project (IREE), or c) create a new project. I would approach the work and community-engagement differently based on that choice.

I’d like to get the policy point worked out independent of the technical discussion, but just to orient you a bit as to what I suspect it would include:

  • some types for concepts that don’t fit in MLIR’s type system (i.e. the ndarray/dtype hierarchy is not a 1:1 mapping to a core type but we do want to interop with core types)
  • some kind of modeling of the dual roles that the ops operate in (immutable-value-based/mutable-buffer-based)
  • some kind of generic ops representing the structural concepts such as ufuncs
  • 1:1 mapping of numpy functions to ops (possibly based on the generic ops mentioned above)

All of that should be familiar from other frontend work but needs to be elaborated for each to make sure that the definitions are both faithful/useful representations of the source and can convert nicely to the next layers down.
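To make the bullets above a bit more concrete from the NumPy side, here is a small sketch using only plain NumPy. It is not a proposed dialect design; it just shows the source-level behavior such a dialect would have to model: dtype promotion that is NumPy's own rule, the same ufunc used in a value-returning and a buffer-writing role, and the structural metadata every ufunc carries.

```python
import numpy as np

a = np.arange(4, dtype=np.int32)
b = np.full(4, 0.5, dtype=np.float32)

# Value-style use: the ufunc returns a fresh array. The result dtype comes from
# NumPy's promotion rules (int32 with float32 promotes to float64).
c = np.add(a, b)
print(c.dtype)

# Buffer-style use: the same ufunc writes into an existing array and returns it.
out = np.empty(4, dtype=np.float64)
r = np.add(a, b, out=out)
print(r is out)

# Every ufunc exposes the same structure, which is what a generic "ufunc" op
# could capture once instead of once per function.
print(np.add.nin, np.add.nout)

# Derived operations such as reductions come along with the ufunc.
total = np.add.reduce(a)
print(total)
```

The value/buffer split is visible in a single call signature here (`out=`), which is one reason a 1:1 op mapping probably needs the generic layer underneath it.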

For me an important point is “how does it compose and interact” with whatever else is in tree, which may go against this decoupling to some extent.
But yes the information you provide in the bullet points is going in the direction of what I would expect to be figured out when proposing new components/dialects upstream.

This is also an aspect where it differs from the “experimental” situation for LLVM backends: the backends in LLVM are cleanly separated from each other and fit into the target framework, so it is an easier consideration to just ask “is there a public ISA and who cares about supporting it”.
An analogy I would make to compare new dialects in MLIR to something in LLVM could be proposing a collection of new intrinsics to LLVM alongside a set of specific passes to manipulate them.

By the way, we already wrote a beginning of a guideline a while ago (it is fine to update it, nothing is set in stone).

I think that this is often a good way to get started: experiment and iterate very quickly on a proof of concept in order to have a better idea of what the proposal is.

Agreed on the POC approach, but I’m only going to do the work in the context of the LLVM tree and community if there is interest and a reasonable chance that this aligns with what would be accepted. Since there are no “source” dialects in tree, it is either because a) we meant it to be that way, or b) we just haven’t gotten around to deciding on our stance (and related things like where to put it, etc). I suspect (b) and that we are falling victim to the first instance of a class of things being really hard to justify creation. To date, we’ve buried such discussions in some very long threads relating to different layers and I wanted to get some explicit consensus before spending time on something that later may not fit. The bar to get such new things in has not seemed low to me, and I think our judgment has been too conservative.

Concretely, I would like to see us delineate a place in the tree for these “Source” dialects and grant that reference ONNX and Numpy dialects likely satisfy some conditions for inclusion (totally making the following up but feels right to me):

  • They are relatively neutral standards (where ONNX has more of a formal standards process and Numpy is more of an industry-standard with a lot of work put in by various parties over the years to architect it as such).
  • From above, they align with the source system that they are modeling: i.e. they contain modeling for ops/types that mirror what exists upstream, potentially with internal layers to help cover the abstraction gap between source system and the TCP level.
  • They sit above the next level down (what we are currently calling TCP but also includes things like LinAlg) and we generally want them to layer in that way (i.e. versus having in tree conversions to some hard-coded LLVMIR generation or something).
  • They are largely descriptive and do not add dependencies that LLVM would wish to avoid (circular, diamonds or otherwise).

I’d like to see us get over the barrier of having no source dialects while also getting the layering right so that we don’t end up with abstraction leakage or one blessed source, etc. I think that for select sources, having them in tree will help raise the visibility and help us architect some of the higher layers in a bit more of a cohesive way.

@bhack due to me tying this together with ONNX

I don’t want to be too much off-topic but probably I will be with this post. I am really not in the compiler circle :slight_smile:
All these positions seem reasonable to me, but my impression, which I expressed on the other thread, was simply about the upstream/downstream relationship, which could be quite general.
Are we sure that it is positive for the project to have a “large” out-of-tree “fragmented” space and receive only some sparse downstream requests where stuff doesn’t fit at the level of abstraction that you have defined?
I think that from an evolutionary point of view there is a need for cross-pollination both inter-dialect (in-tree) and intra-dialect (out-of-tree).
It is important to care about abstraction (and so generalization?) but it is also quite important to share enough “common space” and be able to fit real-world needs.
If we look at dialects not only from the point of view of pure flexibility or the opportunity to build new things, but also as a potential conceptual and semantic fork around new concepts, I am not 100% sure that collecting only downstream requests (so the fully out-of-tree scenario) is the safer path for the project.

I think I agree with you but may be using slightly different words. I think there would be a lot of value in specific (relatively) standard source dialects being in-tree in some fashion. The main thing I would like to avoid is layering-violations where we take something like a Numpy or ONNX dialect and try to lower it directly without going through a concentration layer that is optimized for transformation like TCP/LinAlg is trying to be. I believe that by pulling a couple of high utility/quality reference-dialects like these in-tree, this will increase visibility and help keep us honest about this layering, whereas now, with every source dialect out of tree, it is extremely hard to navigate, discuss cohesively and see the simplifications.

The main con I see for Numpy and ONNX specifically is that they may be too similar to provide a truly different perspective, but I assume that if we seed with a couple, it will make it easier to highlight others that may be more truly different.


@stellaraccident I agree. Sorry I used a little bit “philosophical” language for having more abstraction :slight_smile:

without going through a concentration layer that is optimized for transformation like TCP/LinAlg is trying to be

This is what I meant. Are we sure that it is better for these concentration gaps to be filled autonomously by third-party teams, with only sparse independent requests coming downstream to where we are currently positioned (I suppose TCP/LinAlg)?
Also I will add that this concentration gap will probably be filled independently for each project, and it is hard to push at that level to share things across projects with independent teams in independent upstream repositories. It has some similarity to fork behavior, where inter-fork visibility is hard to achieve. But if that level is of no interest from a compiler stack point of view, we need not care about that.

The main con I see for Numpy and ONNX specifically is that they may be too similar to provide a truly different perspective, but I assume that if we seed with a couple, it will make it easier to highlight others that may be more truly different.

This could be true, but what else could we have on board to enrich the initial overview?

This approach makes a lot of sense to me, and it especially makes sense about “greasing the wheels”. While a lot of the design process has not been carried out in public view, there has been a lot of legerdemain in the construction of the various TensorFlow source dialects and lowerings – and the lack of any upstream development in these areas creates a real blindspot for the project, imo. Taking a direct stake in some stable/neutral work in this area would be a nice consolidating force.

I’ve got several side projects at the moment but am going to add a numpy prototype to my list and see what I can knock out.

In the meantime, I’m having trouble visualizing where dialects such as these would live in the source tree. mlir/Dialect/Numpy doesn’t seem like the right layer when considering the common infra dialects that would be peers. We should also consider that there tend to be multiple dialects per source, modeling different levels of the interaction, so we should be looking at a parent directory, not necessarily a leaf.

I could see a case for having such things exist in a new “Frontend Dialects” directory tree, either at the MLIR or llvm level. There may also be an incubation vs final location aspect.

How about starting a precedent of mlir/Dialect/ExperimentalNumpy or something like that? That would make it clear to people who “drive by” but don’t follow everything that it is an experiment.

Why experimental?
Either we have a plan to have this in core, in which case it does not need more labelling than anything else (because really: what isn’t “experimental” right now?), or it is too early to even know what we want to do with this dialect, in which case having it in-tree, under mlir/ at all, isn’t an obvious tradeoff to me.

You can argue for more experimental in-tree collaboration, but why limit to dialect? Let’s bring this up on llvm-dev@ and try to ask about a top-level experimental/ in the monorepo for anything that is experimental (llvm passes, etc.).
(I would rather see this happening in a fork though)

I’m ok with either approach so long as we have a consistent policy that is applied equally to all things.

The reason I suggest “experimental” is that there is a natural gradation of stability. Your point that “not much is fully stable” right now is true, but I think it misses the fact that the LLVM dialect (for example) is certain to exist and is relatively stable. It can change (just like anything in the universe) but much less so than something like a numpy dialect that is thrashing around and moving quickly. We’ve had similar things with the fxp dialect that were experimental and caused confusion.

It is all gray, but I would also argue that affine is relatively stable. After the initial design work happened, an ONNX dialect would also be relatively stable because it is tied to an external standard. I suspect there are others.

I guess I’m saying that “unstable in the name is a good warning that you really shouldn’t use it unless you are part of the design team or actively following it”. I’d add that “stable doesn’t have to mean 100% unchanging :-)”


FWIW I’m generally very wary of having the ability to add dialects, experimental or not, very open and easy (at least in the mlir/ directory). I would strongly prefer that mlir/ doesn’t end up being a “database” of dialects. I feel this way mainly due to the cost of maintainability/evolution.

The cost of evolving and updating APIs is already getting large, and will only continue to grow as time goes on. When adding new features or evolving existing ones, there is always a cost-benefit comparison between the benefit of the update vs. the cost of updating all of the users. I generally lean towards doing the right thing regardless of cost, but many people don’t and even I have a breaking point. We should really weigh the cost of having these dialects “blessed” in-tree vs. the cost of the community maintaining them. This is especially true for “frontend” dialects. If we look at TensorFlow as an example, it already has several sub-dialects and various passes/pipelines. Having a large number of these mini-compilers doesn’t scale.

Though with many of these things, feel free to just ignore me. This problem mainly affects me at this point, as I end up bearing the brunt of many of the core evolutions. We should just make sure to balance the ability for the infra to easily evolve vs. the explosion of all of the possible dialects that we could have.

I don’t think anyone here is inclined to ignore you River :slight_smile:

The maintenance burden of additional code in tree is a really great point, as is the non-uniform cost paid by core maintainers vs dialect contributors. Are there any criteria or thresholds that you can think of that could be used to help balance these concerns?

As someone who leads a team working on backends for multiple dialects I can see some value in having very common technologies be part of a set of officially supported dialects in the LLVM monorepo.

My primary concern is about versioning - say we want to ship a compiler that supports Numpy but want to explore lowering through a non TCP/LinAlg path. Having dialects packaged with specific versions of LLVM means that our compiler can support the NumPy dialect that is part of LLVM 11 - this pushes the burden of compatibility to the framework itself and frees backends to focus on a single set of interface dialects.

This would obviously represent a tradeoff that a framework would have to consciously make - they would be biting off some amount of legacy support.

I agree, we don’t want MLIR to become associated with being a complete compiler for a particular frontend. We want to be infra.

My main concern with not having something like a “numpy” dialect or some sort of relatively tightly coupled frontend is that we don’t have any serious correctness testing happening upstream at the moment. It’s like we’re developing LLVM but don’t have an equivalent of test-suite or clang that we can use to find and investigate correctness issues. Most of the features in LLVM development (exceptions being things like GC statepoints) are in some way testable by running test-suite or crafting an input to clang.

Of course, MLIR by its nature serves a much more diverse set of compilation workflows than LLVM, so we shouldn’t expect to be able to recreate LLVM’s exact situation. However, I believe it still needs some thought, especially as things like TCP come into the picture.

Something like a numpy frontend could stimulate a whole lot of our infra on substantial workloads against a known-good reference. In that sense, its maintenance burden can be counteracted by catching bugs and improving development velocity.

That doesn’t necessarily help the refactoring burden, which AFAICT is mostly a function of the number of lines of code in the repo :confused: I see two ways to conceptualize this problem:

  1. Leaning on dialect contributors to do refactorings. That’s mostly a community culture problem. We want to encourage a community that feels empowered to make changes to core infra and takes that on when they see something that could be improved, even if it turns out to be a large refactoring.

  2. At some level, the time spent for a core dev (e.g. River) to update some part of the codebase should be balanced by the continuing value that that piece of the codebase contributes to the ecosystem.

Thus, as the investment in MLIR grows, the number of dialects increases, etc., the core evolution cost increases as well, but as long as the total value of MLIR increases at the same or faster rate, then core evolution is still a useful task (that is, it is a good use of engineering resources). We could think about this as “5% of the engineering effort devoted to MLIR is devoted to core evolution”. I think it’s unrealistic to expect that the core evolution costs should remain constant for eternity or even decrease. So from this point of view there are two parts of this:

a. keeping the engineering investment in core evolution at a steady 5% (or whatever) of overall MLIR investment (that is, the rate of investment, perhaps measured in something like “number of software engineers”). The extent to which we are successful at 1. above can reduce this number.

b. keeping the engineering investment (e.g. number of active contributors) in MLIR proportional to the number of lines of code in the repo

By combining a. and b., we arrive at a situation where the engineering bandwidth we have available for core evolution remains proportional to the number of lines of code in the repo, thus keeping needed refactorings / core evolutions manageable
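Spelled out as arithmetic, with symbols introduced here purely for illustration (the 5% is the placeholder figure from above, and the proportionality constants are assumptions, not measurements):

```latex
% E_total : total engineering investment in MLIR (e.g. number of engineers)
% E_core  : the part of that investment spent on core evolution
% N       : lines of code in the repo
% \alpha  : fraction devoted to core evolution (0.05 in the example)
% k       : contributors per line of code, held constant by (b)
\begin{align}
E_{\text{core}}  &= \alpha \, E_{\text{total}} && \text{(a)} \\
E_{\text{total}} &= k \, N                     && \text{(b)} \\
\Rightarrow\quad E_{\text{core}} &= \alpha k \, N
\end{align}
```

So if refactoring cost grows roughly linearly in $N$, the bandwidth available for it grows at the same rate as long as both $\alpha$ and $k$ hold steady.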

Of course, all this ignores out-of-tree code…


FYI - @_sean_silva and I are going to be creating a fork and seeing if we can spend a couple of weeks to get something scoped/presentable in this area. Then we can re-assess.

I’ll disagree here. Nobody(*) really cares about infrastructure. People care about what can be delivered.
LLVM is great infrastructure, but the perceived value is dominated by the fact that one can compile existing C++ code to high quality X86 assembly. Secondarily there is some perceived value in everything else (portability, extensibility, new backends, new frontends, etc.). MLIR (I believe) has a bit of a value perception problem today. It’s great infrastructure, but what does it actually DO? Well, actually not much from an end-to-end perspective. In short, I think there needs to be some high-value usage of MLIR to justify it as good infrastructure and drive development. Today, the usage that is driving most development is TensorFlow, and it is successful because there is so much human and organizational overlap at Google. The jury is still out somewhat to see if other users can replicate that and build things using MLIR that could not be easily built any other way (a key indicator of ‘good infrastructure’ IMO).

Secondly, I think there is a significant disadvantage in every user being completely out of tree. This means that there is less visibility of the critical problems to people working on the infrastructure. It also raises the barrier to entry. Today, taking MLIR, understanding what TensorFlow does with it to build a good implementation, and replicating that on another architecture is a significant barrier to entry. In contrast I think there is a perception that some other ML frameworks are easier to use because they are self contained and it’s easier to retarget them to another target architecture because all of the code is (apparently) right there (**).

I think that MLIR (as infrastructure) would benefit from richer ‘subprojects’ which apply the framework to actually DO various things in a useful way. I think that embracing these frameworks as parallel projects in-tree, similar to the way that Clang exists relative to LLVM, would lower the barrier to entry and make everything more accessible, either under the larger ‘llvm-project’ umbrella, or within the smaller ‘mlir’ umbrella. A few such projects will help MLIR to become great infrastructure and enable more out-of-tree users to adopt and be successful.

(*) Nobody without an infrastructure problem
(**) Whether it is really easier in the short term or the long term is another question.


Is the discussion here really of the form “what should be an MLIR subproject vs an LLVM subproject”?

If someone built a numpy implementation using MLIR and contributed it as a new LLVM subproject, is that different?

I suspect that is actually the question.

fwiw - @_sean_silva and I are taking the approach of building a minimal numpy-based tracer/program-extractor and corresponding dialects/lowerings (i.e. a numpy compiler, albeit focused on being a simple, reference implementation). We’re building that in an LLVM fork as a top-level project and can discuss final placement/contribution of the pieces later.
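For readers wondering what a "tracer/program-extractor" can look like at its smallest, here is a rough sketch. It is not the design being built in the fork; `TracedArray` and the trace record format are hypothetical names made up for this illustration. It relies only on NumPy's `__array_ufunc__` override protocol, which lets a wrapper object intercept ufunc calls and record them while still computing real values.

```python
import numpy as np

class TracedArray:
    """Hypothetical wrapper that records each ufunc call made on it."""

    def __init__(self, value, trace):
        self.value = np.asarray(value)
        self.trace = trace  # shared list of (op name, input shapes, result dtype)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Keep the sketch to plain calls: no reductions, no out=, no where=.
        if method != "__call__" or kwargs:
            return NotImplemented
        raw = [x.value if isinstance(x, TracedArray) else x for x in inputs]
        result = ufunc(*raw)
        self.trace.append(
            (ufunc.__name__, [np.shape(x) for x in raw], str(result.dtype))
        )
        return TracedArray(result, self.trace)

trace = []
a = TracedArray(np.ones((2, 3), dtype=np.float32), trace)
b = TracedArray(np.arange(3, dtype=np.float32), trace)
c = np.multiply(np.add(a, b), 2.0)

for record in trace:
    print(record)
```

Each record is roughly what an importer would need in order to emit one op: a name, operand shapes (including the broadcast of `(3,)` against `(2, 3)`), and a result dtype. Everything past that, such as control flow, mutation, and non-ufunc functions, is where the real work is.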

The main thing I am reaching for is establishing a reference project for a use of MLIR for numerical calculations that a) can be constructed relatively cheaply, b) requires elaboration/use of a significant amount of the frontend/codegen infra, c) can be a simple reference implementation but has potential utility in its own right.

I ultimately think that LLVM needs something that hits these points in-tree for numerical programming specifically, but we can make that determination later when we have something. I really agree with a lot of the points that @stephenneuendorffer makes above, and I think this is a step to close the gap somewhat.