Coordinating LLVM commits across different projects

There are many projects built on top of MLIR/LLVM. However, mismatches in the LLVM commit/version they pin can cause problems when we want to combine some of them together.

A typical scenario is as follows:
We (the ByteDance AML team) are developing an end-to-end compiler that supports TensorFlow, PyTorch, and ONNX models, into which we integrate torch-mlir and onnx-mlir. However, torch-mlir and onnx-mlir each update their LLVM pin on their own weekly or monthly cadence, so it is almost impossible for them to land on the same LLVM commit without coordination, which creates difficulties for subsequent development work.

When discussing this, Alexandre mentioned that there could be a detailed protocol on how to advance each distinct project to a common LLVM commit. My preliminary thought is that experts could publish relatively stable commit ids weekly, named w0, w1, w2…
For torch-mlir, say it updates LLVM bi-weekly; then the LLVM commit ids it supports would be w0, w2, w4…
For onnx-mlir, say it updates LLVM monthly; then its commit ids would be w0, w4, w8…
And common users of both torch-mlir and onnx-mlir could use w0, w4, w8… (a toy sketch of this cadence is below).
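A minimal illustration of the proposed cadence, with all commit ids and cadences invented for the example:

```python
# A coordinator publishes one "stable" LLVM commit id per week
# (w0, w1, w2, ...); each project pins every Nth one according to
# its own update cadence. Purely illustrative, no real commits.
WEEKLY_STABLE = ["w0", "w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8"]

def supported_commits(cadence_weeks: int) -> list[str]:
    """Commit ids a project pins if it updates every `cadence_weeks` weeks."""
    return WEEKLY_STABLE[::cadence_weeks]

torch_mlir = supported_commits(2)  # ['w0', 'w2', 'w4', 'w6', 'w8']
onnx_mlir = supported_commits(4)   # ['w0', 'w4', 'w8']

# A common user of both builds against the intersection:
common = [c for c in torch_mlir if c in onnx_mlir]  # ['w0', 'w4', 'w8']
```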

The related discussion is here.
FYI @_sean_silva, I think you might be interested in this.


I really think the right way to do this is with stable interchange formats and/or APIs, not by requiring all projects to follow some particular protocol. As an instance of where this would cause issues: the integration point that TensorFlow uses is based on the version that another team pulls into Google’s monorepo. That is based on the intersection of the needs of all users of LLVM in that monorepo, mostly focused on the production C++ compiler. That’s already a huge headache (and IMO, not really the right way to do it), but attempting to add another layer of “you also have to pick one of these commits” on top is probably not tractable. And this is something that only gets worse as projects scale. Coordination is great and should be encouraged (aggregating signals on “this LLVM commit is bad, don’t use it” seems helpful in general), but I think we want an approach that is modular instead of relying entirely on that coordination.


I also prefer the stable IR formats approach. But the problem is that LLVM’s IR formats are currently volatile. To handle this, would it be possible to maintain a script that upgrades IR to the newest format for all dialects in llvm-project?

Then, as long as our own compiler’s LLVM commit is newer than both torch-mlir’s and onnx-mlir’s, it would always be possible to upgrade the IR produced by torch-mlir and onnx-mlir to match our own LLVM commit.
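To make that concrete, here is a minimal sketch assuming (hypothetically) that every breaking IR change shipped with a small upgrade script; no such scripts exist in llvm-project today, and all names below are invented:

```python
# Hypothetical: replay every upgrade script at or after the commit
# that produced the IR, oldest first, so older IR parses at our
# newer LLVM commit.
import subprocess

UPGRADES = [
    ("w0", "upgrade_w0_to_w2.py"),  # e.g. an op rename landed in w2
    ("w2", "upgrade_w2_to_w4.py"),  # e.g. a new required attribute in w4
]

def upgrade_ir(ir_path: str, produced_at: str) -> None:
    """Upgrade IR produced at `produced_at` to the newest format."""
    seen = False
    for commit, script in UPGRADES:
        seen = seen or commit == produced_at
        if seen:
            subprocess.run(["python3", script, ir_path], check=True)
```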

You might be interested in [RFC] A binary serialization format for MLIR 🙂

+1, this is definitely a problem. As @gcmn points out, we have a longer-term solution that we are building towards, but we definitely need a shorter-term solution here. I don’t know what that looks like, but if it crosses multiple ecosystem projects we might want to have an ODM about it; this definitely feels like it is still at the “brainstorming” level of development.

CC @jpienaar @stellaraccident @sstamenova

Timely. Everyone is hitting this same need. I expect that @jpienaar has some important input, and I also expect that this may not be as far into the long-term bucket as people are assuming.

Also agreed that there may need to be some short-term mitigations and protocols that would help. Definitely open to discussing, even if just to make sure we are seeing the same things and solution space.

Jacques and I did some whiteboarding on what it would take to get “shallow dialects” that we could vendor into our repositories and upgrade with well-defined backward-compatibility windows. We sized it up, and it doesn’t seem like that much work for TOSA (it depends on the new binary format and probably some kind of upgrade-hooks mechanism). I have some brainstorms on how to apply this line of thinking to Torch-MLIR’s linalg path too, and I suppose that MHLO could possibly adopt something similar. I’m not familiar with ONNX-MLIR, but I suspect they could make it work too.

CC @sjarus @burmako

Glad to see the “shallow dialects” idea getting some traction. Over a year ago, I started experiencing this problem and wrote a ramble on what I was seeing and possible approaches. I concluded that the ecosystem wasn’t quite ready and not a lot of people were identifying the concrete need, so I dropped it. The difference between then and now is that we have a lot more mature pieces and full-time teams/projects hitting this point, and that makes it a lot easier to talk about. This is exciting – I think that getting this part right is pretty critical for these technologies to move to the next stage of evolution.

I’m curious about what “shallow dialects” are. Could you explain it a bit here?

I’ve been thinking of this mainly as a CI problem. If we could build every version of the downstream projects against every version of LLVM, it would be relatively easy to identify appropriate versions of LLVM to focus on, and to facilitate automatic upgrading when possible, or highlight conflicting commits when not. The main problem is that this would put a significant CI burden on downstream projects. To mitigate this, I’ve been wanting to propose that the LLVM buildbots tag/mark a nightly build, so that downstream projects could limit their CI burden to once a day. This would be unlikely to be onerous, and it would greatly facilitate people trying to live close to head. Would love to hear if people have other good ideas here, as we suffer from this quite a bit too.


Indeed, this is something very much getting active focus right now. It wasn’t quite planned for now, but we are reordering some work as the pain is getting too high and the trajectory wrong. This is currently at the one-pager (i.e., four-page doc) and whiteboarding stage, with some code based on the initial ideas already out (and, as Sean said, it relies on the other community efforts, so some of the code needed here is already being written or is under review in the community).

In short, a shallow dialect is (with a lot of hand-waving and refinement to come) a slice of a dialect sufficient for a specific goal (in this case serialization) that has fewer dependencies and is easier to provide guarantees on (e.g., versioning without creating an atomic clock across projects).

100% in agreement (and the two of us weren’t even in any of the same meetings 🙂). The API work is indeed desirable, but I’d focus on the format part first, as I think that could be a big step.

As Stella says, not that far into the long-term bucket 🙂

I started typing out how an even shorter-term solution could work, and stopped, as it ends up being no better than hope with additional steps, or just randomly trying commits in the neighborhood and still doing updates. Even though TF updates twice per weekday on average, it also allows carrying patches (practical), and only the relevant parts are tested. So there could be a bug that means onnx-mlir couldn’t use that revision (say), and then with the one just after it, torch-mlir can’t compile …

I’m one of the ones who likes green nightly tags for things like this. We get a lot of mileage out of people being able to sync to a nightly version and report issues on that. I doubt everyone would follow suit, but especially if there were a script to get the latest nightly LLVM hash (see the sketch below), I bet people who aren’t already using some kind of “build at every commit” strategy would just use it.
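Such a script could be tiny. A sketch, assuming (hypothetically) the community published tags named nightly-YYYYMMDD in llvm-project; no such tags exist upstream today:

```python
import subprocess

def latest_nightly(repo: str = "https://github.com/llvm/llvm-project.git") -> str:
    # List remote tags matching the assumed naming scheme; for
    # nightly-YYYYMMDD, lexicographic order equals date order.
    out = subprocess.run(
        ["git", "ls-remote", "--tags", repo, "nightly-*"],
        check=True, capture_output=True, text=True,
    ).stdout
    tags = [
        line.split("refs/tags/", 1)[1]
        for line in out.splitlines()
        if "refs/tags/" in line and not line.endswith("^{}")
    ]
    return max(tags)

print(latest_nightly())  # e.g. "nightly-20220715" (invented tag name)
```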

As you say… it doesn’t solve everything, but it reduces the “dimensionality” of the problem a lot.

I really like the idea of having a green nightly tag for upstream. I’ve mentioned this on other threads, but we pick a ‘green’ commit every night to do nightly builds for Fedora, and it would be great if the community could standardize on a single commit every night. It would save a lot of duplicated effort.


What kind of testing happens in the Fedora nightly? One of the things that keeps us rooting around in Google repos to pick a green commit is the level of testing that we know goes into those commits. There are never any guarantees, but it is always nice if the thing you are syncing to has some real-world mileage on it. Wondering if Fedora’s nightly has any fringe benefits on that front?


Right now, we are just building and running make check. Our build system uses low-power machines, so there is a limit to how much we can do in a day. We really just want to produce nightly binaries that users can easily install, and starting from a green commit, even if it has only been build-tested, helps cut down on our build failures a lot.

A while ago I wrote a script that takes a range of commits and a set of buildbots, finds sub-ranges where most of the bots are green, and sorts them by the number of green buildbots in each sub-range. I used this to find the best commit to merge into our project.

It sounds simple, but given that each bot’s build time is almost random, it gets hideously complex. After a number of fixes every time I had to rebase, I gave up and just took a leap of faith based on visual checks. (A toy version of the idea is sketched below.)
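For concreteness, here is a toy version of the sub-range search over an idealized dataset; the real problem is much messier, since each bot builds batches of commits at its own pace:

```python
from itertools import groupby

# results: bot name -> per-commit green/red over a linear commit range.
def green_ranges(results: dict[str, list[bool]], min_green: int) -> list[tuple[int, int]]:
    n = len(next(iter(results.values())))
    # A commit index is "ok" if at least min_green bots are green on it.
    ok = [sum(bot[i] for bot in results.values()) >= min_green for i in range(n)]
    ranges, start = [], 0
    for good, run in groupby(ok):
        length = len(list(run))
        if good:
            ranges.append((start, start + length - 1))  # inclusive indices
        start += length
    return sorted(ranges, key=lambda r: r[1] - r[0], reverse=True)

bots = {
    "clang-x86": [True, True, False, True, True],
    "mlir-bot":  [True, True, True,  True, False],
}
print(green_ranges(bots, min_green=2))  # [(0, 1), (3, 3)]
```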

If the bots had flags (mlir, clang, specific targets), we could more easily say things like: this target’s most stable range yesterday was A…B. With that, anyone merging who cares about target X and MLIR would just find the intersections, going back day by day, until one is found.
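The intersection step itself would be trivial. A sketch, assuming each flag’s most stable range is reported as inclusive commit indices on a linear history:

```python
def intersect(ranges: list[tuple[int, int]]) -> tuple[int, int] | None:
    # Overlap of inclusive ranges, or None if they don't overlap.
    lo = max(r[0] for r in ranges)
    hi = min(r[1] for r in ranges)
    return (lo, hi) if lo <= hi else None

# e.g. MLIR was stable over commits 100..180, target X over 150..200:
print(intersect([(100, 180), (150, 200)]))  # (150, 180)
```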

My idea was to run that script every day and just mark the days when we had a good intersection, so that later we could pick based on what new feature we wanted.

If this is to be used across teams that don’t normally interact with each other, then I suggest we do something similar and publish a list of stable sets per LLVM subproject, plus one overall set for the projects that strongly interact with others.

Of course, this is getting close to releases, but with fewer promises and much less work by a lot of people. I still want to believe it’s doable, but I have given up trying to do it on my own. Hopefully other people have more luck than I did.

I’ve been slow to keep up with the recent conversation on the binary serialization format, but I wanted to mention that we have this problem for TOSA too: there are live lowering paths from TF & TFLite, Torch-MLIR, and ONNX-MLIR to TOSA right now.

We can sort of get away with it for now, despite all three of these projects sharing the common builder code described in this RFC: [RFC] Moving TOSA construction lib into MLIR core. I’m in the process of implementing this, but it moves all the dependencies into a single point of coordination within llvm-project and thus runs straight into this particular problem; this was one of the reasons I held off on it.

Another factor that would impact TOSA (and ONNX, Torch, and TF) in MLIR is the need for production support for multiple major versions and for backward compatibility between them. For example, TOSA 1.0 will be live soon. There’ll be a 2.0 at some point in the future, and it’s expected that both will be active formats with some defined PLC on the older version(s).

Note: it explicitly excludes stability/versioning right now.
This is a problem that is not automatically solved by a binary format (renaming an op in a dialect, or adding a new attribute or a new type to an op, won’t be automatically solved like this).
(Similarly, I don’t see how “shallow dialects” would solve anything on their own either.)
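As a toy illustration of the rename case (dialect and op names invented): the serializer has no way to know that two names denote the same op, so the mapping has to live in explicit upgrade logic somewhere:

```python
import re

# A rename from "mydialect.old_op" to "mydialect.new_op" can only be
# handled by a rewrite that knows about it; no serialization format
# can infer the mapping on its own.
def upgrade(ir: str) -> str:
    return re.sub(r"\bmydialect\.old_op\b", "mydialect.new_op", ir)

print(upgrade('%0 = "mydialect.old_op"(%arg0) : (i32) -> i32'))
# %0 = "mydialect.new_op"(%arg0) : (i32) -> i32
```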

See also this previous approach for versioning purposes: [RFC] IR Versioning


I don’t think there is an expectation that anything is automatically solved, but rather that we can see a path where, for the things we care about, we can build further guarantees. I haven’t grokked the whole “shallow dialects” thing, but I suspect it is aiming to provide some “interface decoupling” for the things we care about elevating more towards an API.

I guess I don’t even see the path then. But I’ll wait to see any proposal in this direction!