RFC: Graduate CIRCT to monorepo?

I was not talking about a stable API. I was talking about not breaking stuff. There is a difference.
For example, recently, someone changed a method call from having 3 parameters to having 4 parameters. So, I had to manually go though all my code adding a NULL for that extra parameter as the NULL was a perfectly reasonable default. They could have left the method with 3 parameters in the API, and just make it internally call the 4 parameter version with the extra parameter set to NULL, but no, they decided to delete the 3 parameter version. Its unnecessary breaking the API, without it really causing bloat.
They seem to manage the API correctly in other places. For example, the migration from pointer types to opaque pointers. Both are available until the old one is depreciated.
Similarly, correctly done for the old pass manager vs the new pass manager. I.e. both the old method and the new method work for a sensible period of time.
This approach permits less close linkage between components and is far more scalable.
People are free to improve their own components without impacting other components.

1 Like

I see this problem a lot in the release branch, where we are not allowed to change the API. There are many API change patches that can’t be backported (or need to be rewritten) due to what I consider to be unnecessary API changes (e.g. adding an extra parameter to a function and not preserving the old signature.) If we had better policy around this as a project it would make it easier to fix more bugs on the release branch.

Then document LLVM’s policy for supporting downstream projects.

It is interesting that you mention these as “good examples” while I consider both of them as catastrophic counter examples of what we should strive for: they are massive projects that dragged painfully for many years and still aren’t really completed today.

The counter point is that “bloat” and “technical debt” are just making the project more complex and harder to evolve/maintain for not enough added value to the project.

3 Likes

This is a fascinating and important discussion, but I would like to pop back up to the specific topic of CIRCT and one thought I’ve been mulling over.

From these comments’ sentiment, it sounds to me like there is tentative support from the MLIR community to at least have part of CIRCT upstream. This is maybe not the outcome CIRCT had in mind for this RFC, or as an incubator project in general, but what about this…

Rather than upstreaming “the CIRCT project” into the LLVM monorepo as a top-level sub-project (like Flang), maybe the “production quality” dialects and transformations could graduate to upstream MLIR. This would definitely take into account (and potentially help inform) any restructuring of MLIR itself ([RFC] Restructuring of the MLIR repo).

MLIR already has a reasonable process for adding new upstream dialects. Maybe the solution here is to go through that process for the parts of CIRCT we want upstream.

It seems like that would still achieve the goals from the original post and focus additional MLIR maintenance burden on the core CIRCT components. From LLVM’s perspective, this should add similar maintenance burden as other new MLIR dialects.

I am just speaking for myself here, so I’m curious what the other CIRCT community members think.

One counterpoint is it would be more annoying to contribute to both “production” CIRCT components and “experimental” CIRCT components, but again, I don’t think that is the end of the world. I’ve had to do this style of development while evolving the MLIR Python bindings at the same time as I used them in CIRCT, and it is just another process to learn IMHO.

Can I 1,000,000x +1 this? I don’t think opaque pointers has been nearly as rough as the pass manager, but people seem to forget that the “new” pass manager migration is not done, and while in the incomplete state it is in people have been prevented from making reasonable improvements to the “legacy” pass manager. The original pass manager was added in 2002, marked “legacy” in 2013, and is still not replaced in 2022.

There is a really difficult balance between breaking things for yucks and being able to correct past mistakes. I’m certainly guilty of leaving some past mistakes around the LLVM codebase, and I would hate to think that we’re stuck with the code “Yesterday @beanz” wrote forever because that guy was not smart.

More to the topic at hand: I think adding CIRCT will impose minimal burden to the community, and I see no reason it shouldn’t be included in the monorepo if it conforms to LLVM developer policy and methods (read: Phabricator and lab.llvm.org, not PRs).

Enabling PRs on the LLVM repository even if they are only intended to be used for one part of the codebase will make things confusing for new contributors and harder for developers who work across project boundaries on CIRCT. I’m 100% in favor of moving to PRs, but CIRCT can’t push the community to make that change.

2 Likes

Should we start another thread to discuss possible changes to how we evolve the APIs in the project? There have been a lot of good points made in this thread, and I would be interested in discussing more, but I don’t want to get too much off-topic in this thread.

This would certainly break the perception that MLIR is tied to machine learning quite dramatically!

1 Like

This does sound appealing driver to push this restructuring. And indeed CIRCT has some good components and novel usage that has/will push core representation. This may also actually trigger some additional cleanup/unification/refactoring work core side :slight_smile: I’m also not opposed to top level project alongside clang & flang, just that there are (selfishly) some added benefit for me in this option.

It does seem like a much larger discussion as to contribution model and future scaling (component/sub-sub project optionality and expectations, e.g., rollbacks for breakages in dependent projects, graduations out of monorepo etc) that is very interesting for many, important and not CIRCT specific.

I’ve been a monorepo user for some time experiencing some eventual consistency and sync to green revision projects along the way (and I’m a fan of those, allowed quite some flexibility, but let me not distract). And we have two projects in monorepo that frequently have failures. But only one of those I regularly want to just delete: and that is as they are not following practices set and (most importantly) are not responsive. Code quality, experimental vs production, etc all important but folks being active community members, consistent and responsive makes the largest difference for me. There I think CIRCT folks have shown to be willing to be actual community members rather than just users. And I agree with @beanz here wrt community impact. So +1 from me in general.

Just to be clear, I think you are referring to experience in monorepos in general that use llvm, not necessarily just the llvm monorepo? (Just trying to scope the strong language – if it were my project causing that characterization, I would urgently want to correct the situation!)

My read of this discussion:

  • There is a larger discussion to be had about evolution of the llvm monorepo. +1 to tstellar wanting to break that out.
  • There is not strong opposition to some configuration of CIRCT entering the monorepo.
  • There is a general feeling from people who have inspected that at least some of what is in CIRCT, and the corresponding engagement of its community may be a net positive if in the monorepo.
  • There may be a few concrete options forward:
    1. CIRCT moves in roughly as is as a top level project.
    2. CIRCT dialects move in to MLIR proper and we defer introducing a top level to a separate time/conversation.
    3. CIRCT dialects go into MLIR proper and we introduce a top level CIRCT project for the entire assembly (which has been likened to the same kind of thing as clang in terms of layering).

Not trying to put words into anyone’s mouth. Just trying to summarize my read so that the CIRCT community isn’t on their own to self advocate. Feel free to object to any of my points.

Yes I agree. I had a conversation with a few CIRCT folks yesterday and recommend that the next step is to split the CIRCT repo so we can discuss something concrete. I agree it should come in as a top level module in the monorepo. We should make MLIR more modular, and moving CIRCT into MLIR seems like a step in the wrong direction.

1 Like

+1 - a concrete view that has had some prefactoring done to it (and possibly an outsider friendly doc about the organization and components) would be really useful

I’m generally of that opinion as well.

There is a larger discussion to be had about evolution of the llvm monorepo. +1 to tstellar wanting to break that out.
There is not strong opposition to some configuration of CIRCT entering the monorepo.
[…]

Thanks for the great summary. About the opposition point, my reading is that at least tstellar, davidbolvansky, nlopes, davidchisnall, and I have (strong) concerns whether CIRCT should enter the monorepo. Or perhaps more accurately, (strong) concerns about whether the “graduation” will become a “case law” for other projects.
River707 has concerns about the current unbounded expansion situation.

It’s worth noting that objection in a community takes a lot courage. There is a common mindset: “I don’t block your adding X, so that when I add Y you won’t object, either”.
And at the end of the say, these people will need to do their own work. If the project moves toward a direction not they desire, they will usually give up anyway: “if the impact is small enough, I can live with it and I’ll just give up/shut up.” They have energy to bring the issue up, but not the passion to keep arguing it.

For the repository size, the repo isn’t too large right now (du -sh circt/.git => 18M), so eventually some folks with concerns may live with it (if a subset enters the monorepo).
But a line needs to drawn somewhere, otherwise a random MLIR downstream user may use the CIRCT example to ask for entering the monorepo.
Many a little makes a mickle.
From davidchisnall:

The logical conclusion of this strategy is that Mesa, rustc, ponyc, ldc, tensorflow, and so on all end up in the monorepo (the more likely outcome is that they get tired of being second-class citizens of the LLVM ecosystem and move to something else).


Speaking from experience of working on MLIR inside of a large monorepo (which had similar requirements of “you must fix everything within the monorepo before you commit”), it was very very very (maybe one more VERY?) painful. There were many times where downstream colleagues actively avoided contributing upstream because of the amount of additional effort it would take.

I work with such a huge monorepo and have done many llvm-project integrations internally and cannot help agreeing with the "very very very painful) statement.

but I do think we have to draw a line somewhere though, otherwise we potentially lose new and existing contributors.

Strongly agree.


Should we start another thread to discuss possible changes to how we evolve the APIs in the project? There have been a lot of good points made in this thread, and I would be interested in discussing more, but I don’t want to get too much off-topic in this thread.

It’s good to have a discussion, but I think we at best end up with some guideline or a check list for API changes.
Some (perhaps many in the MLIR land) are unavoidable due to bloat and tech debt concerns.
The new pass manager and opaque pointers migration is of high impact and takes significant efforts from contributors (widespread breakage if we don’t make it more smooth).
It’s simply infeasible to expand the efforts to every API change.

2 Likes

What in particular are you thinking about splitting out? For parts of CIRCT that aren’t up to snuff, I think it’s a good motivator to get things into better shape. In general, I’d rather see things not get split, but rather have more experimental aspects guarded with something like CIRCT_EXPERIMENTAL in cmake.

Thanks maskray for speaking up. My intent with summarizing was not to silence. It seemed to me that the “monorepo trajectory” discussion was not as live/blocking as you feel it is (with the final suggestion to be to separate that out).

Just to be clear, we are talking about Google’s monorepo, are we not? I am of the strong opinion that we (I work at Google and have direct experience here) let that get far-far out of hand and need to be careful about generalizing experiences from such “super sized” unplanned expansions onto the general community of LLVM – it isn’t fair to projects that have taken pains to do it right to basically be arguing against an opaque user who got the balance wrong. The scale and norms applied are radically different, and there is a large difference between the resulting experiences. But I do agree with you: it is a really problematic cost-benefit to develop llvm based software in Google’s monorepo. That’s why we removed ourself and used alternative means of integration. There are strong cautionary tales here, and it may be useful for us at some point to really discuss the approach used at Google and the resulting pros/cons: there are things to learn from our experiences, but anecdote without details that let others make a reasoned evaluation isn’t a great way to have the discussion.

Also, I personally am softly anti-monorepo as a general rule for componentized software, but I’m trying to check that bias in order to evaluate this specific case against a norm that we have established. I think we can reevaluate the norm, but we need to be careful about moving the goalposts during such a discussion (CIRCT has been operating with that norm in view, stated and agreed to for a couple of years).

I would definitely like to continue the monorepo trajectory discussion. But I also want to be fair to folks who complied with our developer policy and should be evaluated relative to it.

It would be helpful to expand on the “painful” comments a bit more.
Is there anything that can be done to make it less painful?
Maybe adjusting the architecture a little to limit the impact of changes in general?
Maybe making the interfaces between components more tolerant of changes either side of the interface.

1 Like

The pain came from placing the responsibility of updating user code on MLIR developers. Now, we can develop a process to have users update their own code, but when you’re stuck together in a monorepo, that’s difficult to do without slowing down developer velocity; it’s not possible for the user code to use an older version while the migration to a new API is performed.

Sometimes APIs will need to change and they will need to change in dramatic ways. We don’t want to end up in a situation where migrations are left incomplete for years and the project accumulates technical debt because developers don’t want to break users.

I’m not sure I agree with this characterization: that is the proximate cause of the pain and is always a factor in any piece of software. Within bounds, it is also a positive force. I think that the case we are evaluating here would actually be one of the “positive force” examples.

The root causes, from my perspective (which Jacques also referred to), stem from the scale of the monorepo, the diversity of projects that all fell under that mandate, the relative (poor) quality of a few outliers, and lack of good tools for managing sensible subdivisions (which we used to have, even in that monorepo) – to say nothing of a hard fork on build system. In other words: it is mostly situational and hard to generalize to another thing not just like that thing. I think much of this style of feedback is coming from Google engineers (or ex-Google referring back to it) who have suffered under Google’s monorepo for llvm development (please correct me if I’m wrong). I think there are lessons to be learned here that may be valuable to the community, but I refuse to believe that they are as broad brushed as we are making them out to be. A deep dive into Google’s experiences would be a fascinating piece of work if we were to be public about it. There is probably a lot to learn both ways if we were to do that. In the meantime, can we please be clear when we are generalizing Google monorepo experience to the organization of the LLVM codebase?

Have we had these negative forces in LLVM, and would adding CIRCT be a positive or negative force in that axis?

Again, I feel awkward in this point: I’m usually anti monorepo on principle but I feel like we owe a fair read to this proposal on the merits and the existing policies.

1 Like

No, I agree. And I mentioned (hinted) in my response that as long as CIRCT does a good job of sticking to MLIR good practices (which it does, and more so than some dialects in MLIR right now), I don’t see it being a major burden. Flang, for example, barely registers compared to some other in-tree dialects.

I was more trying to explain the nature of the pain to @jcdutton. Sorry, it was a bit tangential to the conversation here.

In general, I’m not sure what’s being done with respect to splitting CIRCT and the restructuring of the MLIR repo and how those things will tie together, but I would actually be happy to see some production-quality CIRCT dialects living with the other MLIR dialects.

1 Like