RFC: Graduate CIRCT to monorepo?

While the presence of MLIR could be justified, I agree with @nlopes that these “welcome” activities could damage the LLVM project itself.

Putting more and more projects in the monorepo would put a lot of pressure on LLVM developers (it becomes very hard to land anything nontrivial) and add noise, and they would reconsider making any upstream contributions.

+1

aha well then, that explains it!

Much more reasonable, thanks.

I don’t understand what you mean here. CIRCT is tightly integrated with MLIR, which is part of LLVM, integrates with all the core data structures, and even has paths that generate LLVM IR. Are you distinguishing LLVM from MLIR?

As others say, CIRCT is directly analogous to Flang or Clang. LLVM doesn’t depend on Clang, but clang depends on LLVM.

Ok, well you have a very specific definition of LLVM, which sounds like llvm the top-level directory, not LLVM the project. IMO, we need to “demonolithicize” the llvm TL directory, e.g. split out adt/support and the targets. I don’t agree with you that getting “stuff you don’t care about” out of the repo is a good way to build a scalable community. By your argument, we should split clang out as well because not everyone cares about C/C++.

The Linux kernel community is much larger than the LLVM community and seems to be scaling. Are there lessons or tooling from that community that we’d benefit from learning about or adopting?

-Chris

The question isn’t really whether CIRCT is enough like Flang or Clang to be included in the monorepo. It’s more a question of why we have CIRCT, Flang, and Clang all in the same repo. In my opinion, just being tightly integrated with LLVM or MLIR is not enough of a reason for a project to be added to the monorepo.

“LLVM is a collection of modular and reusable compiler and toolchain technologies.” I don’t know if we have a mission statement, but this seems close, and it’s how I explain LLVM to people who are unfamiliar with it. However, if it’s not possible to integrate with LLVM without being part of the same repository, then I think that means there are some fundamental problems that we need to solve.

I think it’s useful to look at how the kernel does development, but for this project just because we don’t add every project to the monorepo doesn’t mean we are failing to scale. There are ways to have CIRCT be part of the community and not have it in the same repo as everything else. We just need to figure out the best way to do this.

The kernel uses a pretty radically different contribution model, does it not? Afaiu, it is much more about key people owning trees and they are authoritative for merging their parts in and up (up to Linus at the top). That kind of model almost certainly provides a different “slope” to the tendency to sprawl.

I’ve lived a long happy life by not engaging too deeply in monorepo-vs-other arguments, and I definitely don’t want to relitigate this every time someone wants to add something. But I am a bit worried about the trajectory: we are on a course for something quite large and unwieldy without any of the usual cutouts or speedbumps that would maintain separability and optionality.

+1

+1 - I’ve been a fly on the wall in the CIRCT project since inception (largely by way of my role of “making weird things build”). I generally believe that the use of MLIR in CIRCT and the overall development process has been a bit higher quality than parts of MLIR itself and I think having it would be a net positive in terms of more examples of “the right ways to do things.”

Blocking? No, but close IMO. I think we have a fair number of contributors who aren’t the slightest bit familiar with phab and we’d miss out on their contributions if we didn’t accept PRs. (Not to rehash the PR vs phab debate here.) I am concerned that any experience regression will result in a loss of some of our existing contributors.

Yeah, we’ve put quite a bit of effort into the GH actions pipelines. I don’t know anything about buildbot pipelines, but my impression based on the phab buildkite pre commit checks is that the LLVM build pipelines are ad-hoc and messy. Again, I’m not necessarily tied to the GH actions, but something more modern than buildkite is (IMO) pretty necessary.

Add Microsoft to the list of companies using CIRCT! We just started using CIRCT in “production” (meaning devs rely on it to build).

+1000.

Changes to MLIR break the CIRCT builds constantly (pretty much every time we bump the submodule) so CIRCT is tightly integrated with MLIR. I agree that MLIR isn’t tightly integrated with LLVM (it’s a one-way dependence), but how often does a change in the LLVM core infrastructure break MLIR? (I honestly don’t know but that is important data.) I don’t recall a change in the core LLVM code which broke CIRCT and not MLIR.

+1. Though I’m less worried about overall git repo size (esp. the addition of CIRCT, which is minuscule).

I’ll also note another potential wrinkle: CIRCT has a frontend in its repo which is pretty closely tied to two of the dialects (meaning that when I add something to one of those dialects, I generally want to expose it through the frontend). Managing it separately from the rest of CIRCT would be a major pain for me.

Note: I want to be clear that most of this post is about my problems with unbounded expansion of the monorepo in general; I’ll leave Circt-specific things to another post.

I share similar concerns to many others about the growth of the monorepo. I think a line has to be drawn somewhere for adding more top-level projects. Having a large, vastly interconnected monorepo is very difficult to achieve without a large amount of tooling and resources (which LLVM doesn’t really have at this stage, not that this is a bad thing). Speaking from experience of working on MLIR inside of a large monorepo (which had similar requirements of “you must fix everything within the monorepo before you commit”), it was very very very (maybe one more VERY?) painful. There were many times where downstream colleagues actively avoided contributing upstream because of the amount of additional effort it would take. Stagnation and discouraged evolution are bad signs for a project IMO, and maintaining the ability to actively improve the codebase and remove technical debt has to be held in the highest regard. I won’t claim that Circt will be the one to make development untenable, but I do think we have to draw a line somewhere, otherwise we potentially lose new and existing contributors.

I think this applies to every project using LLVM or MLIR though, and I would almost set this aside when considering adding a project to the monorepo. The reason why I would say that is because this is shifting all of the burden once held by a select few developers onto the wider community. The problems of aligning dependencies and tracking updates are important, but it isn’t a scalable solution to move everything into the monorepo. There is a real and significant added cost here that compounds over time.

I share the same strong concerns as everyone else about any sub-project diverging from using phabricator (or whatever else the rest of the project uses). Having one part of the project use different tooling will inevitably result in multiple ways that patches get submitted/reviewed, and inevitably place a burden on everyone. For example, as an MLIR developer, if I had to fix up problems with Circt as part of an MLIR change, I would include them as part of the phabricator review. Alternatively, I would probably want any changes to MLIR to be driven through phabricator like the rest of the codebase. This means that any cross-cutting change would be driven through phab, which creates friction for circt developers and also requires actively ensuring that certain parts of the codebase use the proper tooling.

Definitely agree here (unless I’m misinterpreting). I think we need to have a better understanding and guidelines on what it means to be an “LLVM Project” and what it means to have something “Graduated into the Monorepo”. I think there is a current failure in that success for an LLVM sub-project is characterized as having to move into the monorepo at some point. I don’t think LLVM is set up for this right now, though, and especially if we keep the current expectation that “everything at HEAD is green” (might be wrong, but this has applied to the parts of the codebase I’ve contributed to) it just doesn’t scale. I think better characterizing this would really help smooth over similar discussions in the future. I don’t think it will eliminate them, because these things are always case by case, but hopefully it would help.

(I hope this didn’t come off as negative about circt, I think circt is an amazing project. I just have a lot of scars and trauma related to monorepos in the past, and would absolutely dream of avoiding them in the future)

– River

Now that I’ve unloaded my past monorepo trauma…

I’m also quite interested in this. I think it would be good to characterize the amount of work that would be shifted to the community. For pure code fixes, I suppose looking at the circt submodule bumps would help. There is also the matter of build times, the configurations required to actually test things properly, etc. From an MLIR perspective, flang has been relatively painless for me, but it also likely has significantly less API surface area compared to circt.

+1 here. It would be nice to have a clear separation between what is expected for the community to maintain, and what is experimental/not of sufficient quality/stability/coverage/etc.

– River

Leaving aside monorepo angst, I see this as a very strong point of the proposal. I think we need to audit whether all of CIRCT would come in or be chopped in some ways (some parts seem more “integration-ey” than others), but personally, I think MLIR would benefit from such a user as this in-tree. The maintenance cost/benefit goes both ways: too little in-tree usage and it is hard to understand the impact of changes and breadth of actual usage. Too much and many of us have scars from past experience on that.

Personally, I think that at least some portion of CIRCT, using MLIR canonically as a frontend for an important domain would actually help us get to more of a “just right” place with respect to cost/benefit.

One of our engineers has picked up skills on git history rewriting that may prove valuable before we actually do a merge – just to make sure things are tidy. In another repo, we picked through and were surprised at some of the gunk we had from the early days before the project structure was stabilized. Big moves like this, where you are already breaking the development flow can be a good opportunity to browse through and tidy up. Let us know if you’d like help/scripts/etc.

One benefit of the mono repo is that you can do atomic updates on clang and llvm.

+1.

The dev teams behind new projects (which would like to be added to the monorepo) would greatly reduce their engineering cost of keeping their projects up to date with upstream LLVM/MLIR, but in reality those costs would just be shifted to the core MLIR/LLVM developer teams.

For example, an MLIR developer changes some core MLIR API (and here we can assume just a trivial API change) → now the developer needs to fix all call sites in CIRCT as well (so set up everything for CIRCT, build it, test it). There are efforts to use GitHub pull requests instead of phab to make things “easier” for developers (questionable, but let it be); on the other hand, we introduce a lot of work for core developers and force them to either spend time working on unrelated projects … or not contribute anything at all.

As many people said, the LLVM project needs to draw a line.

Thanks, yeah, I think that would be helpful if and when it comes time to clean up the git history.

A number of people have raised concerns about the growth of the LLVM monorepo. I just wanted to chime in on the other side, to say that I’ve not personally had problems with it (admittedly, only really working with LLVM and Clang). It’s true that mid-air collisions are possible when rebasing+pushing, but given LLVM’s velocity this is going to be an issue even if llvm/ is split out (and there are solutions to consider that don’t involve splitting out into smaller repos, e.g. handing off the final merging to a bot a la bors).

To be clear, I’m not trying to dismiss those who have had issues, just thought I’d say a few words on the other side. I should also say that LLVM is the largest monorepo I’ve done much work with, so others are sure to have broader experience.

Before declaring the monorepo “full”, could there not be a more extensive exploration of possible development models (other than “the kernel model doesn’t work 1:1 here”)?

For example, what if there was a step between an incubator and “any commit that breaks the new subproject breaks all of us”? Say, the subproject could become a top-level folder in llvm-project and yet still use its own development branch (i.e. not main) + CI. This would largely absolve other contributors of both the higher commit traffic and the CI breakages that people are worried about.

It would then put the responsibility of staying up-to-date with main onto the subproject, and similarly for the merge-back-to-main** – a bit like a materialized git submodule. In that way, a subproject could still gain most of the benefits of being in the monorepo, without burdening other contributors.

The last graduation step (if ever?) would then be when the subproject has stabilized to the point that the community is comfortable with changes going into main directly, with all that entails.

There are obviously many variations on this scheme, and the above is but a sketch, yet I feel this topic could be beneficial to explore in more depth.

** this would obviously be helped by having GH PRs (and allowing merges of larger series of commits), but IMO the flang upstreaming effort shows that going through phab for something like this is still feasible.

I think that there is a lot of conflation of three largely independent things in this discussion and it would be worth disentangling them:

  • Should CIRCT be part of the LLVM umbrella? This one is already addressed, it is in the GitHub org.
  • Should CIRCT be integrated with LLVM CI infrastructure such that we get automatic notifications if an LLVM commit breaks CIRCT?
  • Should the code for CIRCT live in the same git repo as LLVM?

The second two are completely separable concerns. There should be nothing stopping any project that consumes LLVM libraries from having CI that automatically updates the LLVM submodule on commits to LLVM and flags commits that have broken things. It’s a project policy question, not a technical question, whether an LLVM commit that breaks CIRCT should be reverted, should be expected to come with a fix for CIRCT, or should just let the CIRCT developers know that they have some more work to do. This is a trade-off between the extra developer load from LLVM developers fixing bugs that don’t affect their use cases and the extra benefit from a broader testing environment and I don’t know enough about CIRCT to have opinions here.
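To make the CI half of this concrete, here is a minimal sketch of how a downstream project might flag the LLVM commit that broke it when bumping its submodule. Everything named here is hypothetical: `first_breaking_commit` and the `builds_ok` callback (standing in for "check out this LLVM commit, build the downstream project, run its tests") are illustrations, not part of any existing LLVM or CIRCT tooling. It also assumes that once a commit breaks the build, later commits stay broken until someone fixes the downstream code, which is what makes a binary search valid.

```python
from typing import Callable, Optional, Sequence


def first_breaking_commit(
    commits: Sequence[str],
    builds_ok: Callable[[str], bool],
) -> Optional[str]:
    """Return the oldest commit whose downstream build fails, or None.

    `commits` is ordered oldest first and covers the range between the
    last known-good submodule revision and the new upstream head.
    `builds_ok` is a hypothetical callback that builds and tests the
    downstream project against the given upstream commit.

    Assumption: breakage is monotonic (every commit after the first
    breaking one also fails), so binary search applies.
    """
    # Invariant: every commit before `lo` passes, every commit at or
    # after `hi` fails (or `hi` is past the end of the range).
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if builds_ok(commits[mid]):
            lo = mid + 1
        else:
            hi = mid
    return commits[lo] if lo < len(commits) else None
```

What happens after the culprit is found (revert, require a fix, or just notify the downstream developers) is exactly the policy question above; the sketch only covers the detection part.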

I was one of the people (roughly 50%, from the survey) in the community who did not want the monorepo in the first place, and I have suffered from the fact that it exists: I have to clone a huge repo to make a change to libc++. So I have some pre-existing biases here.

Most of my objections to the monorepo in the first place also apply to adding a new project, independent of the merits of CIRCT:

  • It adds burden to the people who are contributing to the smaller subprojects.
  • It encourages tight coupling between LLVM components, which is antithetical to our goal of producing reusable modular libraries.
  • It emphasises the two-tier ecosystem where projects that build on LLVM but are not in the monorepo are considered second-class citizens and should expect to not compile with a random trunk revision if they built correctly with the last upstream release. This harms LLVM both by making it less likely that projects will adopt LLVM and by encouraging downstream consumers to test only after releases are branched and so we don’t get good testing from things outside of the monorepo.

It seems that CIRCT is suffering from the second two points and, rather than trying to fix that problem systematically for downstream consumers, the proposed solution is to increase coupling. This does not feel sustainable. The logical conclusion of this strategy is that Mesa, rustc, ponyc, ldc, tensorflow, and so on all end up in the monorepo (the more likely outcome is that they get tired of being second-class citizens of the LLVM ecosystem and move to something else).

I think your breakdown and analysis here is spot on. For me, the discomfort is more about increased coupling and the negative ecosystem effects from that being the norm. I have always been puzzled by llvm’s built in contradiction of being focused on modular, reusable libraries/etc, and what I have observed as a systemic disregard for the projects using it (I would soften your first class/second class characterization, but generally agree with the sentiment). That is just not how software library projects run: they can still move fast and frequently break things if needed, but there needs to be accepted community norms around making choices that limit the impact of that where possible. The difference is in the details of how the project is run over time vs any one thing (it is more about instilling biases vs rules). The project itself needing to cope with the pain of using itself is a way to make sure the bias is present.

With all of that said, in this one case, I still think that at least some of CIRCT’s dialects rise to the level of being in the monorepo and would be a net gain to both the project and the domain: no matter what the development model, limiting diamond dependency skew for interop components (which is what at least some of CIRCT is) is a worthy thing to look at in its own right.

+1 - if we are primarily focused on coupling, this is separable from whether the code lives in one git repo or multiple.

Where I work has a somewhat (in)famous monorepo and supporting automation platform to keep it green, and what I have observed over the last ten years is that people who mainly know that present state forget how such things got to where they are and that there are perfectly reasonable middle points that are suitable for different types of situations, even in the quest to full on at head consistency (which may or may not be a worthy goal). As llvm grows, I would rather see it evolve tools and processes that do not require global, single commit consistency – I just don’t see how that scales given the state of the code, community, supporting tooling, and amount of money/time available for the level of infrastructure needed to make that work for real.

As just one example: we couldn’t always rely on single commit consistency and even in the cases where we had the technical ability to do so, in the early days it was often not worth the cost for a specific project to buy in. We lived with “sync to green”, milestone commits, carveouts of core vs leaf dependencies, and eventual consistency – and the ability to do so was driven by a few pieces of infra that all of the devs used. We built big, successful teams and products for many years in such a state. It works, and you just deal with some of the inconveniences. The reason I’m mentioning this is because there is a lore that has built up about this topic and some of that derives from my employer’s case and activism. It was just one path of many and we got the job done without the fabulously expensive global consistency that often gets conflated with “at head” development.
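For readers who have not worked in that mode, a rough sketch of what “sync to green” with a core-vs-leaf carveout can look like. The function name, the commit labels, and the per-project status records are all made up for illustration; this is not a description of any real infrastructure, just the general idea of picking the newest commit at which the projects you care about pass.

```python
from typing import Mapping, Optional, Sequence, Tuple

# One history entry: (commit id, {project name: did its CI pass?}).
CommitStatus = Tuple[str, Mapping[str, bool]]


def latest_green(
    history: Sequence[CommitStatus],
    required: Sequence[str],
) -> Optional[str]:
    """Return the newest commit at which every required project is green.

    `history` is ordered oldest first. A project with no recorded status
    at a commit is treated as not green. Returns None if no commit
    qualifies.
    """
    for commit, status in reversed(history):
        if all(status.get(project, False) for project in required):
            return commit
    return None


# Hypothetical data: a leaf project lags behind the core projects.
history = [
    ("c1", {"llvm": True, "mlir": True, "circt": True}),
    ("c2", {"llvm": True, "mlir": True, "circt": False}),
    ("c3", {"llvm": True, "mlir": False, "circt": False}),
]

core_sync_point = latest_green(history, ["llvm", "mlir"])
leaf_sync_point = latest_green(history, ["llvm", "mlir", "circt"])
```

The carveout is expressed through `required`: someone working only on core projects can sync to a newer commit than someone who also needs the leaf project to be green.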

The cost of just pushing more into the monorepo (with global consistency) when there is friction is that we never grow those processes and development model refinements. For things that rest in that justification, I think that as painful as it is, the friction is working as intended (and I do manage multiple downstreams and acknowledge that the pain is real and we should look to limit it with tooling and process investments).

I think you make some good points here.
I think the problem is not so much the close coupling as the fact that the API between the components keeps being broken.
I think the Linux kernel development model could apply here, with its rule of “never break user space”. We could apply this to the API between LLVM and all the sub-projects.
I have seen cases where a new LLVM API call is added and the old one is deleted, even though deleting it was completely unnecessary. One could have left the old one in there without any problem.
If LLVM was actually modular as it pretends to be, the wish for everything to go into a huge mono-repo would be unnecessary.
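As a language-agnostic illustration of “leave the old one in there”, here is a small Python sketch of keeping a renamed entry point alive as a thin forwarding wrapper that emits a deprecation warning. The names `create_op` and `make_op` are invented for the example and do not correspond to any real LLVM or MLIR API; LLVM itself is C++, where the equivalent would be a forwarding function marked with a deprecation attribute.

```python
import warnings


def create_op(name: str, *, operands: tuple = ()) -> dict:
    """New entry point (hypothetical): build a simple op record."""
    return {"name": name, "operands": tuple(operands)}


def make_op(name: str) -> dict:
    """Old entry point (hypothetical), kept so existing callers still work.

    It forwards to `create_op` and warns, instead of being deleted in the
    same change that introduced the replacement.
    """
    warnings.warn(
        "make_op is deprecated; use create_op instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return create_op(name)
```

The trade-off is that the wrapper has to be carried and eventually removed, so this only moves the break to a later, announced point rather than eliminating it.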

It’s important to note that this was really a by-design part of LLVM from very early on - having a stable API means you get stuck with bloat that has to be carried for a long time (assuming you follow some version of semver). Heck even the C API wasn’t that stable until Eric and others took a run at stabilising it a number of years ago.

It causes us (Unity) some pain when updating, but I think that pain is worth it so that the core LLVM contributors can keep making the technology better.

My personal opinion: discussions around this in llvm tend to fall into all or nothing thinking, when that is not actually the reality. That is why I phrased it as biases to instill, not rules. As you note (and there is more evidence of in other parts of the project), when care is taken, the churn can be reduced a lot. And you can scale care-taken-by-a-few-individuals into norms, often with development process, tools, some experience, and good expectations/communication around what is under active development and design.

I think most of us want the software to not be debt laden, and I support many patches towards that end, even though painful. I just think there are middle roads and llvm is at a scale where we need to find them.
