User-created branches in the monorepo are often misused

Currently we have 104 branches open in the monorepo:

  1. main
  2. 44 release branches
  3. 8 revert branches
  4. 41 user branches from 8 users who use SPR
  5. 10 user branches from 6 users that apparently don’t use SPR [1]

I find the rate of people who create user branches but don’t use SPR alarming, and would like to bring attention to this fact. I’m also wondering what are those 8 revert branches are, and why are they allowed. Counting branches, we have 18 branches that shouldn’t have been created in the first place, over the course of a month (presumably; I’m going to expand on this below).

It’s not my intention to put any blame on creators of those branches, as git push doesn’t tell anyone how user branches are supposed to be used. But I’d like to call decision to enable branches in monorepo premature, which apparently was documented as a post buried in a long thread about GitHub PR. It wasn’t an RFC, and the way it’s phrased (“We have user-branches opened in the repo”) doesn’t give a clear message when this actually happened.

I recognize that some projects in the monorepo need a solution for stacked PRs, and that there’s no good-all-around solution, but I see damage being done. One of the forms this damage manifests itself in is various git graph visualizers, that are becoming harder to comprehend as number of branches goes up. I’ve worked in monorepos with thousands of branches, and definitely don’t want LLVM to end up there.

If we really need users to be able to create branches, one of the ways forward might be to have an issue opened, where users can do that via a bot. This both gives us opportunity to tell people what branches in the monorepo are for, and gives a good place to ask about stale branches.

I propose to phase out ability to directly push branches in a timely fashion (think weeks, not months), as something that was done without a proper RFC, and is actively harmful. Then we can work on a solution that doesn’t make the whole community pay for something that is used by a rather small subset of contributors.

[1] I don’t know much about SPR, so I’ve been looking for spr in branch names and commit messages. This might not be the most way to count, so corrections are welcome.

CC @mehdi_amini @tstellar @akorobeynikov

3 Likes

I’ve been using user-branches to manage my stack manually, without relying on SPR or a tool. I haven’t looked at what people are doing here.

Also we documented some guidelines: LLVM GitHub User Guide — LLVM 18.0.0git documentation

It is possible to create branches that starts with users//, however this is intended to be able to support “stacked” pull-request. Do not create any branches in the llvm/llvm-project repository otherwise, please use a fork (see below). User branches that aren’t associated with a pull-request will be deleted.

Have you tried the setup mentioned in “getting started”? Getting Started with the LLVM System — LLVM 18.0.0git documentation

You are likely only interested in the main branch moving forward, if you don’t want git fetch (or git pull) to download user branches, use:
sed 's#fetch = +refs/heads/\*:refs/remotes/origin/\*#fetch = +refs/heads/main:refs/remotes/origin/main# -i llvm-project/.git/config

It is supposed to make it so that git fetch / git pull will not download any of the user branches, and they shouldn’t pollute you viewers hopefully.
Most viewers can also be configured to filter out branches based on naming scheme, I don’t know which tool you’re using right now?

Without a usable solution for stacked PRs, that does not seem like a good path forward to me. There are other mitigations like the one that I mentioned above that should be able to address the small annoyance that comes with it.

4 Likes

As you can see for yourself if you disable fetch filter, not everyone is using branches in the way intended.

Unfortunately, I’m well aware of existence of fetch filtering. People have enough struggles with git (one is people trashing their PRs rebasing the wrong way, which is a daily occasion now, it seems), so that I’m not comfortable moving in the direction that makes this advanced feature more necessary to have good experience.

If stacked PR needs are so compelling and universally shared to enable direct pushing of branches to the monorepo, you should have easy time gathering consensus on that, overcoming previous resistance.

Feel free to propose an alternative: I tried to narrow it down to “run this single one line command and you’re all set”. That is a “one time setup”. Also it’s not clear to me how much right now that this a “bad experience”: the filtering isn’t “more advanced” to me than the use of a “graph visualizer” for the history (especially considering that we have a linear history), but I may be missing something. Maybe you can elaborate on this?

Otherwise, please find a more constructive approach: right now I feel you’re overblowing the effect of a recent change that hasn’t stabilized yet, and jumping to an extreme without trying to find solutions. We can do better.

3 Likes

Sorry, but I’m not going to engage in fetch filter discussions further, even though I have points to share, because that would derail the thread. We can come back to this later, though.

We should agree to disagree here, but If you were the one to make this change, can you elaborate on decision-making process? I’ve been following discussions around stacked PR, albeit not too close, and don’t remember seeing issues I present here considered and given proper response. I remember many members of the community acknowledging that stacked PR is a problem worth solving, but I don’t remember consensus on “if you have to enable user branches in monorepo, you’re free to do that”. What I remember is this topic being contentious.

Every time one is pushing “Revert” button the revert branch is created and corresponding “revert” pull request is created as well. When revert is landed then the branch is automatically removed.

Stale revert branches means that someone pushed the “Revert” button in the UI, but closed / deleted / not created the corresponding revert PR.

If we’d prohibit revert branches to be created we’d effectively turn off “Revert” button in the UI which might be a usability issue.

TIL, thank you!

I’m fine with revert branches, knowing that they are created as a part of a GitHub workflow useful for everyone, and have a PR attached, which means we have a good place to talk about the branch if there are problems.

Actually, I was able to create stale branch here. If I’d click “Revert”, but do not create corresponding PR, then the branch will be there and not attached to any PR. We’d probably need to clean them from time to time.

3 Likes

Are you proposing this as a one-time step per user, or as a process to do every single time you want to create a branch? The latter would be totally unworkable for stacked pull requests.

We’ve discussed stacked PRs extensively at this point, and the conclusion was that the only way to get a Phabricator-comparable experience (which many people were after) was by allowing branches on the main LLVM repository. I think allowing them under a specific namespace is a reasonable compromise to enable stacked PRs.

3 Likes

We can do either that is considered better. It wasn’t much more than an idea how can we move forward.

It’s not like I have a strong disagreement with any point you mentioned, but I think you are highlighting that the decision to allow branches, which affects everyone, was made in an opaque way by a small group of people actively participating in long threads about stacked PRs, which I remember being scattered across different platforms. The fact that fetch filters were added to our “Getting Started” guide makes wide impact of this decision even more apparent.

That’s why I’d like to see this decision going through proper RFC process, where the community would have a chance to participate and (potentially) agree that this is indeed a reasonable compromise.

I have in my todo list to write a GitHub action that lookup branches and figure out if we have some without PR associated that should be deleted. I’m a bit overloaded, if someone wants to give it a try it would be a useful thing to have!

@nikic reported the misuse problem on Discord 3 days ago

Is it possible to make the rule for allowed branches more specific, so it only covers something like users/*/spr* ? Unfortunately, a number of people are just pushing all their pull requests to the main repo now, even though they don’t use stacked pull"

I agree that we should do something to prevent misuses.

3 Likes

Getting a bit off topic here, but do you know how this command should look like for people who need both main and the release branches?

Some people responded to this by saying they use users/* branches for stacked pull requests, but without using spr, though they seemed open to just including the spr bit in their branch names, even if they don’t use the tool.

An alternative to using something like users/*/spr* would be to rename the namespace to stacked-prs/* or similar. That would avoid endorsing any particular tool, while also making it abundantly clear that this namespace is reserved for stacked pull requests only, and not arbitrary user branches.

6 Likes

We could create a brief announcement post emphasizing that in-repo branches are for stacked PRs only, and pin it to the front page of discourse.

IIRC, there is a way to add default text to the PR description in GH: we could add a comment reminding users about the use of stacked PRs.

One scenario may be that someone pushes a branch to the main repo while they’re still working on the next PR in the soon-to-be a stack. This allows reviewers to look at the first step before the next one is ready for review.

1 Like

It’d probably be better to document how to do this with negative refspecs rather than updating the default fetch spec, ie:

git config --add remote.origin.fetch '^refs/heads/users/*'
git config --add remote.origin.fetch '^refs/heads/revert-*'

Then your git config will look like so:

[remote "origin"]
	url = git@github.com:llvm/llvm-project.git
	fetch = +refs/heads/*:refs/remotes/origin/*
	fetch = ^refs/heads/users/*
	fetch = ^refs/heads/revert-*
6 Likes

I thought that was already possible via rulesets.

@kparzysz

One scenario may be that someone pushes a branch to the main repo while they’re still working on the next PR in the soon-to-be a stack. This allows reviewers to look at the first step before the next one is ready for review.

Why it has to be a branch in the main repo though?

You can’t have a PR in one repo that is based on a branch from another repo.

1 Like

That’s how PRs work? They are based on branch from the fork (another repo). At worst, you’ll have to create extra branches in your fork, but who cares - it’s your fork, do what you want with it.

The next PR in the stack would have to be in the fork, not the main repo. We want the reviews to stay in the main repo, so that won’t work.