User-created branches in the monorepo are often misused

When you create a PR from a fork, GitHub will automatically fetch your branch into the main repo. For example, for PR74911 (unmerged at the time of writing this), there is a branch pull/74911/head in the main repo.

This only happens for the branch that you explicitly open. If you have a PR in your fork based on the original branch, nothing is going to happen (and can’t really be expected to happen automatically).

It does feel like the decision was made w/o a process and it does also seem like this has a negative impact, on at least some folks.

2 Likes

That seems like overkill for a feature most people don’t actually use.

When I tested something similar for new contributor greetings, the comment ended up in the final commit (even if I used HTML comments to do it). There may be manual ways to remove it or run a specific command to land it, but not worth imposing that on everyone.

I noticed that a lot of people started using spr to create literally all their pull requests, rather than using it only for stacked pull requests.

Please don’t do that.

3 Likes

@nikic Can you make an updated overview of existing user branches? Like I did in the opening post, but more informed (I don’t know much about spr and stacked workflow).

I noticed that a lot of people started using spr to create literally all their pull requests, rather than using it only for stacked pull requests.

Please don’t do that.

Why not? In my non-LLVM work, I find it convenient to use the same tools for all my PRs, whether they end up being stacked with other PRs or not.

To begin with: because it’s policy :slight_smile:

The thinking has been that user provided branches aren’t very scalable, so it was a concession to accept these for stacked PRs: kind of we didn’t have a choice.
(maybe there something else also that I don’t know). But it had to be limited to this use case because there are many people who think it is annoying to have too many user branches here.

Do we have scalability concerns that aren’t addressed by fetch filtering (especially with negative refspecs)? I feel like this discussion would be a lot more concrete if we had specific concerns around user branches. Graph visualizers were brought up earlier; I don’t use those so I can’t comment on them, but I imagine they have options to only visualize the branches you’re after?

How should I put it… I don’t remember the need for everyone (both community and our users) to learn fetch filtering being on the table when possible solutions for stacked PR where discussed. Also what doesn’t scale is community time. We are more than 2 months into this, but promised tooling to do “garbage collection” on user branches still hasn’t arrived.

2 Likes

Fetch filtering is an okay solution for my local llvm-project checkout that I set up once and then use for years.

It’s not a great solution when you regularly create new checkouts. For example, nowadays I get to pull hundreds of user branches every time I check out LLVM on a new server for debugging or bisection purposes.

It’s also not a great solution for new contributors. I don’t generally consult project-specific documentation to learn how to clone a repo, because it’s always the same. Most people will not be aware that LLVM is a special snowflake project and needs some arcane additional git configuration.

8 Likes

Resuming the stacked PRs discussion:

  • graphite.dev was enabled for the repo and it seems it does support stacked PRs (How do stacked diffs work). Does anyone want to try this and share the feedback?
  • There is ReviewStack (https://reviewstack.dev/) that is part of Stapling. Does anyone have experience with it?
1 Like

I just had to delete these branches because I could not checkout llvm on Windows

  • users/minglotus-6/spr/main.nfcprecommit-test-case-to-show-function-summary-and-global-values-when-a-function-has-instructions-annotated-with-vtable-profiles-and-indirect-call-profiles
  • users/minglotus-6/spr/main.thinltotypeprofilingadd-annotated-vtable-guid-as-referenced-variables-in-per-function-summary-and-update-bitcode-writer-to-create-value-ids-for-these-referenced-vtables

They were not deleted when their associated pull requests were closed, only the non main. versions were.

Whatever the policy, we really need to disallow unreasonably long branch names either via a push hook or permission to delete them even if they are used in an active pr. If spr requires using unreasonably long names, then we shouldn’t use spr. It’s not ok to have git clone https://github.com/llvm/llvm-project.git not work on Windows.

6 Likes

What’s the maximum allowed ref name on Windows?

I wouldn’t be surprised if this is the max path windows limit kicking here (which makes it dependent on the prefix length of the path where you’re cloning).
It’d be interesting to see if the solution provided by the link above would work?

git clone -c core.longpaths=true

Unfortunately, this solution doesn’t make naive git clone to work, so it’s only a workaround.

The MAX_PATH on Windows is 260 including <drive-letter>:\ and a null byte at the end, so 256 path bytes. There are various ways to extend it, but there’s still lots of software that doesn’t work with longer paths, including parts of LLVM itself.

core.longpaths is not enabled by default because:

Enable long path (> 260) support for builtin commands in Git for
Windows. This is disabled by default, as long paths are not supported
by Windows Explorer, cmd.exe and the Git for Windows tool chain
(msys, bash, tcl, perl…). Only enable this if you know what you’re
doing and are prepared to live with a few quirks.

Microsoft is very slowly pushing to fix this, most recently in 2016 by adding a disabled by default (still today, 8 years later) registry key that only does anything if you also have a special application manifest to enable it for that process.

Again, git clone https://github.com/llvm/llvm-project.git needs to work on a default install of modern Windows.

3 Likes

100% agreed on this point.

I’ve recently tried to pull this (actually the release/16.x branch) into a git backed by Azure DevOps. It is simply impossible to do since there are many violations in the history that are not allowed on Windows.

For sure, there is a commit that adds a duplicate file name (with a different casing) which I fixed locally with an interactive rebase. However, there are other violations that I couldn’t figure out. In the end, I downloaded the sources of that branch and put that in with a single commit as preserving the history was just too much work.

Supporting Windows file paths without using such a machine is difficult. Maybe the file system rules can be checked with a git action?

I just successfully cloned release/16.x branch on Windows using git clone --branch=release/16.x --single-branch git@github.com:llvm/llvm-project.git, so I’m not sure why you faced such problems. We also have Windows CI and binaries from release testers, which are cloned and built on Windows, so such severe issues would’ve surfaced back then.

CC @hansw2000