FYI: llvm-project repo has exceeded GitHub upload size limit

(This issue probably doesn’t affect most members, but I think it’s still worth giving a heads-up to people who have a similar workflow.)

If you recently tried to push the entire llvm-project to a new (personal) GitHub repo, you might have hit the following error message before the whole process bailed out:

...
remote: fatal: pack exceeds maximum allowed size (2.00 GiB)
error: remote unpack failed: index-pack abnormal exit
To github.com:<your GitHub repo>
 ! [remote rejected]           main -> main (failed)
error: failed to push some refs to 'git@github.com:<your GitHub repo>'

The crux here is that we tried to push the entire repo at once, and the resulting pack exceeds GitHub’s upload limit (2GB). This is not a problem for the majority of developers, who either already have a copy of llvm-project in their own GitHub repos or don’t keep such a copy at all, since incremental pushes are unlikely to exceed the upload limit.

If you don’t really care about the full Git history and/or the other branches, the easiest fix is to make a shallow clone of the upstream llvm-project tree and push that to the other GitHub repo.
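As a rough illustration of the shallow-clone route, here is a minimal sketch. The helper name `shallow_clone` and the destination directory are made up for this example; `--depth` is the standard `git clone` option and implies `--single-branch`. One caveat to check before relying on this: a Git server rejects pushes from a shallow repository unless `receive.shallowUpdate` is enabled (it is off by default), and this sketch does not verify how GitHub handles such a push.

```shell
#!/bin/sh
# shallow_clone URL DEST [DEPTH]
# Clone only the newest DEPTH commits (default 1) of the remote's default
# branch. The function name is hypothetical; the git options are standard.
shallow_clone() {
    url=$1
    dest=$2
    depth=${3:-1}
    git clone --quiet --depth "$depth" "$url" "$dest"
}

# Example (needs network access; the directory name is arbitrary):
#   shallow_clone https://github.com/llvm/llvm-project.git llvm-project-shallow
#   git -C llvm-project-shallow rev-list --count HEAD
```

Note that `--depth` is ignored for plain local paths, so use a `file://` URL when trying this against a repository on disk.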

Otherwise, you might want to push the repo in parts:

git checkout <a previous commit when the project size was still under 2GB>
# Push the first part of the history
git push <remote name> HEAD:refs/heads/<remote branch name>
# Push the rest of the commits
git checkout <local branch name>
git push <remote name> HEAD:refs/heads/<remote branch name>
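If two pushes are still too large, the same idea generalizes to pushing in several chunks. The following is only a sketch under assumptions: the function name `batch_push` and the default batch size of 50000 commits are arbitrary choices for illustration, not values tested against llvm-project, and it pushes a single branch only.

```shell
#!/bin/sh
# batch_push REMOTE BRANCH [BATCH]
# Push BRANCH to REMOTE in chunks of at most BATCH commits so that each
# individual pack stays small. Run it from inside the local repository.
batch_push() {
    remote=$1
    branch=$2
    batch=${3:-50000}   # arbitrary default; lower it if a push is rejected

    # Walk the first-parent history oldest-first and keep every BATCH-th
    # commit. Each checkpoint descends from the previous one, so every
    # push is a fast-forward of the remote branch.
    git rev-list --reverse --first-parent "$branch" |
        awk -v n="$batch" 'NR % n == 0' |
        while read -r commit; do
            git push --quiet "$remote" "$commit:refs/heads/$branch" || exit 1
        done || return 1

    # Push whatever remains after the last checkpoint.
    git push --quiet "$remote" "$branch:refs/heads/$branch"
}

# Example (remote and branch names are placeholders):
#   batch_push myfork main 20000
```

Commit count is only a proxy for pack size, so the batch size may need tuning by trial and error.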

For the ‘commit when the project size was still under 2GB’, I randomly picked one from 2021, since as far as I remember the issue only started happening recently.

The repo needs repacking; I emailed GitHub support about this. They did it a couple of years ago and the repo size was cut in half.
I just ran git repack -a -d -f --depth=250 --window=250 locally, and the size of the .git folder went from 2417MB to 943MB…
It’d be nice if they would do this periodically for large repos.
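For anyone who wants to repeat the local repack experiment, here is a small sketch. The two helper names are invented for this example; the repack flags are the ones quoted above, and the size is read from the `size-pack` field of `git count-objects -v`, which is reported in KiB. On a repository the size of llvm-project, a repack with these settings will likely take a long time.

```shell
#!/bin/sh
# pack_size_kib [REPO]
# Print the total size of REPO's packfiles in KiB (defaults to the
# current directory).
pack_size_kib() {
    git -C "${1:-.}" count-objects -v | awk '/^size-pack:/ { print $2 }'
}

# aggressive_repack [REPO]
# Rewrite all objects into a single pack (-a -d), recomputing deltas from
# scratch (-f) with the depth and window values quoted in this thread.
aggressive_repack() {
    git -C "${1:-.}" repack -a -d -f -q --depth=250 --window=250
}

# Example:
#   before=$(pack_size_kib llvm-project)
#   aggressive_repack llvm-project
#   echo "packs: $before KiB -> $(pack_size_kib llvm-project) KiB"
```

This only shrinks the local copy; it does not change what GitHub stores on its side.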

For reference: https://support.github.com/ticket/personal/0/1673689 (I’m not sure if it is accessible to everyone in the LLVM org)


They only got it down to 2GB (20% reduction, better than nothing) because:

As it turns out, we only run repacks on a repository network level, which means that repacks need to consider objects from all forks of a given repository.

Repacking entire repository networks will always lead to less optimal pack sizes compared to repacking just objects from a single fork. For GitHub, disk space is not the only thing we optimize for, but also performance across forks and client performance.

Thanks for the update!

I work on multiple large monorepos that exceed 10GB in size. I use this script, git-batch-push.sh, to batch-push all commits to a new remote repository. It only needs to be run once, since incremental pushes after the first sync should stay under the allowed pack size.

I hope this helps anyone trying to push llvm-project to a new repo!