Mirroring of LLVM repository

Dear All,

Currently the load on llvm.org is too high. This affects all of the
project's services, such as the buildbots, bugzilla, etc. It was found
that this load is possibly caused by massive mirroring of the LLVM SVN
repository into git/hg/whatever. Please don't do that :)

1. Usually one doesn't need the full history; skipping it makes the
mirroring much easier and faster (do you *really* need r5000? I guess not).
2. There are some git mirrors already, which can be used for
'bootstrapping' git-svn. You can save a lot of resources by using
them. They might not be fully up to date (often they are updated daily,
late at night US time), but as I already mentioned,
you will have a bunch of information already imported.

These mirrors are:
- Main LLVM repository (starting from r40000): git://repo.or.cz/llvm.git
  (see Public Git Hosting - llvm.git/summary for more information)
- llvm-gcc repository (starting from r40000):
  git://repo.or.cz/llvm-gcc-4.2.git
  (see Public Git Hosting - llvm-gcc-4.2.git/summary for more information)
- clang repository (starting from r40000): git://repo.or.cz/clang.git
  (see Public Git Hosting - clang.git/summary for more information)

As a bonus, you'll get the 'normal' names and e-mail addresses of
committers from those mirrors.

Please think twice about whether you need to mirror the stuff directly
from the main LLVM SVN repository. Basically, you'll be preventing people
from getting real work done (since people can't update, or updating takes
forever).

Thank you for your patience and understanding!

2. There are some git mirrors already, which can be used for
'bootstrapping' git-svn. You can save a lot of resources by using
them. They might not be fully up to date (often they are updated daily,
late at night US time), but as I already mentioned,
you will have a bunch of information already imported.

How do I do this bootstrap?

I tried a "git clone" + "git svn init", but the next "git svn fetch" tries
to start from revision 1.

Cheers,

Let me add to what Anton said:

I can only speak for git-svn, but the main burden it imposes on LLVM's
SVN repository comes from attempts to mirror the _complete_ history.
Unfortunately, a naive `git svn clone` will do just that -- so _please_
avoid this at all costs.

If you want to use git-svn to work with LLVM's SVN repository, please
only clone the latest SVN revision using git-svn's `--revision <n>`
option (combined with a huge log window: `--log-window-size 999999`).
You may find the following script helpful:

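    #!/bin/bash
    # Clone only the latest SVN revision of <svn_url> into <dir>.
    if [ $# -ne 2 ]; then echo "Usage: $0 <svn_url> <dir>" >&2; exit 64; fi
    SVN_URL=$1
    # The "Last Changed Rev: N" line of `svn info` carries N in field 4.
    SVN_REV=$(svn info "$SVN_URL" | awk '/Rev:/ {print $4}')
    git svn clone --log-window-size 999999 --revision "$SVN_REV" "$SVN_URL" "$2"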

This makes the initial clone roughly as demanding as a `svn checkout`.
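
The revision lookup in the script above depends on the `Last Changed Rev:`
line that `svn info` prints. As a rough sketch of just that step, here it
is run against canned text instead of the live server. The function names
and the sample output (including the revision numbers) are made up for
illustration and are not taken from the real repository:

```shell
#!/bin/bash
# Extract the revision number from `svn info` output read on stdin.
# /Rev:/ matches the "Last Changed Rev: N" line (but not "Revision: N");
# N is the 4th whitespace-separated field of that line.
last_changed_rev() {
    awk '/Rev:/ {print $4}'
}

# Hypothetical sample of `svn info` output; all values are illustrative.
sample_svn_info() {
    cat <<'EOF'
Path: trunk
URL: https://llvm.org/svn/llvm-project/llvm/trunk
Revision: 70000
Last Changed Author: someone
Last Changed Rev: 69990
EOF
}

sample_svn_info | last_changed_rev
```

Trying the parsing offline like this avoids hitting the SVN server just to
check the awk expression.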

[..] There are some git mirrors already [..]

In addition to the mirrors mentioned (and kept up-to-date) by Anton, I've
been maintaining a mirror of LLVM's trunk (only llvm, no llvm-gcc or clang)
on GitHub for the last few months:

- GitHub - earl/llvm-mirror: NOTE: The LLVM project now operates official
  Git mirrors as well: http://llvm.org/docs/GettingStarted.html#git-mirror
  -- An automated mirror of llvm/trunk from LLVM's SVN. Updates hourly.
  Release branches and tags are tracked manually. This mirror is *not*
  commit-ID compatible with the official Git mirrors.

This mirror is automatically updated once an hour and includes the
complete history of LLVM, back to the initial revision [1].

As Anton mentioned, you can use those public Git mirrors to bootstrap
your own git-svn repository. Here's how to do this using my Github
mirror:

    git clone git://github.com/earl/llvm-mirror.git llvm
    cd llvm
    git config --add remote.origin.fetch '+refs/remotes/*:refs/remotes/*'
    git fetch
    git svn init https://llvm.org/svn/llvm-project/llvm/trunk
    git svn rebase --local

Once that's done, you can work with git-svn just as you're used to.

Alternatively, you can keep up-to-date purely via the Git mirror by
using `git fetch; git svn rebase --local`. With that, the only time you
need to hit SVN is for committing. This is most effective with a recent
Git (1.6.1+), where you can freely intermix fetching via `git fetch` from
a Git mirror with `git svn fetch` directly from SVN (git-svn will
incrementally update its revision map).
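
The mirror-only update described above could be wrapped roughly as follows.
This is only a sketch: it assumes the repository was bootstrapped with the
steps shown earlier, and both the `run` helper and the `DRY_RUN` switch are
hypothetical conveniences, not part of git. With `DRY_RUN=1` the commands
are printed instead of executed, so nothing is fetched:

```shell
#!/bin/bash
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Update from the Git mirror only; the SVN server is not contacted.
update_from_mirror() {
    run git fetch &&
    run git svn rebase --local
}

# Show what would be executed without touching any repository.
DRY_RUN=1 update_from_mirror
```

Without `DRY_RUN=1`, the function has to be called from inside the
bootstrapped working copy.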

[1]

    git clone git://github.com/earl/llvm-mirror.git llvm
    cd llvm
    git config --add remote.origin.fetch '+refs/remotes/*:refs/remotes/*'
    git fetch
    git svn init https://llvm.org/svn/llvm-project/llvm/trunk
    git svn rebase --local

This one worked perfectly. Thanks!

I tried the same with the llvm-gcc-4.2 mirror, but "git svn rebase
--local" has been running for more than 1h with no output. Do I have to
do something special for git://repo.or.cz/llvm-gcc-4.2.git?

Thanks,

Hello, Rafael

That's pretty strange. However, this repository is large and contains
many files, so maybe the first run takes a long time because the
metadata has to be regenerated, etc.

Yes, my bad, the instructions were a bit too specific to my Github
mirror. The following scheme bootstraps git-svn and works with all
mirrors, including those on repo.or.cz:

    mkdir llvm-gcc-4.2
    cd llvm-gcc-4.2
    git init
    git pull git://repo.or.cz/llvm-gcc-4.2.git master:remotes/git-svn
    git svn init https://llvm.org/svn/llvm-project/llvm-gcc-4.2/trunk
    git svn rebase --local
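
For other mirrors, the same steps can be parametrized. The following is a
sketch only: the function name, the `run` helper and the `DRY_RUN` switch
are hypothetical, and it assumes the mirror publishes its trunk history on
a branch called `master`, as in the commands above. The example call uses
the llvm mirror and SVN URL mentioned earlier in this thread and only
prints the commands:

```shell
#!/bin/bash
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# bootstrap_git_svn <mirror_url> <svn_url> <dir>
# The body runs in a subshell so the caller's working directory and
# variables are left untouched.
bootstrap_git_svn() (
    mirror_url=$1
    svn_url=$2
    dir=$3
    run mkdir "$dir" &&
    run cd "$dir" &&
    run git init &&
    # Assumption: the mirror's trunk history lives on its `master` branch.
    run git pull "$mirror_url" master:remotes/git-svn &&
    run git svn init "$svn_url" &&
    run git svn rebase --local
)

# Print the commands for the llvm mirror without executing them.
DRY_RUN=1 bootstrap_git_svn \
    git://repo.or.cz/llvm.git \
    https://llvm.org/svn/llvm-project/llvm/trunk \
    llvm
```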

One difference I noticed is that in the llvm repository (the one on
GitHub) "git branch -r" lists a git-svn branch that is missing in
llvm-gcc-4.2.

I am not sure how git-svn works. Is that branch required?

Cheers,

Yes, my bad, the instructions were a bit too specific to my Github
mirror. The following scheme bootstraps git-svn and works with all
mirrors, including those on repo.or.cz:

    mkdir llvm-gcc-4.2
    cd llvm-gcc-4.2
    git init
    git pull git://repo.or.cz/llvm-gcc-4.2.git master:remotes/git-svn
    git svn init https://llvm.org/svn/llvm-project/llvm-gcc-4.2/trunk
    git svn rebase --local

This works nicely, but I can only use "git svn rebase" for updating.
Is there a way I could use both "git pull" and "git svn rebase"? I
tried to adapt the GitMirror page on the GCC Wiki, but I am lost. What
I tried was adding

[remote "origin"]
        url = git://repo.or.cz/clang.git
        fetch = master:remotes/git-svn

to the config file. It works until I do the first "git svn rebase".
After that I get

From git://repo.or.cz/clang

! [rejected] master -> git-svn (non fast forward)

--
Regards,
Andreas

Cheers,