A Fresh Start with LLVM

Hi LLVM Devs,

I recently finished working for Intel/Movidius, and thought that before I start on a new LLVM project, this would be a good time to discard all of my old practices (which began with v2.7 and have gathered crud over the years) and restart with a fresh LLVM approach directly from head.

In preparation, I would like to know the current status of GIT vs SVN: should I start afresh with the GIT repositories? There is also the issue of Mono vs Multiple repositories. Which I select will be dictated partly by the recommended best approach and partly by how big the Mono repository is to clone for the first time, as I have ISP download caps to contend with. I would prefer a Mono installation, so that I can track future development of all LLVM projects; but I also need to be able to enable and disable subprojects cleanly as I need them. For instance, at this time I am not yet ready for LLD and I don’t need DragonEgg, so although they are in the Mono repository, I need to be able to configure my build to exclude them.

Mostly I expect that I will be working on cross-compilers for embedded systems, so cross-compilation of the libraries is important. Historically I have done this with my own hand-crafted build systems (for LibC++ and Compiler-RT), but would like to do this with the integrated LLVM prescribed approach when possible.

For testing I have never used the LLVM test-suite, nor the LIT and LNT frameworks, but in a fresh context I would like to get these up and running as soon as possible. My primary development platform is Windows, with various Linux distros for verifying my development. And if possible, I would like to construct a private BuildBot for each target I am working on that mirrors the LLVM community BuildBots - but this is also something I have never done. To date, all of my testing for cross-development systems has used bespoke test harnesses and I would like to learn how to run the standard testing too; especially on Windows (8.1 and 10).

Advice on getting set up with a fresh start would be greatly appreciated, as well as Newbie advice for how to test LLVM since in this regard I am a newbie.

Thanks,

MartinO

I recommend using https://github.com/llvm-project/llvm-project-20170507 if you can spare 1.1 GB of disk and bandwidth for the initial checkout and git repo itself.

It’s just a few minutes behind the svn master copies. I don’t know of a better monorepo at present.

Although everything is there, things such as clang and compiler-rt aren’t actually built unless you symlink them into the appropriate place in the llvm directory.
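The symlink arrangement described here might look roughly like the sketch below. It builds a throwaway directory with empty stand-in folders so that it can run anywhere; in a real checkout only the two `ln -s` lines would be run from the monorepo root. The destinations `llvm/tools/clang` and `llvm/projects/compiler-rt` are the conventional in-tree locations and are an assumption about what "the appropriate place" means in this post.

```shell
#!/bin/sh
set -e

# Stand-in for a monorepo checkout (hypothetical, empty directories only).
root=$(mktemp -d)
mkdir -p "$root/llvm/tools" "$root/llvm/projects" "$root/clang" "$root/compiler-rt"
cd "$root"

# Relative links, so the whole tree can be moved without breaking them.
ln -s ../../clang llvm/tools/clang
ln -s ../../compiler-rt llvm/projects/compiler-rt

# Show where the links point.
ls -l llvm/tools llvm/projects
```

Removing a link again is enough to drop that subproject from the next configure run.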

If you want to actually submit patches then you’ll need to make patch files and send them to the svn master.

> I recommend using https://github.com/llvm-project/llvm-project-20170507
> if you can spare 1.1 GB of disk and bandwidth for the initial checkout and
> git repo itself.
>
> It's just a few minutes behind the svn master copies. I don't know of a
> better monorepo at present.
>
> Although everything is there, things such as clang and compiler-rt aren't
> actually built unless you symlink them into the appropriate place in the
> llvm directory.

There's an updated process for getting this done, supported by the CMake
configurations.

See
https://llvm.org/docs/GettingStarted.html#for-developers-to-work-with-a-git-monorepo
for details.
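As a rough illustration of the CMake-supported route, the sketch below prints a configure command that selects subprojects through `LLVM_ENABLE_PROJECTS`. The project list (clang and compiler-rt, with lld left out), the Ninja generator, the Release build type and the `../llvm` path are all assumptions for illustration; the linked documentation is the authority on what is supported. The command is only printed, not executed, so the sketch needs no LLVM checkout.

```shell
#!/bin/sh
# Hypothetical selection: build clang and compiler-rt, leave lld out.
projects="clang;compiler-rt"

configure_cmd() {
  # $1: path from the build directory to the llvm/ subdirectory of the monorepo.
  printf 'cmake -G Ninja %s -DLLVM_ENABLE_PROJECTS="%s" -DCMAKE_BUILD_TYPE=Release\n' \
    "$1" "$projects"
}

# Dry run: print the command that would be run from an empty build directory.
configure_cmd ../llvm
```

Subprojects that are not named in the list stay in the repository but are not configured or built.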

> If you want to actually submit patches then you'll need to make patch
> files and send them to the svn master.

There's a way of doing this through the monorepo with the scripts that are
already in the llvm project. See the link above too for details.

In particular, I encourage everyone to use the Phabricator installation and
the pre-commit review process as well.
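For pre-commit review, a diff with full context is easier to read once uploaded. The sketch below produces one in a throwaway repository; the file name, commit contents and identity settings are made up for illustration, and this is only one way of preparing a patch, not necessarily the workflow the scripts mentioned above implement.

```shell
#!/bin/sh
set -e

# Throwaway repository with two commits (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"

printf 'int f() { return 1; }\n' > demo.c
git add demo.c
git commit -q -m "Add demo.c"

printf 'int f() { return 2; }\n' > demo.c
git commit -q -a -m "Change return value"

# -U999999 requests effectively unlimited context around each hunk.
git diff -U999999 HEAD~1 HEAD > change.patch
cat change.patch
```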

I also encourage everyone to give the monorepo process a whirl, as it's
been getting much better and easier for projects that need to make changes
across the various repositories at once.

Cheers

Thanks Dean and Bruce.

1.1GB is a lot smaller than I expected; my worry was that it might be >60GB with the entire change histories back to v1.0. Disk space is not a problem (at ~€80 per TB), just ISP download caps, and 1.1GB is well under the radar :)

I will get Phabricator set up for collaboration.

Thanks again for your help,

  MartinO

Quick additional question. From Windows do you know if TortoiseGIT works well with this configuration, or would I be better using Linux for interaction with 'git'? I have found executable permissions can be a painful issue with TortoiseSVN, and command-line Cygwin is often better in this regard. I suspect that TortoiseGIT has similar issues.

Thanks,

  MartinO

Yes, it’s not bad. You can actually reduce the size of the .git directory to 597 MB by running "git repack -a -d -f --depth=250 --window=250". This takes less than 5 minutes on a 16 core Xeon. Unfortunately I’ve never found a way to get such a nicely packed repo into github such that it checks out for others as nicely as it was when I uploaded it :(
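To see what the repack does without touching a real checkout, here is a small sketch that runs the same command on a throwaway repository and reports the size of `.git` before and after. The repository contents are made up, and on a repository this tiny the numbers say nothing about the savings on the LLVM monorepo.

```shell
#!/bin/sh
set -e

# Throwaway repository with a handful of commits (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"

for i in 1 2 3 4 5; do
  echo "revision $i" >> notes.txt
  git add notes.txt
  git commit -q -m "Revision $i"
done

# Size of .git in KiB before repacking.
du -sk .git | cut -f1

# Same options as quoted above: recompute all deltas into a single pack
# and delete the now-redundant loose objects and old packs.
git repack -q -a -d -f --depth=250 --window=250

# Verify the object database, then report the size again.
git fsck
du -sk .git | cut -f1
```

Larger `--depth` and `--window` values let git search harder for good deltas, which costs CPU time during the repack.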

> Yes, it's not bad. You can actually reduce the size of the .git
> directory to 597 MB by running "git repack -a -d -f --depth=250
> --window=250". This takes less than 5 minutes on a 16 core Xeon.

You can also svn checkout any GitHub branch if that's something that
you might need.

https://help.github.com/articles/support-for-subversion-clients/

Disk space won't be saved this way because svn doesn't have
compressed pack files. Interestingly enough, a checkout of trunk is
1.6GB but doesn't need to transfer anywhere near 1.6GB. Subsequent
updates will be fast, but I'm sure you can use git history so this
isn't practical anyway, I guess.

> Unfortunately I've never found a way to get such a nicely packed
> repo into github such that it checks out for others as nicely
> as it was when I uploaded it :(

Maybe the reason is that GitHub shares the underlying bare repo between
all forks. Just speculating. You can confirm this by loading a ref
that you would think only exists in your fork's branch but actually
can be accessed via any of the forks' URLs.

svn keeps a complete uncompressed copy of the checkout in the .svn
directory, so it can figure out what you changed. That's, yes, 795 MB at
the moment (github monorepo is 1.1 GB plus 795 MB of checked out src, for
1.9 GB of total size).

If you use svn+ssh then hopefully ssh is at least gzipping that for the
transfer. A .tgz of the src (after moving the .git repo out of the
directory) is 111 MB.

It's quite remarkable really that svn uses as much space for a single
uncompressed copy of the source code as git uses for the entire project
history.

> svn keeps a complete uncompressed copy of the checkout in the .svn
> directory, so it can figure out what you changed. That's, yes, 795 MB at
> the moment (github monorepo is 1.1 GB plus 795 MB of checked out src, for
> 1.9 GB of total size).

Wow, thanks, something learned :).

This shows how long it's been since I used svn as a developer. I recalled
that the copy was a plain file and one could look at the files. But
that was a long time ago on Windows; the setup may have behaved
differently there, or my memory is simply blurry. Good to know and,
yes, sad it's inefficient.

For a tree to build from, I still prefer an svn checkout because the
size on disk will be constant and initial checkout doesn't require
using git's poorly (incompletely) implemented depth parameter.
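For reference, this is what the depth parameter does in the simplest case. The sketch clones a local stand-in repository with `--depth 1` and compares commit counts; it does not reproduce the shortcomings alluded to above, and the repository contents are made up. Against the real monorepo the equivalent would be `git clone --depth 1` with the GitHub URL.

```shell
#!/bin/sh
set -e

# Local stand-in for an upstream repository (hypothetical content).
work=$(mktemp -d)
cd "$work"
git init -q upstream
cd upstream
git config user.email "dev@example.com"
git config user.name "Example Dev"
for i in 1 2 3; do
  echo "revision $i" > file.txt
  git add file.txt
  git commit -q -m "Revision $i"
done
cd "$work"

# A file:// URL is used because --depth is ignored for plain local paths.
git clone -q --depth 1 "file://$work/upstream" shallow

git -C upstream rev-list --count HEAD   # full history: 3 commits
git -C shallow rev-list --count HEAD    # shallow clone: 1 commit
```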

For a work tree, I'd prefer git for access to a local history though
I do wish for git remote to signal a set of main branches and stop
pulling in zillions of temp and obsolete branches as found in many
projects.

> If you use svn+ssh then hopefully ssh is at least gzipping that for the
> transfer. A .tgz of the src (after moving the .git repo out of the
> directory) is 111 MB.

I used the https url you wrote and doubt it got rewritten to ssh.

> It's quite remarkable really that svn uses as much space for a single
> uncompressed copy of the source code as git uses for the entire project
> history.

And a copy of all sources, too, not just in a bare repository. Necessity
was the force behind the implementation.

The URL I wrote was for git. That is of course all compressed and packed
into diffs.

I was talking about when you get 800 MB of source code from an svn repo.

[Remembering to re-add llvm-dev this time...]

> My primary development platform is Windows,

Make sure you do
    git config core.autocrlf input
to (help) avoid line-ending issues. This isn't in the LLVM
git-related documentation anywhere, that I know of, but it
has been helpful for me (using git mono-repo for a while and
recently moving my upstream development to Windows).

(In particular, finding this setting and forcing fresh checkouts
addressed ~50 lit test failures I had in the Clang tests.)
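A sketch of applying this setting, run on a throwaway repository so that it is safe to try. The `git rm --cached` plus `git reset --hard` pair is one way to re-create the tracked files under the new setting and is an assumption about what "forcing fresh checkouts" involved; deleting the tree and cloning again would also work. Note that `git reset --hard` discards uncommitted changes.

```shell
#!/bin/sh
set -e

# Throwaway repository with one tracked file (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"
git config user.name "Example Dev"
printf 'RUN: echo hello\n' > sample.test
git add sample.test
git commit -q -m "Add sample"

# Per-repository setting: convert CRLF to LF when committing,
# and do no conversion when checking files out.
git config core.autocrlf input

# Empty the index, then restore index and working tree from HEAD
# so every tracked file is written out again under the new setting.
git rm -q -r --cached .
git reset -q --hard

# Confirm the setting.
git config --get core.autocrlf
```

Adding `--global` to the `git config` line would apply the setting to every repository for the current user instead of just this one.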

Have fun!
--paulr

It's possible that GitHub's SVN adapter behaves differently from regular
svnserve, but I didn't think it would, so I used that git URL for the
svn checkout.