Minimal build for just clang-format

Folks,

I recently cloned the latest LLVM from GitHub and have managed to build it locally. However, I’m only interested in clang-format and the libraries needed to build it. When I run cmake configured to just build clang (using -DLLVM_ENABLE_PROJECTS="clang"), the build directory contains ~86GB of built collateral.

Is there a way to further pare down what gets built to be only that which is needed by clang-format?

I probably should limit the targets I’m building with LLVM_TARGETS_TO_BUILD, but I haven’t tried that yet.

Are there any other ways to reduce the amount built given I’m only interested in clang-format? I did make a few small changes to a few files in clang/lib/Format, so I can’t just link to an existing LLVM installation.

Are there any ways to pare down the source tree as well from its ~4GB size (again, given I’m just interested in building clang-format)?

Any help would be greatly appreciated!

Thanks,
Bill

ninja clang-format will build clang-format only, but the build will still take up a lot of space since you will be building all of the llvm and clang libraries. Try configuring cmake to do a Release build: -DCMAKE_BUILD_TYPE=Release should cut the build size down significantly.
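
A minimal sketch of that configuration, assuming a POSIX shell with cmake and ninja on the PATH. The function name build_clang_format and its two arguments are made up for illustration, and -DLLVM_ENABLE_PROJECTS=clang is carried over from the original question:

```shell
# Hypothetical helper: configure a Release build and build only clang-format.
# $1 = path to the llvm-project checkout, $2 = build directory (both assumptions).
build_clang_format() {
  src_root="$1"
  build_dir="$2"
  # The llvm/ subdirectory is the CMake source directory; clang is enabled as a project.
  cmake -G Ninja -S "$src_root/llvm" -B "$build_dir" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_PROJECTS=clang || return 1
  # Build just the clang-format target and the libraries it depends on.
  ninja -C "$build_dir" clang-format
}
```

It could be called as, for example, build_clang_format ~/llvm-project ~/llvm-project/build.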

Thanks @tstellar! I’ll give those and limiting the target machine a try tonight.

OK. Wow, ninja is fast!

It does seem that the space taken up by the build using your approach and limiting the release to X86 is drastically reduced (420MB now). Thanks so much for that guidance!

Would you happen to know if there’s a way to reduce the size of the source tree if, again, I am only interested in clang-format? I know I’m asking to do some odd things with the source tree and build here, but I need to commit the source to a version control system, so eliminating unneeded files would be beneficial.

Thanks again!
Bill

I wouldn’t try to pare down the monorepo that much, especially if you want to be able to easily update from upstream later.

But if you don’t care about that, you could delete various top-level directories. I don’t know which are interconnected, so I think your best bet is to delete or rename one at a time and see which are referenced from llvm/clang.

But again - I don’t recommend this since it will be very hard to update with upstream later.

I believe all the subdirectories other than llvm/ and clang/ should be ok to remove, though this can certainly be a maintenance burden. (And clang+llvm is a lot of mostly-dead code already).

This is a real trade-off of the monorepo and lack of stable APIs in llvm-project.

Thanks for the warnings. Maybe a passable solution is to maintain a full git repo locally and a script to trim that down into another directory that gets put into the version control system.

I’m sure, as you point out, it’s a lot of work for a fragile system.

I’ll see what the ramifications are here of committing that much code to our version control system or if there are other ways the company has of handling such situations.

Thanks to both you and @tstellar for the help!

Thanks for those added details. I’ll see if it’s acceptable to submit the entire repo to our version control system. The alternative of playing “whack-a-directory” is pretty distasteful :-).

Thanks!

You can use git’s sparse-checkout feature to limit the number of files in the source tree. Instructions can be found on the “Moving LLVM Projects to GitHub” page of the LLVM 16.0.0git documentation.

@tstellar when using the -DCMAKE_BUILD_TYPE=Release option, is it passed as an argument when configuring the build, e.g.:

cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ../llvm

Also, when I tried ninja clang-format, ninja told me that it is an unknown target:

chris@goldfish build % ninja clang-format
ninja: error: unknown target 'clang-format'

What do you suppose I’m doing wrong here?

Yes, that’s correct.

You need to also add -DLLVM_ENABLE_PROJECTS="clang" to enable building clang.

Ah. That was a useful link for a lot of context.

I tried to do a sparse-checkout with ‘set clang’ (as a start). But running cmake to prep the ninja build died on missing files. So, I guess the whack-a-directory process begins. I went down this rabbit hole a bit and kept having to add directories on the llvm side. I’m not sure if I’m going to try much more to disentangle these threads.

It seems like there’s enough interconnectedness that I won’t end up with a much smaller source tree even if I can get it to work. Unless I’m missing the right way to invoke cmake for what I’m trying to do (very possible).

Again, thanks to all for the help.

You need the llvm, cmake, clang (and possibly the third-party) directories to build clang, and you need to pass llvm/ as the source directory to cmake.
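
Combined with sparse-checkout, that list could look something like the sketch below. It assumes git 2.25 or newer and an existing clone; the function name sparse_clang_format_tree is hypothetical, and the directory is spelled third-party as it appears in the llvm-project tree:

```shell
# Hypothetical helper: restrict an existing llvm-project clone to the
# directories named above. $1 = path to the clone (an assumption).
sparse_clang_format_tree() {
  repo="$1"
  # Cone mode selects whole directories rather than individual path patterns.
  git -C "$repo" sparse-checkout init --cone || return 1
  # Keep only the directories needed to configure and build clang.
  git -C "$repo" sparse-checkout set llvm clang cmake third-party
}
```

Note that this only shrinks the working tree; the history stored in .git is not affected.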

11	openmp
16	flang
19	lld
32	polly
36	mlir
45	clang-tools-extra
49	compiler-rt
67	libcxx
75	lldb
320	clang
715	llvm

These are the sizes in MB of the source for each directory. LLVM+clang is going to be the huge part (most of it is the test directory, by the way).
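
A listing like the one above can be produced with du. This is a small sketch, assuming a POSIX shell and a du that accepts -m (both GNU and BSD du do); the function name dir_sizes_mb is made up for illustration:

```shell
# Hypothetical helper: print the size in MB of each top-level directory
# under $1, smallest first. Hidden directories such as .git are not
# matched by the */ glob, so they are not listed.
dir_sizes_mb() {
  # Run in a subshell so the caller's working directory is unchanged.
  ( cd "$1" && du -sm -- */ | sort -n )
}
```

Running dir_sizes_mb on the root of the checkout prints one line per directory, with the size in the first column.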

That said, I believe the larger share of the data on disk isn’t in the checkout but in the .git folder:

$ du -sm .git
2359	.git

(edit: it seems that you’re interested in getting this into another version control system, so you should look into removing the test directory in LLVM as well).

> limiting the release to X86

I don’t know if clang-format will break down, but LLVM can be built with no target at all!

Yes. It does build with just the directories you listed (and does need third-party). And thanks for clarifying that I should be building using llvm as the source directory to cmake. I wasn’t sure if there was one right answer.

That reduced things quite a bit.

Thanks!

Yes. Thanks for pointing out my stupidity :-). I actually thought of it before I read your response and had a “Doh!” moment. It’s been a while since I’ve used git and forgot how big that directory could get.

I did actually manage to remove one of the test directories (so far) when I found how big they were, and commented out the add_subdirectory() for the test directory in llvm/CMakeLists.txt. That all seems to build fine now. Thanks for confirming this as a “reasonable” course of action given my situation.

With both test directories removed (llvm and clang), the source tree comes down to a relatively svelte ~250MB.
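
For anyone wanting to script this, here is a rough sketch of the trimming step, assuming a POSIX shell. The function name trim_llvm_tree and both paths are hypothetical, and the CMakeLists.txt edits for the removed test directories still have to be made separately:

```shell
# Hypothetical helper: copy only the directories needed for clang-format
# from a full checkout ($1) into a trimmed tree ($2), minus the test suites.
trim_llvm_tree() {
  full="$1"
  trimmed="$2"
  mkdir -p "$trimmed" || return 1
  for dir in llvm clang cmake third-party; do
    cp -R "$full/$dir" "$trimmed/" || return 1
  done
  # The build must no longer reference these directories, e.g. via the
  # add_subdirectory() edit mentioned above.
  rm -rf "$trimmed/llvm/test" "$trimmed/clang/test"
}
```

The trimmed directory is then what gets committed, while the full git clone stays local for pulling upstream changes.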

I tried setting -DLLVM_TARGETS_TO_BUILD= (the empty string) and it built fine. I didn’t really notice any difference in what was installed, so maybe it factors out of the equation when just building clang-format.

I think I’ve gotten to a good point in this endeavor. The source area is a reasonable size and the installation area is very small.

Thanks again (and again) for the help from everyone. Is there a way to close out this discussion? I see the “Solution” check box. But I think I’ve gotten multiple “solutions” as I had more questions. Or do I just say “Thanks” and carry on?

Yeah, you can just leave it there if you’re happy with the conclusion :-)

Thanks! ;-)