[help] How to speed up compilation?

Hi, this is my first time posting a question on the community, so I'm not sure this is the right way to ask.

I’m hoping that someone can help me figure out how to speed up compilation.
I'm working on LLVM 4.0 and modifying clang for my own purposes.

It seems like after LLVM switched its build system to cmake, the compile time increased a lot. It took over an hour even though I compiled only the files that I modified.
(I compile by running 'make' without doing 'make clean' on the previous build.)
Sometimes my machine freezes because of the compilation and is dead after a couple of hours.

Any suggestions for speeding up my compilation process?
Appreciate your valuable time.

Hi Sung,

That's, unfortunately, expected. If you change a top-level header (one
that's included by many other files), you can easily trigger a huge
chain of dependencies and recompile the whole thing.

The freezing is due to extreme memory consumption, most likely from the linker.

Any suggestions for speeding up my compilation process?

A few hints (an example configure line combining them is sketched after the list):

1. Use "ninja" instead of "make", as it's *a lot* smarter in getting
rid of non-existing dependencies.
2. Use ccache. Even when make or ninja can't tell that a rebuild is
unnecessary, ccache can detect that the file and the command line are
identical, so it just reuses the cached object. Very fast rebuilds!
3. Compile using shared libraries, not static linking (CMake's
-DBUILD_SHARED_LIBS=True), as this will reduce re-linking all the
objects into all the executables. This will reduce the number of steps
and freezing.
4. Use a faster linker like "gold" or LLD (if it works for your
target), as they're not just faster but use a lot less memory. This
will help with freezing and speed.
5. If you have low memory, consider using CMake's
-DLLVM_PARALLEL_LINK_JOBS=N, with N = min(amount of RAM in GB, CPU
cores). Some link jobs can take up to 1GB. This will get rid of
freezing, but will make it a bit slower. Adjust the value until it's
fast enough and doesn't freeze.
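
To make that concrete, a configure line pulling most of these together might look roughly like the sketch below. Treat it as a starting point: the paths are placeholders, and options like LLVM_USE_LINKER and LLVM_CCACHE_BUILD may not exist on older trees (in which case -fuse-ld=gold in the compiler flags and a ccache compiler wrapper achieve the same thing).

    # Sketch only: adjust the paths, build type and job counts to your machine.
    cd /path/to/llvm-build
    cmake -G Ninja \
          -DCMAKE_BUILD_TYPE=Release \
          -DBUILD_SHARED_LIBS=ON \
          -DLLVM_USE_LINKER=gold \
          -DLLVM_CCACHE_BUILD=ON \
          -DLLVM_PARALLEL_LINK_JOBS=4 \
          /path/to/llvm-source
    ninja clang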

Hope that helps.

cheers,
--renato

Another one for debug builds on Linux is using split debug info:
-DLLVM_USE_SPLIT_DWARF=ON. This speeds up links dramatically (and
reduces memory consumption) as long as you've got a new enough gdb (I
think lldb is still not quite up to it). It has no effect on macOS
though, because a similar configuration is just how things work there.
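
On a Linux debug build that's just one more cache variable on the configure line; a minimal sketch (placeholder paths again):

    cmake -G Ninja \
          -DCMAKE_BUILD_TYPE=Debug \
          -DLLVM_USE_SPLIT_DWARF=ON \
          /path/to/llvm-source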

Tim.

Here are two comprehensive blog posts about how to compile llvm/clang faster:

https://blogs.s-osg.org/an-introduction-to-accelerating-your-build-with-clang/
https://blogs.s-osg.org/a-conclusion-to-accelerating-your-build-with-clang/

  • Matthias

I'll try those! Thanks for the advice, all!

Btw, I have a question.

Personally, it feels like compilation has become much slower than in previous versions since adopting 'cmake'.

Is this expected when adopting cmake, or were there other big changes to the build structure?

I'm not that familiar with a huge system like llvm yet, so I want to learn how professionals design their systems.

So, I decided not to respond to that specific part of your original
post because I don't have enough information on what you changed, but
autoconf has been deprecated for a while now, so everyone uses CMake.

If you're comparing LLVM a long time ago with autoconf versus LLVM
today with CMake, then the changes are most likely because LLVM has
grown a lot.

If you're building LLVM trunk today with autoconf, then it's possible
that you're missing a lot of source files from your build (and I'm
surprised it worked).

But overall, CMake should make absolutely no difference to build
speed, since the number of compilation jobs should (hopefully) be the
same and run in the same way. But I may be missing something... :slight_smile:

cheers,
--renato

I'm adding a feature that detects a custom pragma and marks those regions in LLVM IR using metadata. I want to let programmers give additional directives to the compiler.

So I added some functions, variables, and 'cout' calls to clang.

So, based on my understanding of your comment, a long compilation time may be expected.

My machine has an Intel Xeon(R) CPU E31230 @ 3.20GHz * 8 with 8 GB of RAM, which is far behind the recommended build environment (Intel Core i7-4770K CPU @ 3.50GHz, 16 GB RAM, and a 1TB 7200RPM HDD or SSD, ref: https://blogs.s-osg.org/an-introduction-to-accelerating-your-build-with-clang/).

Do you think it would be a great help if I upgraded my machine?

I've wanted to upgrade it at some point, but I haven't found the right excuse to tell my boss. haha

Thank you so much!

If you are doing a debug build and hit swap, the cmake-based Makefiles
are a lot more aggressive about concurrent jobs. In that case, the
swapping can make the system a lot slower overall.
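
If you do stay with the cmake-generated Makefiles, the simplest workaround is to cap the job count by hand; the numbers below are only examples for an 8-core, 8 GB machine:

    # Limit parallel jobs, and optionally stop spawning new ones when the load is high.
    make -j4
    # or
    make -j8 -l6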

Joerg

My machine has an Intel Xeon(R) CPU E31230 @ 3.20GHz * 8 with 8 GB of RAM

I have a Xeon X3450 * 8, which seems less powerful than yours, and it
does a full build + check-all in less than 1h.

which is far behind the recommended build environment (Intel Core i7-4770K CPU @
3.50GHz, 16 GB RAM, and a 1TB 7200RPM HDD or SSD, ref:
https://blogs.s-osg.org/an-introduction-to-accelerating-your-build-with-clang/)

That's pretty much my config, 4-core (8-thread), and I do a cold build
plus check-all in ~30 min.

But I think the trick is to not update the sources every time.

If you're working on your branch, keep it stable and just rebuild for
development.

Then, once in a while (~every week), you rebase with trunk and do a
full build while you have some tea.

Then go back to a stable branch, with only your changes for the rest
of the week.
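
Assuming a git checkout, that routine is roughly the following (the branch and remote names are made up for illustration):

    # Day to day: stay on your own branch and only rebuild what you touched.
    git checkout my-feature
    ninja clang

    # Roughly once a week: update, rebase onto trunk, then do the full rebuild.
    git fetch origin
    git rebase origin/master
    ninja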

cheers,
--renato

Right, I always override the build/link jobs (to N cores / RAM)
because of how ninja trashes the machine. When I run on a build
server, I don't mind much, but on my laptop (or ARM boards)... :slight_smile:
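
Concretely, that override is just two cache variables at configure time; the values below are only illustrative, and as far as I remember they only take effect with the Ninja generator:

    cmake -G Ninja \
          -DLLVM_PARALLEL_COMPILE_JOBS=8 \
          -DLLVM_PARALLEL_LINK_JOBS=2 \
          /path/to/llvm-source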

cheers,
--renato

cmake should be a lot faster, especially using ninja instead of makefiles, because it can run the maximum number of jobs to keep all the cores busy more of the time.

The problem looks like it's that with eight cores (and therefore 8 parallel jobs) and only 8 GB of RAM, you don't have enough RAM for the number of cores, so you'll be using swap a lot. That's mostly true for linking; C++ compile steps don't use all that much RAM.
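
A quick way to confirm that swapping is the bottleneck is to watch memory while the link steps run, for example with standard Linux tools (nothing LLVM-specific):

    # One-off snapshot of RAM and swap usage:
    free -h
    # Or keep sampling every 5 seconds during the build:
    vmstat 5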

Thank you, all!

I switched to ninja with gold and shared libraries, and also applied the memory limitation.

Now the compilation is as nimble as Ninja!!

A full build (a build after a clean) now takes about 40 min, which is MUCH better than freezing!

Awesome!

Again, I appreciate all your help!

IIRC, ccache only learned about split dwarf fairly recently and the version on Debian stable didn't know about it. I think that the result was that ccache always gave up and called the real compiler. I remember split dwarf being the bigger win of the two but I haven't kept any numbers on it.