ld taking too much memory to link clang

Hello,

This is my first time compiling clang. I tried to do it on a computer with 4 GiB of RAM and no swap, and I ran into a problem: I checked out the latest versions of clang and llvm from the SVN repositories, and when I ran ‘make’, the computer froze while linking clang and I had to do a hardware reset.

After that, I ran ‘make’ again, this time with a process monitor, and realized that when ‘make’ called ‘ld’ to generate the ‘clang’ executable, the ‘ld’ process kept allocating memory until the computer froze again.

I searched the Web and found someone saying that linking is a very memory-hungry process (http://forums.mozillazine.org/viewtopic.php?f=42&t=2300015), and that a large amount of swap space is necessary to compile certain projects.

As I have never experienced such behavior when compiling other projects, I decided to ask here: is this really supposed to happen? If so, what is the minimum amount of virtual memory needed to build clang?

The environment was:
Fedora 18 (3.7.2-204.fc18.x86_64)

clang 3.1

GNU ld version 2.23.51.0.1-3.fc18 20120806

Thanks in advance, and sorry if this is a stupid question.

Yes, linking is very memory hungry. The problem will be exacerbated
significantly if you are building with debug info enabled. I regularly
build a Release+Asserts clang with only 4GB of RAM.
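
For reference, a Release+Asserts configuration is roughly the following (a sketch; --enable-optimized is the autoconf spelling, and the CMake variables assume an out-of-tree build directory):

  # autoconf-style build: optimized, assertions on, no debug info
  ../llvm/configure --enable-optimized && make

  # rough CMake equivalent
  cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON ../llvm && make

A Debug build is what produces the huge debug-info-laden object files that make the final link balloon.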

However, I find it strange that Linux's OOM killer isn't killing ld,
and that your computer is instead freezing. You might want to contact
the Fedora developers about that.

-- Sean Silva

I had this issue while using Linux VMs, and someone suggested just
using gold instead of ld, which resolved my issues. Having some swap
space just in case is also useful.
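
If it helps, here are two common ways to get gold picked up instead of the default BFD ld (a sketch; the ld.gold path below is where Fedora's binutils-gold package puts it and may differ on other distributions):

  # Option 1: if your compiler driver is new enough to understand -fuse-ld=gold
  ../llvm/configure LDFLAGS=-fuse-ld=gold

  # Option 2: make an 'ld' symlink to gold and point the driver at that directory
  mkdir -p ~/gold-bin && ln -s /usr/bin/ld.gold ~/gold-bin/ld
  ../llvm/configure LDFLAGS="-B$HOME/gold-bin"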

I also found it strange that ld wasn’t being killed, and was instead causing my computer to freeze. I made a “while (1) malloc(…)” test program, and it gets killed when I run it. Very strange… I’ll contact someone from Fedora about this.

I’ll also try to use gold instead of the traditional ld. Thank you for the suggestion!

Gold is still not helping in my case, since I leave many programs open at once and swap goes crazy during the last linking steps of the debug binaries…

Funny, though, that it starts swapping before physical memory is actually exhausted; I think that’s why Linux is not killing the process just yet, even though it is flogging the usability of the desktop.

--renato

Hi,
The kernel reserves a chunk of memory for itself as working space to dig itself out
of an out-of-memory situation, so it will start swapping before it actually runs out
of memory, especially if there is a large spike in memory usage just as the
out-of-memory point is approached.

This out-of-memory killer has a long history of perplexing folks, because it
isn't easy to do the right thing in all cases. The kernel might be killing the
wrong processes; that is something to verify one way or the other.

enjoy,
Karen

It makes sense to me that the kernel might be killing the wrong process when ld makes the system run out of memory. Searching the Web for information about the OOM killer, I found this (from http://linux-mm.org/OOM_Killer):

Any particular process leader may be immunized against the oom killer if the value of its /proc/<pid>/oom_adj is set to the constant OOM_DISABLE (currently defined as -17).

I’ll try linking clang again and checking if OOM-killing is disabled for ld. If so, I believe this is likely to be the reason for the problem.
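
The check itself is just a matter of reading the proc files while the link is running, along these lines (a sketch; oom_adj is the older knob, and newer kernels also expose oom_score_adj and the computed oom_score):

  cat /proc/$(pidof ld)/oom_adj      # -17 (OOM_DISABLE) would mean ld is exempt
  cat /proc/$(pidof ld)/oom_score    # the badness score the OOM killer actually compares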

I think the main issue is that it's normally ok to compile using all CPUs
(or even 1.5x the number of CPUs), but it's NOT ok to link with the same
degree of parallelism. I've looked for a solution to this and have not yet
found a decent one. Maybe someone here knows...

You could use a script that only starts ld when no other instance is running
(or something like that), so that even at "make -j16" you'd only have a
handful of LDs running. It works, but it's highly hacky.
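
For the record, one way to do that hack with standard tools is to hide the linker behind a tiny wrapper that takes an exclusive lock, something like this sketch (the lock path is arbitrary, flock ships with util-linux, and the wrapper would be installed as 'ld' somewhere the compiler driver looks before the real one, e.g. via gcc's -B option):

  #!/bin/sh
  # serialize links: block on the lock, then exec the real linker
  exec flock /tmp/llvm-link.lock /usr/bin/ld "$@"

Even at "make -j16", only one ld is then actually linking at any moment; the rest just sit blocked on the lock.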

This is a modern problem: people in 2000 rarely had multi-core machines on
their desks, but nowadays even Pandaboards are SMP. It's not surprising that
Make and the others don't have a general solution to this, but it would be an
interesting (and very useful) project to give Make/SCons/Ninja a -jc for
compilation and a -jl for linking (keeping -j for both and everything else).

cheers,
--renato

Hi,
I think your ideas are good. You might want to fully generalize the issue and
monitor the actual system memory usage and availability when deciding when to
run the linker. You can do that easily in real time:

cat /proc/meminfo

You can also monitor individual process memory usage through /proc/<pid>/...,
but the system-wide stats are more appropriate here.
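
For example, the handful of numbers that matter here can be pulled straight out of /proc (a sketch; the second line assumes a single ld process is running):

  awk '/^(MemFree|SwapFree):/ {print $1, $2, $3}' /proc/meminfo
  grep VmRSS /proc/$(pidof ld)/status    # resident set size of the running linker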

HTH,
Karen

ninja has a "pool" feature which is designed for exactly this purpose.
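
In a hand-written (or hand-tweaked) build.ninja it looks roughly like this sketch, where the pool name, the depth, and the placeholder link command are all arbitrary:

  # cap the number of concurrent links at 2, independent of ninja's -j value
  pool link_pool
    depth = 2

  rule link
    command = c++ $in -o $out
    pool = link_pool

Build edges that use the link rule then never run more than two at a time, no matter how high -j is.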

-- Sean Silva

It's a bit more tricky than it first appears. When linking LLVM, memory usage of ld grows fairly slowly, at about 10MB/s for me (some months ago, when I actually bothered to track it) because it's doing a load of work in between allocations. If you start one linker process, by the time you check again, your build is only using 20MB, so you start another. Check again, and you're using 60MB, so start a third. Check again, now you're using 120MB, still space for a fourth. Then you're at your -j4 limit, so you stop, but these all then grow to 1GB each and you're well over the 2GB that the VM has and you're almost out of swap.

Ideally, the build system would watch the compiler or linker grow and kill some of the processes if the total memory usage grew too high, then restart them, but then the question is which one do you kill? You can take the Linux OOM killer approach, and identify the process with the most unsaved data to kill, or possibly kill the largest one (although this is the one that will have made the most progress, so imposes the greatest time penalty on the build), or the newest one (which may only grow to 50MB). This is a fairly active research area in warehouse-scale computing, as it's the same problem that you encounter with map-reduce / hadoop style jobs.

It's also not obvious what you should count towards the 'used' memory, as some linkers will mmap() all of the object code, which can then be swapped out for free but must go to disk again to swap it back in, slowing things down. On the other hand, some of the accesses through mmap'd files are fairly linear, so swapping them out isn't a problem.

While I'd love to see this solved in a build system, doing it in a sensible way is far from trivial.

David

Wow, ok, that was exactly what I was looking for. I'll probably have to
edit it manually after generating the ninja file from CMake (and re-edit it
after every refresh), but...

There is also some logic on limiting resources (not commands) that Karen
mentions, but the documentation on that is not very good yet.

thanks,
--renato

What you need is a learning algorithm that knows what the consumption will
be. And, as usual, it's generally faster and cheaper to hire an intern to
kill the right jobs for you... ;)

--renato

Hi David,
Thanks for your comments.

You talk about killing off processes and the consequences, but you are missing the
point: the build process should never drive a system to the point where it needs to kill
off processes not associated with the build. Build processes are not mission-critical
real-time processes, and monitoring memory provides a simple way to realize the goal
here. Mmapped memory is not an issue at all if you are using a set of policies
that prevent the system from ever reaching the critical point of an out-of-memory failure.
Sure, the kernel may decide to reclaim such memory more aggressively than other
allocations, but it will do so without loss of data, as long as the
system is not exhausted of memory.

Some simple observations:

1.) If I don't have enough memory on a system, then I would hope the build process
would self terminate with a log to inform me of the memory shortage.

2.) If there is sufficient memory to proceed slowly, then I would hope the build
process would inform me of the limited memory available in the logs. And then I
hope the system would run in a slow mode. If things deteriorate with an increasing
risk of an out-of-memory crash, then I would hope the build process would, as a
last resort, self terminate rather than crash the computer. No processes should
ever be killed except build processes.

3.) If there are sufficient resources, then I would hope the build process will
make every effort to use them to complete the build as quickly as possible. If
the system memory deteriorates during the build, then I would hope the build
process could slow down to mitigate the resource limit and possibly kill some
of the build processes if needed. In no case should the build system force the
kernel to kill off other processes.

4.) What if the computer(s) are running other jobs, and those other jobs need
all the resources while a build is in progress? Well, the sensible policy is
for the build process to yield the resources, terminating the build entirely
as a last resort. Why? Because a build process is never real-time critical.
Other processes may have real-time constraints. The operator of the system(s)
has made an error. The build process should yield the resources as a courtesy,
writing logs to inform the operator of the circumstances. And if the other
processes require resources faster than the build process can respond, then
the whole system may crash -- but it won't be due to the build process.

All that is needed is some knowledge about how much memory these systems require
to do the work they are asked to do. Surely those numbers are known. And, if they
are not, then it shouldn't be too hard to acquire them. And then the build process
simply needs to enforce a reasonable policy on memory requirements. The system
memory stats at /proc/meminfo provide the following metrics that should be
sufficient to implement such a system:

MemTotal: Total usable ram (i.e. physical ram minus a few reserved bits and the
kernel binary code)

MemFree: The sum of LowFree+HighFree

SwapTotal: total amount of swap space available

SwapFree: Memory which has been evicted from RAM, and is temporarily on the disk

SwapCached: Memory that once was swapped out, is swapped back in but still also
is in the swapfile (if memory is needed it doesn't need to be swapped out AGAIN
because it is already in the swapfile. This saves I/O)

Active: Memory that has been used more recently and usually not reclaimed unless
absolutely necessary.

Inactive: Memory which has been less recently used. It is more eligible to be
reclaimed for other purposes

I believe those few metrics can provide all the information the build process
needs to implement a sane policy.
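
As a concrete illustration, each link could be gated behind a wrapper that waits until MemFree plus SwapFree is above some threshold before starting, roughly like this sketch (the wrapper name, the 2 GB threshold, and the 5-second poll interval are all made up for illustration):

  #!/bin/sh
  # memgate: wait until free RAM + free swap exceeds a threshold, then run the command
  THRESHOLD_KB=2000000
  while true; do
      free_kb=$(awk '/^(MemFree|SwapFree):/ {sum += $2} END {print sum}' /proc/meminfo)
      [ "$free_kb" -ge "$THRESHOLD_KB" ] && break
      sleep 5
  done
  exec "$@"

Invoked as "memgate ld ..." (or wrapped around the build system's link rule), this is the simplest version of the policy above: never start a link when memory is already tight.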

Thanks for your comments. Enjoyed reading them.
Karen

I think the main issue is to predict beforehand how many resources will be used.

I agree with David that the OOM killer shouldn’t be killing processes based on stupid metrics, but profiling builds might not be a bad idea.

Profile-guided build systems could learn from very simple metrics (the ones Karen listed) and redistribute the processes (dependency graph allowing) to use the resources more wisely, so the machine is neither flogged nor left idle.

External usage (editors, browsers, etc.) can be factored into the calculation as non-build commands that also consume resources and will have similar usage across builds (a simplification). So knowing which programs are open can also affect how a build proceeds.

cheers,
--renato

Hello,

I made a test, and “cat /proc/$(pidof ld)/oom_adj” returns 0 when ld is running, which means ld can be killed by the oom-killer. Thus, that is not the cause of the problem.

Maybe the kernel isn’t assigning ld a score high enough for it to be killed in an out-of-memory situation, even though it is by far the process consuming the most memory and is run with a niceness of 19; unfortunately, I don’t have time to run further tests to be sure about this.

Anyway, I managed to link clang using gold, so I consider the problem unexplained, but solved.

Thank you very much for the support!