Clang 3.6 and trunk, high RSS usage compared to GCC (12.5GB vs. 0.5GB)


I found that after moving to Clang pre-3.6 (git
65d8b4c4998b3a0c20934ea72ede72ef4838a004) and trunk (git
718825a8666acd9ceaab70fc7868332f20e2758f) our internal build machines started
going offline in Jenkins. Clang after the 3.5 release consumes extreme amounts
of memory in some cases.

I have uploaded [1] one of the affected files.

$ g++ -std=c++11 -c -O1 -fPIC vpp_generated.ii -o vpp_generated.o

vmpeak: 582432 KB
rspeak: 504500 KB = ~ 0.5GB

$ clang++ -std=c++11 -c -O1 -fPIC vpp_generated.ii -o vpp_generated.o

vmpeak: 12992076 KB
rspeak: 12820184 KB = ~12.5GB

Disabling the optimizer (-O0) resolves the issue, and RSS usage drops to ~300MB.
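
For anyone who wants to compare the two compilers the same way, here is a minimal sketch of one way to collect a peak-RSS figure. It is an assumption about tooling, not how the numbers above were obtained: it uses Python's standard resource module, and the helper name peak_child_rss_kb is made up for this example.

```python
import resource
import subprocess
import sys


def peak_child_rss_kb(cmd):
    """Run cmd to completion and return (exit_code, peak_rss).

    RUSAGE_CHILDREN reports the largest resident set size among all
    waited-for descendants of this process so far, so start one fresh
    interpreter per measurement. On Linux ru_maxrss is in kilobytes;
    on macOS it is in bytes.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is the command to measure.
    code, rss = peak_child_rss_kb(sys.argv[1:])
    print(f"exit={code} rspeak: {rss} KB")
```

Saved as peak_rss.py (a hypothetical file name), it would be run once per compiler, with the g++ or clang++ command line from above appended as arguments. It only reports resident memory; the vmpeak figure would need a different source.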

I decided to write here directly instead of creating yet another bug report,
since those usually don't get any feedback or comments.

clang version 3.7.0 (git 718825a8666acd9ceaab70fc7868332f20e2758f)
Target: x86_64-unknown-linux-gnu
Thread model: posix


../configure --prefix=<..> --enable-optimized --with-binutils-include=<..>
--disable-terminfo --enable-bindings=none CC=gcc CXX=g++ 'CPP=gcc -E'
'CXXCPP=g++ -E'

- - -

Are you missing --disable-assertions?

Is the problem also apparent on the 3.6 branch or with the 3.6 RC1 build?


I could reproduce this locally with clang 3.5.1 from the ArchLinux packages. I
ran it through my heaptrack [1] tool until ~630MB peak memory were consumed,
which should give enough insight already. The full report can be found
at [2]. Here is an excerpt:

7137759 calls with 475.81MB peak consumption from:
      in /usr/bin/../lib/
    llvm::LazyValueInfo::getConstant(llvm::Value*, llvm::BasicBlock*)
      in /usr/bin/../lib/
      in /usr/bin/../lib/
      in /usr/bin/../lib/
      in /usr/bin/../lib/
      in /usr/bin/../lib/
    clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions
const&, clang::TargetOptions const&, clang::LangOptions const&,
llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::raw_ostream*)
      in /usr/bin/clang
      in /usr/bin/clang
    clang::ParseAST(clang::Sema&, bool, bool)
      in /usr/bin/clang
      in /usr/bin/clang
      in /usr/bin/clang
      in /usr/bin/clang
      in /usr/bin/clang
    cc1_main(char const**, char const**, char const*, void*)
      in /usr/bin/clang
      in /usr/bin/clang




It seems that we had always disabled the optimizer on some files for Clang,
and I had dropped this particular patch from our code base.

I tried to bisect Clang and went as far back as the 3.3 LLVM/Clang release,
which shows the same memory consumption.

I moved to a Fedora 21 based VM. GCC 4.9.2 used <500MB, while Clang from 3.3
through trunk went above 10GB until the VM hung.

We didn't use --disable-assertions. Clang still has a number of issues
with our code base (incl. crashing/asserting). Pre-3.6 Clang is an improvement
(no more aborting), but some issues remain. For those I filed bug reports.
I have yet to test the current 3.6 RC1 with our code base to see whether it
regressed or not.


I'd like to add that I was having similar issues with Clang 3.5, while
compiling test-code inside the QtCreator code base, with optimizer enabled.

I'm not sure if it's known or if I should file a bug report for that.

Relevant issue: Attempting to compile the tst_dumpers.cpp [1] target, with
optimizing flags enabled (-O2, iirc). => Clang runs OOM (with more than 5 GB
Fix: Don't pass -On --

Note: tst_Dumpers::dumper() is a 5 KLOC function, which *might* be an issue :)


I have seen a number of cases in pkgsrc, where the Value Propagation
pass results in a very high CPU and memory use. Attached is a hack that
allows disabling the pass via "-mllvm -disable-value-propagation". It
seems you are hitting the same issue.


prop.diff (2.19 KB)

We also looked into it.

This seems to be coming from LLVM, not Clang code. We did a heap profile
with IgProf [1] in the middle of compilation.

In our case it seems that all allocations are from here:

lookup(Val)[BB] is constantly allocating std::map<AssertingVH<BasicBlock>,

(104 bytes), that slowly eats the memory.

The current impression is that the cache is exploding.
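
To make the "exploding cache" impression concrete, here is a toy model of a cache that keeps one entry per (value, basic block) pair, so that the entry count grows with the product of the two. It is purely illustrative: ToyValueCache, simulate and the "overdefined" placeholder are invented for this sketch and are not LLVM's implementation. The 104-byte node size is the figure from the heap profile above.

```python
from collections import defaultdict

NODE_BYTES = 104  # per-node size reported in the heap profile above


class ToyValueCache:
    """Toy stand-in for a per-value, per-block result cache (not LLVM code)."""

    def __init__(self):
        # value name -> {block name -> cached result}
        self._cache = defaultdict(dict)

    def lookup(self, value, block):
        per_block = self._cache[value]
        if block not in per_block:
            # Every first-time (value, block) query adds one new node.
            per_block[block] = "overdefined"
        return per_block[block]

    def entries(self):
        return sum(len(blocks) for blocks in self._cache.values())


def simulate(num_values, num_blocks):
    """Query every value in every block and return the number of cached nodes."""
    cache = ToyValueCache()
    for v in range(num_values):
        for b in range(num_blocks):
            cache.lookup(f"v{v}", f"bb{b}")
    return cache.entries()


if __name__ == "__main__":
    nodes = simulate(1000, 500)
    print(f"{nodes} nodes, at least {nodes * NODE_BYTES} bytes of payload")
```

Under this model, a single very large function with many values and many blocks is the worst case, since nothing is evicted while the function is being processed.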



I reduced the -O1 passes to the following:

opt -lazy-value-info -correlated-propagation -reassociate -loops -lcssa
-loop-rotate -licm -loop-unswitch -scalar-evolution -loop-deletion
-function_tti -loop-unroll -memdep -memcpyopt -sccp -lazy-value-info
-correlated-propagation -memdep -dse -adce a.bc > /dev/null

It seems that removing or rearranging the passes makes the problem go away.
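
A reduction like the one above can be automated. The following is a minimal sketch of a greedy approach, assuming a callback that reports whether a candidate pass list still shows the blow-up. The names minimize_passes and reproduces are hypothetical; in practice the callback would run opt on a.bc with the candidate flags under a memory limit.

```python
def minimize_passes(passes, reproduces):
    """Greedily drop passes while reproduces(candidate) stays True.

    passes:     list of pass flags in pipeline order, e.g. ["-licm", "-sccp"].
    reproduces: hypothetical callback taking a list of flags and returning
                True if that pipeline still triggers the problem.
    The relative order of the remaining passes is never changed.
    """
    current = list(passes)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if candidate and reproduces(candidate):
                # The pass at index i was not needed; keep the shorter list
                # and retry the same index, which now holds the next pass.
                current = candidate
                changed = True
            else:
                i += 1
    return current
```

The result is only minimal in the sense that no single remaining pass can be removed. Since the problem here is sensitive to pass order, the outcome may differ depending on which passes are dropped first.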