What should clang do when memory is exhausted?

In http://llvm.org/docs/CodingStandards.html#ci_rtti_exceptions I read
that, by design, exceptions are not used in the clang code base.

Despite this, the source contains many unguarded calls to operator
new. Does this mean that when memory runs out, the compiler executable
is meant to fail in an arbitrary way?

Below is a terminal transcript showing a simulated example:

$ ( ulimit -v 66000 ; clang -cc1 -ast-dump bzip2.c )
0 _clang 0x000000000247ad66
1 _clang 0x000000000247ab62
2 libc.so.6 0x00007f1c0974e420
3 _clang 0x0000000000b9e33c
4 _clang 0x00000000019cb626 clang::SrcMgr::ContentCache::getBuffer(clang::DiagnosticsEngine&, clang::SourceManager const&, clang::SourceLocation, bool*) const + 398
5 _clang 0x000000000198d748
6 _clang 0x000000000198bc01 clang::Preprocessor::EnterSourceFile(clang::FileID, clang::DirectoryLookup const*, clang::SourceLocation) + 291
7 _clang 0x00000000019a4e42 clang::Preprocessor::EnterMainSourceFile() + 166
8 _clang 0x0000000000f14da0 clang::ParseAST(clang::Sema&, bool) + 271
9 _clang 0x0000000000bf3abd clang::ASTFrontendAction::ExecuteAction() + 265
10 _clang 0x0000000000bf3713 clang::FrontendAction::Execute() + 245
11 _clang 0x0000000000bcd9ff clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 673
12 _clang 0x0000000000ba17c8 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 957
13 _clang 0x0000000000b9074c cc1_main(char const**, char const**, char const*, void*) + 999
14 _clang 0x0000000000b9cd8b main + 496
15 libc.so.6 0x00007f1c0973930d __libc_start_main + 237
16 _clang 0x0000000000b8f9c9
Stack dump:
0. Program arguments: clang -cc1 -ast-dump bzip2.c
1. <eof> parser at end of file

A quick grep of the LLVM and Clang source code shows that the “operator new” is often overloaded for the base types of the hierarchies, so most of the new calls in the source are not calls to the global “operator new” provided by the compiler implementation but instead redirect to specialized versions. For example, look into ASTContext.h.

I recall a comment from Doug saying that the custom “operator new” never returns 0, so checking is not necessary. It seems that most calls end up in llvm/Support/Allocator.h, and there only seem to be some asserts there.
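To make the idea concrete, here is a minimal sketch of a class-specific allocation path that never returns null. Arena, fatalOutOfMemory, Node and makeNode are hypothetical names invented for this illustration; they do not reproduce the interface of llvm/Support/Allocator.h or of ASTContext. The sketch assumes that no object needs an alignment stricter than alignof(std::max_align_t).

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical helper: report the failure and stop, without allocating.
[[noreturn]] static void fatalOutOfMemory() {
  std::fputs("fatal: arena exhausted\n", stderr);
  std::abort();
}

// Hypothetical fixed-size bump allocator (not the LLVM one).
class Arena {
public:
  explicit Arena(std::size_t capacity)
      : base_(static_cast<char *>(std::malloc(capacity))),
        capacity_(capacity) {
    if (!base_)
      fatalOutOfMemory();
  }
  Arena(const Arena &) = delete;
  Arena &operator=(const Arena &) = delete;
  ~Arena() { std::free(base_); }

  // Never returns null: the process stops if the request does not fit.
  // `align` must be a power of two.
  void *allocate(std::size_t size, std::size_t align) {
    const std::size_t offset = (used_ + align - 1) & ~(align - 1);
    if (offset > capacity_ || size > capacity_ - offset)
      fatalOutOfMemory();
    used_ = offset + size;
    return base_ + offset;
  }

  std::size_t bytesUsed() const { return used_; }

private:
  char *base_;
  std::size_t capacity_;
  std::size_t used_ = 0;
};

// Placement form: `new (arena) T(...)` takes its storage from the arena
// instead of the global operator new.
void *operator new(std::size_t size, Arena &arena) {
  return arena.allocate(size, alignof(std::max_align_t));
}

// Matching placement delete, used only if a constructor throws.
void operator delete(void *, Arena &) noexcept {}

struct Node {
  int value;
  Node *next;
};

// Callers do not check the result, because allocate() cannot return null.
Node *makeNode(Arena &arena, int value, Node *next) {
  return new (arena) Node{value, next};
}
```

With this shape, an out-of-memory condition in the arena path ends in a defined stop rather than a null dereference. It does nothing for allocations that bypass the arena.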

– Matthieu.

I've personally verified (in another debug session) that llvm::DenseMap
uses the global operator new, and it is not the only one.

Furthermore, clang uses std::string, std::vector, std::set, etc.

Do you think that the intention is to never use global operator new?
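One way to check this kind of claim without a debugger is to replace the global operator new with a counting version and watch a standard container allocate. The following is only a sketch: g_globalNewCalls, g_sink and countVectorAllocations are hypothetical names, and the replacement is simplified (it does not call the installed new handler in a loop as a full replacement would).

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical counter, incremented on every call to the global operator new.
static std::size_t g_globalNewCalls = 0;

// Replacement for the global operator new: count the call, then use malloc.
void *operator new(std::size_t size) {
  ++g_globalNewCalls;
  if (void *p = std::malloc(size != 0 ? size : 1))
    return p;
  throw std::bad_alloc();
}

// Matching replacements so that memory from malloc is released with free.
void operator delete(void *p) noexcept { std::free(p); }
void operator delete(void *p, std::size_t) noexcept { std::free(p); }

// Keeps the buffer observable so the allocation cannot be optimized away.
static void *volatile g_sink = nullptr;

// Returns how many global operator new calls one std::vector reserve makes.
std::size_t countVectorAllocations() {
  const std::size_t before = g_globalNewCalls;
  std::vector<int> numbers;
  numbers.reserve(1024);
  g_sink = numbers.data();
  return g_globalNewCalls - before;
}
```

A non-zero result shows that the container's default allocator reaches the global operator new, so a class-specific operator new does not cover it.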

No, I don’t, though I would have expected the usage to be less pervasive :-)

I suppose your earlier diagnosis was right then, and that an out-of-memory condition is just supposed to fail badly. It is a rare condition nowadays anyway, and there is not much one can do apart from stopping.

– Matthieu

Stopping is fine, but I think that an arbitrary failure is rather rough
and might make debugging very unfriendly.

I'd add that a compiler belongs to the class of applications that might
need an arbitrarily large amount of memory, so the problem is not a
theoretical one...

Seems okay for the compiler proper. For libclang, we'd probably just want to trap to kill the current thread.

  - Doug

I'd leave the responsibility for calling set_new_handler to the
application using libclang (the clang compiler is not an exception).

The important thing for libclang is to document that, with the default
compilation flags, the library does not propagate operator new
exceptions: the programmer is likely unaware that this might lead to
disasters (read: arbitrary execution, i.e. in unfortunate cases,
miscompilation of a safety-critical application) when less memory is
available than needed.

In http://llvm.org/docs/CodingStandards.html#ci_rtti_exceptions I read
that, by design, exceptions are not used in the clang code base.

Despite this, the source contains many unguarded calls to operator
new. Does this mean that when memory runs out, the compiler executable
is meant to fail in an arbitrary way?

I'd correct myself by replacing "fail in an arbitrary way" with
"execute arbitrary code".

May I suggest using set_new_handler (C++ lib.set.new.handler) in the
clang executables to get a graceful exit when memory is exhausted?

Is it an acceptable solution?
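As a rough sketch of what such a handler could look like (reportOutOfMemory and installOutOfMemoryHandler are hypothetical names, not taken from the clang sources), assuming the executable only wants to print a message and stop:

```cpp
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical handler: called by operator new when an allocation fails.
// It must not return normally unless more memory has been made available,
// so it prints a fixed message (no allocation needed) and terminates.
[[noreturn]] void reportOutOfMemory() {
  std::fputs("error: out of memory\n", stderr);
  std::_Exit(EXIT_FAILURE);
}

// Installs the handler and returns the previously installed one, which is
// a null pointer if none was set.
std::new_handler installOutOfMemoryHandler() {
  return std::set_new_handler(reportOutOfMemory);
}
```

The executable would call installOutOfMemoryHandler() once at the start of main. std::_Exit skips destructors and atexit handlers, which may themselves need memory; std::exit is the alternative if buffered output has to be flushed. Note that the handler only covers allocations that go through operator new, not direct calls to malloc.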

Seems okay for the compiler proper. For libclang, we'd probably just want to trap to kill the current thread.

I'd leave the responsibility for calling set_new_handler to the
application using libclang (the clang compiler is not an exception).

Sure.

The important thing for libclang is to document that, with the default
compilation flags, the library does not propagate operator new
exceptions: the programmer is likely unaware that this might lead to
disasters (read: arbitrary execution, i.e. in unfortunate cases,
miscompilation of a safety-critical application) when less memory is
available than needed.

It's good to document this, but a safety-critical application should not be using Clang in-process. There are too many easy ways to crash Clang.

A bit late but...
...isn't that still ugly?
One of the major reasons Xcode 4 was completely unusable on 32-bit machines with C++ projects was that libclang, running indexing in the background, would quickly hog all the RAM and simply crash...
I'm in no way a fan of blind use of exceptions (and the bloat they add to the binaries), but if there is no other ("easy") way to make clang properly handle out-of-memory errors, then I guess there is no other way... or is there?

I’m not going to discuss the details of Xcode indexing, but in general there are other solutions to this problem. Throttling back the amount of concurrent work in low-memory situations is another potential direction (e.g., parsing fewer translation units simultaneously in low memory situations). I agree that recovering from low memory situations is also desirable, but trying to avoid it gracefully in a proactive fashion is probably far simpler. If we are in a low memory situation, we are likely going to swap, and performance hits a wall even if we could recover.
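The throttling idea can be sketched as a small policy function. Everything here is hypothetical: the name maxConcurrentParses, its parameters, and the assumption that the client can estimate both the available memory and the memory needed per translation unit.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical policy: decide how many translation units to parse at once.
// availableBytes        - memory the client believes it can still use
// estimatedBytesPerUnit - rough peak memory of one parse (an assumption)
// hardwareThreads       - upper bound from the number of cores
std::size_t maxConcurrentParses(std::size_t availableBytes,
                                std::size_t estimatedBytesPerUnit,
                                std::size_t hardwareThreads) {
  if (estimatedBytesPerUnit == 0 || hardwareThreads == 0)
    return 1;
  const std::size_t byMemory = availableBytes / estimatedBytesPerUnit;
  // Always allow one parse so that progress is still possible.
  return std::max<std::size_t>(1, std::min(byMemory, hardwareThreads));
}
```

The function never returns less than one, so a single parse can still hit the memory limit; the policy reduces the likelihood of exhaustion rather than removing it.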