libclang: Memory management


I am a little bit puzzled about the memory management in libclang. What I observed is, when run on mid-sized code base, a very high (understandable) memory consumption whereas it seems it is not possible to reclaim all the consumed memory back once we are finished with whatever we have been doing with libclang. For example, this is a code excerpt which should explain my words:

// 1. Create an index
std::vector&lt;CXTranslationUnit&gt; tunits;
CXIndex idx = clang_createIndex(1, 1);

// 2. Parse each file found in the directory and store the corresponding TU in the vector
for (const auto &file : directory) {
    CXTranslationUnit tu;
    if (clang_parseTranslationUnit2(idx, file.path().c_str(), cmd_args, cmd_args_len,
                                    nullptr, 0, CXTranslationUnit_DetailedPreprocessingRecord,
                                    &tu) == CXError_Success)
        tunits.push_back(tu);
}

// 3. Cleanup
for (auto tu : tunits)
    clang_disposeTranslationUnit(tu);
clang_disposeIndex(idx);

If I run this code on the cppcheck code base, which is a mid-sized project (circa 300 C/C++ files), I get the following memory-consumption figures for that particular process (based on htop output):

  • app memory consumption after 2nd step: virt(5763M) / res(5533M) / shr(380M)
  • app memory consumption after 3rd step: virt(4423M) / res(4288M) / shr(65188)

So, as can be seen from the figures, even after the cleanup stage both the virtual and resident memory figures are still very high. It seems the only part that has been reclaimed is the memory associated with the TUs. All the other parsing artifacts, I guess, are still being held somewhere in memory that we can neither access nor flush through the libclang API. I even ran the code under valgrind and no memory leaks were detected (only still-reachable blocks).

Either I am missing something here, or this might pose memory issues for long-running, non-single-shot tools (think of an indexer).

Can anyone comment on this issue?


Benjamin had tried to come up with a repro at some point too, IIRC.

Reproducing the issue is fairly easy. If you like, I could provide a smallish demo which monitors the (system) memory consumption at various steps of execution?

A better question is whether there is anything we can do to improve this. I believe this is a highly important aspect of the library.

Reproducing the issue is fairly easy. If you like, I could provide a smallish demo which monitors the (system) memory consumption at various steps of execution?


I am interested in running the demo and investigating the issue.

A better question is whether there is anything we can do to improve this. I believe this is a highly important aspect of the library.

If it's caused by data that we don't need but that's still hanging around, then yes.


Is this not a general case of “heaps (sometimes/often) don’t shrink once they have grown”? In other words, once the OS or runtime has used a large amount of memory, unless there are other components in the system to “push out” those memory allocations to swap, the memory now freed is still owned by the process that originally allocated it. It’s a bit more complex as well, since both Linux and Windows have multiple ways of allocating memory, with choices made both by the programmer (picking different APIs) and the runtime/OS itself. The memory that has been released is (assuming there’s no real leak) free to be used for other purposes.

The reason for this behaviour is that applications quite often build up large sets of heap allocations, release said allocations, and then allocate large amounts again.

You can try this out by doing something like:

#include &lt;chrono&gt;
#include &lt;iostream&gt;
#include &lt;thread&gt;

int main()
{
    int *p = new int[100000000];            // ~400MB of ints

    for (int i = 0; i &lt; 100000000; i++)     // fill p with data to actually use the memory
        p[i] = i;

    delete[] p;

    std::cout &lt;&lt; "Memory freed, now sleeping..." &lt;&lt; std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(60));

    return 0;
}

If, while it has freed the memory and is sleeping, the memory usage is still some 400MB, you know that the heap is not immediately released to the OS.

Mats, I think what you're saying actually makes sense, and it may well be that we are experiencing exactly this case. It shouldn't be too hard to check, though. If it proves right, then allocating additional memory (as in your example) after we finish with the libclang activities (disposal) should not increase memory consumption; the expectation is that the memory already owned by the process will be reused. I will run an experiment and come back with the results.


That doesn’t necessarily hold. A typical modern memory allocator will have different pools for different-sized allocations and will provide hints to the OS about whether the physical memory backing virtual regions can be reclaimed. If you allocate with a different size from the ones libclang has used, you may end up with new zones being requested from the OS for the allocator to reuse; and if memory pressure is low, the OS may not recycle the physical pages backing freed allocations (and if there is a lot of fragmentation, a very small number of live allocations may hold a lot of pages together).


Alright, I am really not familiar with all the implementation details of memory allocators, but I was aware that it is not as straightforward as my words probably suggested.

Anyhow, with what both David and Mats said in mind, I have run two different experiments:

  1. Do the libclang stuff (create index, parse multiple files, cleanup) and then allocate 100MB on heap
  2. Do the libclang stuff (create index, parse multiple files, cleanup) two times in a row

Results of first experiment were:

Before libclang cleanup: VIRT: 365M RES: 227M SHR: 75M
After libclang cleanup: VIRT: 277M RES: 166M SHR: 63M
Allocated another 100MB: VIRT: 377M RES: 265M SHR: 63M
Deallocated 100MB: VIRT: 277M RES: 166M SHR: 63M

Results of second experiment were:

Before 1st libclang cleanup: VIRT: 365M RES: 227M SHR: 75M
After 1st libclang cleanup: VIRT: 277M RES: 166M SHR: 63M
Before 2nd libclang cleanup: VIRT: 354M RES: 228M SHR: 76M
After 2nd libclang cleanup: VIRT: 277M RES: 173M SHR: 63M

Additional allocations were done as:

char *p = new char[100*1024*1024]; // 100MB

for (int idx = 0; idx &lt; 100*1024*1024; idx += 4096) // touch pages to force them into resident memory
    p[idx] = static_cast&lt;char&gt;(idx);

So it looks like my assumptions were proven wrong (probably because of what David said). My conclusion is that the physical (resident) memory still owned by the process can be (partially) recycled if we utilize it in a similar way to the previous iterations (i.e. the second experiment); otherwise, physical (resident) memory consumption will increase (i.e. the first experiment).

Like I said (and David points out), the rules for when something is reused, and when it isn’t, depend A LOT on how that memory was allocated (and on the details of the allocator implementation itself). If we assume you are using a normal malloc (or the new operator), as is normally done on Linux (e.g. glibc), then I happen to know that for an allocation considered “large” (from memory, large means anything more than 128KB, but it could be some other number), the fresh memory required is not obtained through sbrk (growing the heap) but through mmap (an anonymous mapping). Such a “large” allocation is also released back “immediately”.

I bet that if you redo that code to use, for example, a few million small allocations, it will not grow beyond the original size (but again, it all depends on what is in the free list and how that matches what you are currently allocating).

The way that compilers and parsers work, you quite often create small-ish lists of things, such as argument lists, statements in blocks, all the functions, etc, so there will be lots of small and intermediate size allocations - and some big ones in some places.

So I think we can conclude that there aren’t any BIG leaks, at least… :slight_smile:

Yes, I think there’s not much that can be done from the libclang POV. This is all memory-allocator territory and should be handled by client code if required; nevertheless, it is an interesting observation we should be aware of when building tools.

Thanks everybody for the fruitful discussion and details provided.


BTW: If you are unsatisfied with the default behavior of the memory allocator, you can try to use malloc_trim:


Interesting. I wasn’t familiar with this API. But it looks exactly like what might be a way to tackle this issue.

I suppose one should be careful about using it so as not to hurt the performance of follow-up allocations, but if one knows that a specific use case:

  1. Will consume a disproportionately large amount of memory
  2. Will allocate memory in a way (chunk-wise) that makes reuse by other use cases less probable
  3. Is not recurring

then I would say it is safe to use this method to hand the memory back to the OS.