std::vector vs. llvm::SmallVector - why does one leak?

I was rewriting some code to use llvm::BumpPtrAllocator and std::vector, and once I’d finished, LeakSanitizer began reporting a memory leak. I tried using llvm::SmallVector, and the leak went away. I’m not very good at C++, so I was hoping I could get help understanding why.

Originally, I had classes like this:

struct MyObject {};
struct MyContainer {
  std::vector<std::unique_ptr<MyObject>> objects;
  MyContainer(std::vector<std::unique_ptr<MyObject>> &objects)
    : objects(std::move(objects)) {}
};

This worked great, no leaks – but my header files became unwieldy, because the std::unique_ptr above can’t be used with a forward declaration of MyObject.

So to avoid this problem, I tried allocating my objects using llvm::BumpPtrAllocator. I wrote a simple test case for this:

struct MyVector {
  std::vector<MyObject *> objects;
  MyVector(std::vector<MyObject *> &objects)
      : objects(std::move(objects)) {}
};

TEST(AllocatorTest, Vector) {
  llvm::BumpPtrAllocator allocator;

  MyObject *myObject = new (allocator.Allocate<MyObject>()) MyObject();
  std::vector<MyObject *> objects;
  objects.push_back(myObject);

  MyVector *vector = new (allocator.Allocate<MyVector>()) MyVector(objects);

  // Destructor must manually be called here, otherwise LeakSanitizer reports:
  //   Direct leak of 8 byte(s) in 1 object(s) allocated from:
  //     #0 0x4f6488 in operator new(unsigned long) llvm-project/compiler-rt/lib/asan/
  //     #1 0x5006a3 in void std::vector<MyObject*, std::allocator<MyObject*> >::_M_realloc_insert<MyObject*>(__gnu_cxx::__normal_iterator<MyObject**, std::vector<MyObject*, std::allocator<MyObject*> > >, MyObject*&&) /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/vector.tcc:427:33
  //     #2 0x4f96e6 in AllocatorTest_Vector_Test::TestBody() examples/unittests/llvm/Support/AllocatorTest.cpp:45:11
}

I understand that one of the downsides of using llvm::BumpPtrAllocator is that destructors aren’t called when the allocator’s slabs are deallocated. And I guess I understand how std::vector::push_back would allocate 8 bytes of memory to store a pointer, MyObject *. However, I don’t quite understand why LeakSanitizer is perfectly happy with this:

struct MySmallVector {
  llvm::SmallVector<MyObject *, 2> objects;
  MySmallVector(llvm::SmallVectorImpl<MyObject *> &objects)
      : objects(std::move(objects)) {}
};

TEST(AllocatorTest, SmallVector) {
  llvm::BumpPtrAllocator allocator;

  MyObject *object = new (allocator.Allocate<MyObject>()) MyObject();
  llvm::SmallVector<MyObject *, 2> objects;
  objects.push_back(object);

  MySmallVector *smallVector =
      new (allocator.Allocate<MySmallVector>()) MySmallVector(objects);
#pragma unused(smallVector)
}

My plan is to keep reading through the llvm::SmallVector internals to understand why this doesn’t result in the same memory leak of 8 bytes to store MyObject *, but so far from what I can see, the llvm::SmallVector destructor would have to be called for it to free the slabs that store those 8 bytes, and I don’t see how that would be called if MySmallVector’s default destructor isn’t called.

Does anyone know why my last example with llvm::SmallVector doesn’t leak memory, but my second-to-last example with std::vector does? (Apologies in advance if this is just a simple C++ question that doesn’t have much to do with LLVM.)

Are you sure you want to use BumpPtrAllocator? It’s hard to use.

In your example you won’t be seeing a leak if you stay under the inline capacity of a SmallVector. All of the memory is then contained in the BumpPtrAllocator, which frees its memory on destruction but doesn’t destroy anything contained in it. In your case you’ll see a leak if you add more than 2 pointers to the SmallVector, because it then moves its elements to a separate heap buffer that only its destructor frees.

std::vector always keeps its elements in a separate heap buffer, so you’ll get a memory leak if there are one or more elements in the vector.
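
To make the difference concrete, here is a minimal sketch that uses only the standard library. Everything in it is an assumption made up for illustration: CountingAllocator, ToySmallVector, and liveBuffersWithoutDestructor are hypothetical names, the stack buffer stands in for memory handed out by a bump allocator, and ToySmallVector is a toy with inline storage, not how llvm::SmallVector is actually implemented. It counts how many heap buffers are still alive at the point where the arena would be thrown away without running the container's destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Number of heap buffers allocated and not yet freed.
static int liveHeapBuffers = 0;

// Hypothetical allocator: forwards to std::allocator and counts live buffers.
template <typename T> struct CountingAllocator {
  using value_type = T;
  CountingAllocator() = default;
  template <typename U> CountingAllocator(const CountingAllocator<U> &) {}
  T *allocate(std::size_t n) {
    ++liveHeapBuffers;
    return std::allocator<T>().allocate(n);
  }
  void deallocate(T *p, std::size_t n) {
    --liveHeapBuffers;
    std::allocator<T>().deallocate(p, n);
  }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T> &, const CountingAllocator<U> &) {
  return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T> &, const CountingAllocator<U> &) {
  return false;
}

// Toy container: the first N elements live inside the object itself, and only
// growth past N takes a heap buffer. Restricted to trivially copyable types.
template <typename T, std::size_t N> class ToySmallVector {
  static_assert(std::is_trivially_copyable<T>::value, "toy: trivial T only");
  T inlineStorage[N];
  T *data = inlineStorage;
  std::size_t length = 0, capacity = N;

public:
  ToySmallVector() = default;
  ToySmallVector(const ToySmallVector &) = delete;
  ToySmallVector &operator=(const ToySmallVector &) = delete;
  ~ToySmallVector() {
    if (data != inlineStorage)
      CountingAllocator<T>().deallocate(data, capacity);
  }
  void push_back(const T &value) {
    if (length == capacity) {
      // Out of room: copy the elements into a heap buffer twice as large.
      T *bigger = CountingAllocator<T>().allocate(capacity * 2);
      for (std::size_t i = 0; i < length; ++i)
        new (&bigger[i]) T(data[i]);
      if (data != inlineStorage)
        CountingAllocator<T>().deallocate(data, capacity);
      data = bigger;
      capacity *= 2;
    }
    new (&data[length++]) T(value);
  }
};

struct MyObject {};
using StdVec = std::vector<MyObject *, CountingAllocator<MyObject *>>;
using ToyVec = ToySmallVector<MyObject *, 2>;

// Placement-constructs a Vec in raw storage, pushes `count` pointers, and
// returns how many heap buffers would leak if the storage were dropped
// without calling ~Vec(). The destructor is run afterwards so that the
// sketch itself does not leak.
template <typename Vec> int liveBuffersWithoutDestructor(int count) {
  liveHeapBuffers = 0;
  alignas(Vec) unsigned char arena[sizeof(Vec)];
  static MyObject object;
  Vec *v = new (arena) Vec();
  for (int i = 0; i < count; ++i)
    v->push_back(&object);
  int live = liveHeapBuffers;
  v->~Vec();
  return live;
}
```

With libstdc++ or libc++, liveBuffersWithoutDestructor<StdVec>(1) reports one live buffer, while liveBuffersWithoutDestructor<ToyVec>(2) reports none and liveBuffersWithoutDestructor<ToyVec>(3) reports one. In every case the explicit v->~Vec() call is what releases the buffer, which is why skipping the destructor on an arena-allocated object leaks as soon as the container owns heap memory.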


In your example you won’t be seeing a leak if you stay under the inline capacity of a SmallVector.

I see, thank you! Wow, I’m glad I asked, I would not have guessed that. Probably I’d end up seeing leaks in my program (not as simple as the examples) with larger inputs. That would have been confusing.

Are you sure you want to use BumpPtrAllocator? It’s hard to use.

One reason I wanted to use it was to learn more about it and its pitfalls, so it’s paying off so far :stuck_out_tongue_closed_eyes:

But now that you mention it, I wonder what my alternatives are…

  • I could manually call operator delete within my program’s destructors, I suppose. Very simple, but I imagine prone to bugs.
  • I could also try using llvm::SpecificBumpPtrAllocator, which does call destructors – although I do in fact allocate many different types of objects, so I’d need many of these.
  • std::shared_ptr<T> seems to have the same drawback as std::unique_ptr<T> – if I use them as members of my classes, those classes will need to pull in the entire definition of T. This wouldn’t work for, say, Clang, because clang::DeclStmt requires clang::Decl, and Decl.h defines clang::FunctionDecl which has a clang::Stmt member. Without forward declarations, Clang would need to split up its headers and make sure they’re not included in the wrong order (as far as I can tell).

You can use incomplete types with std::unique_ptr; however, you must declare MyContainer’s destructor in the class and define it in the .cpp file (where MyObject is already a complete type).
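
A rough sketch of that layout, squeezed into one file for brevity. The header/source split marked in the comments is hypothetical, the destruction counter exists only so the effect can be observed, and the constructor takes the vector by value here rather than by reference as in the original post.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// ---- MyContainer.h (hypothetical header) ----
struct MyObject; // forward declaration only; MyObject is incomplete here

struct MyContainer {
  std::vector<std::unique_ptr<MyObject>> objects;
  explicit MyContainer(std::vector<std::unique_ptr<MyObject>> objects);
  ~MyContainer(); // declared here, defined where MyObject is complete
};

// ---- MyContainer.cpp (hypothetical source file) ----
static int destroyedObjects = 0; // illustration only: counts destructor runs

struct MyObject {
  ~MyObject() { ++destroyedObjects; }
};

MyContainer::MyContainer(std::vector<std::unique_ptr<MyObject>> objects)
    : objects(std::move(objects)) {}

// Defined after MyObject is complete, so the unique_ptr deleter can be
// instantiated here.
MyContainer::~MyContainer() = default;

// Builds a container holding two objects, lets it go out of scope, and
// returns how many MyObject destructors ran.
int destroyedAfterScope() {
  destroyedObjects = 0;
  {
    std::vector<std::unique_ptr<MyObject>> objects;
    objects.push_back(std::make_unique<MyObject>());
    objects.push_back(std::make_unique<MyObject>());
    MyContainer container(std::move(objects));
  }
  return destroyedObjects;
}
```

The point of the split is that code including only the header never has to instantiate the deleter for MyObject. Defining the constructor out of line as well avoids the same problem there, since a constructor may need to destroy already-built members if it throws.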


Thanks all! This is super helpful. I’m currently trying to use std::unique_ptr and separating out an explicit destructor from the class’s declaration.

One question: what could I change about my original post’s AllocatorTest.SmallVector test to prevent it from leaking memory? Is explicitly calling the ~MySmallVector destructor the correct thing to do?