Proposal: Type-Aware Memory Profiling

Hi cfe-dev,

I’d like to propose a way to do type-aware memory profiling in Clang. Nico Weber presented it on my behalf as one of the lightning talks at the DevMtg. The slides and video were recently published.

An object’s type information is useful for profiling memory usage in a process, but it is hard to obtain in C++ programs. Our proposal is to make it easy with the compiler’s help. We implemented an early prototype, which has been working in the Chromium project for a couple of months.

What do you think? I’d appreciate your opinions. The details are discussed in this document: https://docs.google.com/document/d/1zyJ3FJhheJ5iSxYaMXARkZcR61tx6r44euMJbWw8jFY/edit

tl;dr: we did it with user-defined intercepting functions for operator new and delete. If a developer defines a function op_new_intercept (and a compiler option is given!), that function intercepts calls to operator new; op_delete_intercept does the same for operator delete.

An example follows:

$ cat typeaware.cc

#include <stdio.h>
#include <cstddef>
#include <typeinfo>

struct Klass {
  int x, y, z;
};

// Called after operator new when the option is given.
void* op_new_intercept(
    void* address, std::size_t size, const std::type_info& type) {
  printf("Allocated %zu bytes for %s at %016p.\n",
         size, type.name(), address);
  return address;
}

// Called before operator delete when the option is given.
void* op_delete_intercept(
    void* address, std::size_t size, const std::type_info& type) {
  printf("Deallocated %zu bytes for %s at %016p.\n",
         size, type.name(), address);
  return address;
}

int main() {
  int *i;
  Klass *k;
  i = new int(3);
  k = new Klass();
  delete k;
  delete i;
  return 0;
}

$ clang++ -fintercept-allocation-functions typeaware.cc

$ ./a.out

Allocated 4 bytes for i at 0x000000022df010.
Allocated 12 bytes for 5Klass at 0x000000022df030.
Deallocated 12 bytes for 5Klass at 0x000000022df030.
Deallocated 4 bytes for i at 0x000000022df010.
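
To illustrate how such hooks could feed a profiler, here is a minimal sketch that keeps a per-type count of live bytes instead of printing. It is an illustration only, not the prototype's code: live_bytes() and demo_live_klass_bytes() are hypothetical names, and because a stock compiler does not insert the hook calls, the demo function invokes them by hand where the patched compiler would presumably insert them.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <typeinfo>

struct Klass {
  int x, y, z;
};

// Hypothetical bookkeeping: live bytes keyed by the (mangled) type name.
std::map<std::string, std::size_t>& live_bytes() {
  static std::map<std::string, std::size_t> table;
  return table;
}

// Same signature as in the example above; records instead of printing.
void* op_new_intercept(void* address, std::size_t size,
                       const std::type_info& type) {
  live_bytes()[type.name()] += size;
  return address;
}

void* op_delete_intercept(void* address, std::size_t size,
                          const std::type_info& type) {
  std::size_t& bytes = live_bytes()[type.name()];
  assert(bytes >= size);  // a delete should never outrun its news
  bytes -= size;
  return address;
}

// Simulates by hand the hook calls the compiler option would insert.
// Returns the live Klass bytes while one of two objects is still alive.
std::size_t demo_live_klass_bytes() {
  Klass* a = new Klass();
  op_new_intercept(a, sizeof(Klass), typeid(Klass));
  Klass* b = new Klass();
  op_new_intercept(b, sizeof(Klass), typeid(Klass));

  op_delete_intercept(b, sizeof(Klass), typeid(Klass));
  delete b;

  std::size_t live = live_bytes()[typeid(Klass).name()];

  op_delete_intercept(a, sizeof(Klass), typeid(Klass));
  delete a;
  return live;
}
```

Calling demo_live_klass_bytes() returns sizeof(Klass), since one object is still alive at the point of measurement. Note that std::map allocates memory itself; a real hook built this way would probably need a guard against re-entering the hook from its own allocations.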

Interesting… Why would you not preserve the context information as well?

Regards,
Ramneek

Thanks for your interest.

By “the context information”, do you mean something like the call site (__FILE__ and __LINE__)? If so, yes, preserving it may be useful, and it might be a good idea to include it if this gets committed to the Clang tree. I didn’t implement it in the prototype simply because I didn’t need it: we’re using call stacks instead.
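
For concreteness, preserving the call site could look roughly like the sketch below. Everything here is hypothetical and not part of the prototype: op_new_intercept_at is an invented variant of the hook with two extra parameters, and the TRACKED_NEW macro only stands in for what the compiler might pass automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <typeinfo>

struct Klass {
  int x, y, z;
};

// Hypothetical record of the most recent allocation, including call site.
struct AllocationRecord {
  const void* address;
  std::size_t size;
  const char* type_name;
  const char* file;
  int line;
};

static AllocationRecord last_record = {nullptr, 0, "", "", 0};

// Hypothetical variant of op_new_intercept that also receives the call site.
void* op_new_intercept_at(void* address, std::size_t size,
                          const std::type_info& type,
                          const char* file, int line) {
  assert(file != nullptr);
  last_record = {address, size, type.name(), file, line};
  std::printf("Allocated %zu bytes for %s at %s:%d.\n",
              size, type.name(), file, line);
  return address;
}

// Stand-in for compiler support: forwards __FILE__ and __LINE__ of the
// line on which the macro is used.
#define TRACKED_NEW(T) \
  static_cast<T*>(op_new_intercept_at(new T(), sizeof(T), typeid(T), \
                                      __FILE__, __LINE__))

// Returns 0 when the recorded line matches the line of the allocation.
int demo_call_site_line() {
  int expected_line = __LINE__ + 1;
  Klass* k = TRACKED_NEW(Klass);
  delete k;
  return last_record.line - expected_line;
}
```

A call site identifies only the immediate allocation expression, whereas a call stack also shows how execution reached it, which is why the two are complementary rather than interchangeable.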

Hello,

Does anyone have further opinions on this? If not, I’ll create a bug entry and start working on it soon.

On Mac, you should hook this into DTrace and let DTrace handle all the
data collection and filtering. This would save you the trouble of
having to relink the browser, and would have basically no slowdown
when disabled (just the overhead of argument shuffling + call + ret
(and even this could be reduced)). That would be really convenient so
that you could e.g. trace the memory allocation patterns of only a
particular webpage, or easily iterate (like, 5 sec. iteration time) an
analysis, filtering/aggregating on type, size, call stack, pid,
thread, or anything else that DTrace can look at + whatever else you
expose to it.

-- Sean Silva

Hi Sean,

Thanks for your suggestion. DTrace sounds reasonable. I may implement the general approach described above first, and will try DTrace as a next step.

Hi cfe-dev,

I wrote a patch that implements it and uploaded it to http://llvm-reviews.chandlerc.com/D298.

Could anyone point me to good documentation on Clang’s tests? My patch doesn’t have tests yet, and I’m not sure how to add new tests in Clang’s style. Ultimately, I’d like to make sure that https://gist.github.com/4535834 works as its comments describe.

For this patch CodeGen tests are needed.
test/CodeGen/builtins-multiprecision.c is a good example.

Driver tests are also needed to ensure that the driver passes the flag
to the frontend. test/Driver/retain-comments-from-system-headers.c is
about as much as needed -- just copy it and change the flag.

Sema tests might also be needed, but I cannot think of any.

Dmitri

Thanks for your comments, Dmitri.

I just updated the patch and added test files that don’t work yet. I’ll add documentation and working tests soon.
(It may take some time since I’m new to Clang/LLVM testing.)

Hello, Dmitri and cfe-dev.

I’ve finally added documentation and tests to the patch at http://llvm-reviews.chandlerc.com/D298. I think it’s almost ready for formal code review!

Could someone recommend good reviewers for the patch? I’m not sure whom to ask. It changes code around C++ memory allocation, especially in:

  • AST/DeclCXX, ExprCXX
  • CodeGen/CGCXXABI, CGClass, CGExprCXX
  • Sema/SemaDeclCXX, SemaExprCXX

Thanks in advance.