Is there any general way to identify dynamic allocation points?

Hey,

I need to approximate the run-time objects of a program. However, it
is up to the compiler's front-end to decide how to translate C++
new-expressions, which makes it hard to identify dynamic allocation
points in the middle-end (e.g. LLVM).

So I am wondering whether there is any general way to do so. Does the
front-end provide any hint(s) to the middle-end to facilitate this?

In particular, can I distinguish calls arising from new-expressions
from other calls if I use llvm-g++ as the front-end?

PS: Under llvm-g++, a new expression is translated into this function
call:
  call i8* @_Znam(..)

Best,
Xiaolong

Xiaolong Tang wrote:

Hey,

I need to approximate the run-time objects of a program. However, it
is up to the compiler's front-end to decide how to translate C++
new-expressions, which makes it hard to identify dynamic allocation
points in the middle-end (e.g. LLVM).

So I am wondering whether there is any general way to do so. Does the
front-end provide any hint(s) to the middle-end to facilitate this?
  
There probably isn't an ideal way, but there are ways to do it.

First, all of our transforms that look for dynamic allocation sites just "know" the names of the allocators. They recognize malloc, realloc, calloc, etc. They also have to recognize the mangled C++ names of operator new (e.g. _Znwm and _Znam), since libstdc++ isn't compiled to LLVM bitcode anymore (or, at least, I don't think it is).
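
As a rough illustration of the name-based approach, here is a minimal sketch in plain C++. It is not the code LLVM's transforms actually use. The helper name isKnownAllocatorName and the list of names are assumptions made for this example, and the list is not exhaustive. The mangled names follow the Itanium C++ ABI, where the size_t parameter is encoded as `m` (unsigned long) on typical 64-bit targets and `j` (unsigned int) on typical 32-bit targets.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: returns true if `name` is one of the allocator
// symbols this sketch knows about. The list is illustrative only.
bool isKnownAllocatorName(const std::string &name) {
    static const std::unordered_set<std::string> kAllocators = {
        // C allocators
        "malloc", "calloc", "realloc",
        // operator new(unsigned long) and operator new[](unsigned long)
        "_Znwm", "_Znam",
        // operator new(unsigned int) and operator new[](unsigned int)
        "_Znwj", "_Znaj",
        // nothrow variants taking (unsigned long, std::nothrow_t const&)
        "_ZnwmRKSt9nothrow_t", "_ZnamRKSt9nothrow_t",
    };
    return kAllocators.count(name) != 0;
}
```

Inside a pass you would presumably feed it the callee's name for each call instruction (for example via getCalledFunction() and getName()), skipping indirect calls, where there is no callee to name.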

Second, there's a function attribute that marks a function as an allocator (in GCC it is spelled __attribute__((malloc)); check the LLVM Language Reference for how it shows up in the IR). Your transform can search for and recognize functions with this attribute; this works well as long as you don't need to know which parameter of the function holds the size, in bytes, of the memory to be allocated.
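
For the attribute-based approach, a small sketch of what the source side might look like, assuming a GCC-compatible front-end. The function pool_alloc is a hypothetical allocator invented for this example. As far as I know, Clang lowers this attribute to a `noalias` marker on the return value in the IR, but verify that against the output of your own front-end before relying on it.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical custom allocator, named only for illustration.
// __attribute__((malloc)) is a GCC/Clang extension telling the compiler
// that the returned pointer does not alias any other live pointer.
__attribute__((malloc)) void *pool_alloc(std::size_t bytes) {
    // Forward to malloc to keep the sketch short; a real pool allocator
    // would manage its own storage.
    return std::malloc(bytes);
}
```

Note that the attribute says nothing about which argument carries the allocation size, which is the limitation mentioned above.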

-- John T.