Find mallocs from a DSNode

Is there some efficient way to find the malloc instructions which point
to the memory represented by a DSNode? Right now the only way I see is
by iterating through the value map in the DSGraph object and finding
every value that points to my DSNode.

A possibly related question: is there a way to tell that the node has
been merged, and if so, what types of nodes went into the merge?

Thanks,
Scott Mikula

Is there some efficient way to find the malloc instructions which point
to the memory represented by a DSNode? Right now the only way I see is
by iterating through the value map in the DSGraph object and finding
every value that points to my DSNode.

You're right, you have to iterate over the program, looking to see what
DSNodes the Malloc instructions point to. This is intentional: keeping
track of this information would cause a huge representational problem.
Sorry :slight_smile:

A possibly related question: is there a way to tell that the node has
been merged, and if so, what types of nodes went into the merge?

Yup, DSNode::NodeType contains flags for the node. There are four flags
that indicate the composition of the node: DSNode::AllocaNode, HeapNode,
GlobalNode, UnknownNode.

When two nodes are merged, their Flags are bitwise or'd together...

-Chris

Chris Lattner wrote:

> Is there some efficient way to find the malloc instructions which point
> to the memory represented by a DSNode? Right now the only way I see is
> by iterating through the value map in the DSGraph object and finding
> every value that points to my DSNode.

You're right, you have to iterate over the program, looking to see what
DSNodes the Malloc instructions point to. This is intentional: keeping
track of this information would cause a huge representational problem.
Sorry :slight_smile:

Fair enough.

> A possibly related question: is there a way to tell that the node has
> been merged, and if so, what types of nodes went into the merge?

Yup, DSNode::NodeType contains flags for the node. There are four flags
that indicate the composition of the node: DSNode::AllocaNode, HeapNode,
GlobalNode, UnknownNode.

When two nodes are merged, their Flags are bitwise or'd together...

I realized that, but what if two nodes of the same type are merged? Their
flags will not indicate that a merge has occurred. I'm concerned about the
case where two malloc nodes are merged, because then if free is called on the
memory represented by that node, you cannot be certain of either malloc's
memory being freed. I suppose after I iterate over the value map I'll see
that two malloc instructions point to the same DSNode, which will give me the
equivalent information.

Thanks,
Scott

> When two nodes are merged, their Flags are bitwise or'd together...

I realized that, but what if two nodes of the same type are merged? Their
flags will not indicate that a merge has occurred. I'm concerned about the
case where two malloc nodes are merged, because then if free is called on the
memory represented by that node, you cannot be certain of either malloc's
memory being freed. I suppose after I iterate over the value map I'll see
that two malloc instructions point to the same DSNode, which will give me the
equivalent information.

Pretty much. At some point, any representation has to merge together
information, because (in general) the program being represented may create
an unbounded number of objects dynamically, and we can obviously only
represent a finite number of them. Looping through the program to see how
many scalars are pointing to the same node seems like a reasonable way to
get the information you need, but remember you also have this case:

for (...)
  X = malloc ...

where _1_ call site can create an unbounded number of objects, all of
which may be potentially merged together...

-Chris

Scott (and the other Project 6 people),

I had given this some thought and, interestingly enough, there are cases
where you *can* be sure a node is freed even if 2 nodes have been merged
before the free. E.g.,

  if (c)
     p1 = malloc...
  else
     p2 = malloc...
  endif
  p3 = phi(p1, p2)
  ...
  free p3

The question is, can you distinguish the above case from the following one
where there is a potential leak:
      p1 = malloc...
  if (c)
     p2 = malloc...
  endif
  p3 = phi(p1, p2)
  ...
  free p3

Of course, some cases (like the loop Chris pointed out below) is obviously
difficult to detect. The frees will usually happen in some other loop, and
it requires different technology to know if the free loop enumerates a
subset, superset, or identical set of array elements compared with the first
loop.

--Vikram