How to describe a pointer that points to All memory(include global memory, heap, stack)?

Hello, could any one point me following question.

e.g:
  void foo(int * p)
  {
        *p = 0;
  }
        Here 'p' may point to all memory location.
        Could you tell me how to represent the POINT TO set of 'p'?
      
        Here is my solution:
        Introduce a memory-class named: Global_Mem, then p pointed to global_mem.
        And the MOD set of '*p=0' is Global_Mem.

But how to present the overlapping alias set:
e.g2:
        extern A[100];
  void foo(int * p, int i)
  {
           *p = 0;
           A[i] = 10;
  }

'p' may point to anywhere. So p may point to A. How to describe the relation?

Hello, could any one point me following question.

Without any context, your question is difficult to answer. Are you building a points-to analysis and wanting to know how an alias analysis might encode the fact that a pointer could alias any other pointer?

-- John T.

Hi, John
        I am building a flow sensitive intra-procedural alias analysis(without interprocedural info).
  So, the first thing I have to consider is where a parameter-pointer or a global-pointer might point to.
  Then I defined several special Virtual Memory Locations: ALL_MEMORY, GLOBAL_MEMORY, STACK_MEMORY, HEAP_MEMORY. ALL_MEMORY contains GLOBAL_MEMORY, STACK_MEMORY, HEAP_MEMORY.
  
  e.g1:
  extern int * q;
  void f(int * p)
  {
      int a;
      *p = 0; //STMT1
      *q = a; //STMT2
  }
  For above case, both p and q pointed to ALL_MEMORY: p->ALL_MEMORY, q->ALL_MEMORY.
  And each STMT has two BitVectors to describe MayDef, MayUse.
  (I think the BitVector must be sparse, otherwise the alias-analysis module need too much memory to allocate for all BitVectors.)
  For STMT1, the MayDef={ALL_MEMORY}, MayUse={}
  For STMT2, the MayDef={ALL_MEMORY}, MayUse={'a'}

Hi, John
        I am building a flow sensitive intra-procedural alias analysis(without interprocedural info).
        So, the first thing I have to consider is where a parameter-pointer or a global-pointer might point to.
        Then I defined several special Virtual Memory Locations: ALL_MEMORY, GLOBAL_MEMORY, STACK_MEMORY, HEAP_MEMORY. ALL_MEMORY contains GLOBAL_MEMORY, STACK_MEMORY, HEAP_MEMORY.

Contains or points to?

ALL_MEMORY should point to ALL_MEMORY, otherwise *ALL_MEMORY !=
ALL_MEMORY, which will break things like linked lists.

In any case, the final representation in GCC of points to anything is
a single bit flag.
We don't keep sparse bit vectors around just to have a single bit set in them.

        e.g1:
        extern int * q;
        void f(int * p)
        {
            int a;
            *p = 0; //STMT1
            *q = a; //STMT2
        }
        For above case, both p and q pointed to ALL_MEMORY: p->ALL_MEMORY, q->ALL_MEMORY.
        And each STMT has two BitVectors to describe MayDef, MayUse.
        (I think the BitVector must be sparse, otherwise the alias-analysis module need too much memory to allocate for all BitVectors.)

Well, actually, this is what led to bdd representations.

GCC uses sparse bitmaps, which work fine for intraprocedural use.
We do a large amount of presolving to discover equivalences before
ever filling in points-to sets.

I'm not sure your mechanism of solving, but if you are constraint
solving, you should special case ALL_MEMORY (and a few others).
There is never a reason to add anything to a points-to set that
contains ALL_MEMORY.

Hi, Daniel, thank you for your advice.
Yes, ALL_MEMORY points to ALL_MEMORY.

We use MD(memory descriptor) to abstract a memory location.
MD contains 4 main fields: id, base, offset, size.
For these special MD (ALL_MEMORY, GLOBAL_MEMORY, STACK_MEMORY, HEAP_MEMORY),
we give them id 1, 2, 3, 4, that means MD1 is ALL_MEMORY, MD2 is GLOBAL_MEMORY, the same goes for the rest.
Then we maintain a BITSET class to encapsulate the 'union', 'intersect', 'diff' etc to simply the operations bewteen special
MD and other general MDs.
  e.g: union operation bewteen special MD and general MD.
  
    Given BITSET a, b;
    a={MD1}
    b={MD10}
    c={MD20}
    
    Here MD10, MD20 are general MD. They described the actually variable.
    
    b.union(c) equal to {MD10, MD20}
    a.union(b) equal to {MD1}
    b.union(a) equal to {MD1}

Our flow-sensitive method is divided into 2 step:
  1. Iterative solve flow-equation to compute POINT-TO for each stmt.
  2. According to POINT-TO info, compute MayDef BITSET and MayUse BITSET for each stmt.

So, I have an question, after step 2 each stmt should own two BITSETs to record MayDef and MayUse.
Is that right or necessary?
How to describe both MayDef, MayUse info if just have a single BITSET in each stmt?

--- 13年3月21日,周四, Daniel Berlin <dberlin@dberlin.org> 写道:

Hi, Daniel, thank you for your advice.
Yes, ALL_MEMORY points to ALL_MEMORY.

We use MD(memory descriptor) to abstract a memory location.
MD contains 4 main fields: id, base, offset, size.
For these special MD (ALL_MEMORY, GLOBAL_MEMORY, STACK_MEMORY, HEAP_MEMORY),
we give them id 1, 2, 3, 4, that means MD1 is ALL_MEMORY, MD2 is GLOBAL_MEMORY, the same goes for the rest.
Then we maintain a BITSET class to encapsulate the 'union', 'intersect', 'diff' etc to simply the operations bewteen special
MD and other general MDs.

Okay.

        e.g: union operation bewteen special MD and general MD.

                Given BITSET a, b;
                a={MD1}
                b={MD10}
                c={MD20}

                Here MD10, MD20 are general MD. They described the actually variable.

                b.union(c) equal to {MD10, MD20}
                a.union(b) equal to {MD1}
                b.union(a) equal to {MD1}

Our flow-sensitive method is divided into 2 step:
        1. Iterative solve flow-equation to compute POINT-TO for each stmt.
        2. According to POINT-TO info, compute MayDef BITSET and MayUse BITSET for each stmt.

So, I have an question, after step 2 each stmt should own two BITSETs to record MayDef and MayUse.
Is that right or necessary?

It's fine, it's not strictly necessary, but it's not bad or harmful.

How to describe both MayDef, MayUse info if just have a single BITSET in each stmt?

Cut the universe in half. Assuming 64 bit ids:

bits 1-9223372036854775808 represent mayuses
bits 9223372036854775809-18446744073709551616 represent maydefs

(If your bitsets are sparse, and allow negative indexes, like gcc's,
you could just use positive and negative numbers).

This would need to be abstracted properly so that the
union/intersect/diff/iterators operated only on the proper portions of
the bitset.

I'm not sure why you'd want to do this as opposed to keeping two
bitsets, using bdds, or using sparse bitmaps.

Daniel,
            

I'm not sure why you'd want to do this as opposed to keeping two bitsets, using bdds, or using sparse bitmaps.
            
It looks like your method is more concisely, using one bitset.
And as you said, if our two bitsets method can also make alias analysis and DU analysis work well, and do not harm compiler performance,
we choose to keep the current design method, then put more attention
into other work. : )

--- 13年3月22日,周五, Daniel Berlin <dberlin@dberlin.org> 写道: