How to track the 'this' pointer when using the clang static analyzer?

Hello everyone!

I want to check that 'this' is bound to x. The code is below:

   struct X {
     X() {
       X* x = this;
     }
   };

I dumped the exploded graph. The live expression for x is below:
(0x66ab5b0,0x667a9c0) x : &SymRegion{reg_$0<struct X * this>}

But I'm not sure how to recognize programmatically that this is a symbolic region for a this pointer. I thought that maybe I could use isa<CXXThisRegion>(), but it turns out that SymbolicRegion and CXXThisRegion don't share the same inheritance chain.

Looking forward to your help!

Xin

In the analyzer, CXXThisRegion represents the cell on the stack that contains the value of the implicit "this" pointer argument during a method call. This is similar to how the VarRegion for a ParmVarDecl is the place where an explicit argument is pushed onto the stack during a function call.

The actual value stored in this region, however - the value of the implicit argument, which is a pointer value that points to the "this" object on the heap or wherever it resides - may be quite arbitrary. In the general case you cannot say, by looking at that value, that it was taken from CXXThisRegion.
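To make the analogy with an explicit argument concrete, here is a small plain C++ sketch (ordinary program code, not analyzer code). The names self(), self_explicit() and check_this_follows_caller() are made up for illustration. It only shows that the value of "this" is whatever address the caller supplies, exactly as with an explicit pointer parameter:

```cpp
#include <cassert>

struct X {
  // Returns the value of the implicit "this" argument.
  const X *self() const { return this; }
};

// Hypothetical free-function counterpart: the same pointer, passed explicitly.
const X *self_explicit(const X *p) { return p; }

void check_this_follows_caller() {
  X on_stack;
  X *on_heap = new X();

  // The implicit argument carries the same value an explicit one would.
  assert(on_stack.self() == self_explicit(&on_stack));
  assert(on_heap->self() == self_explicit(on_heap));

  // Two calls of the same method can see entirely different "this" values.
  assert(on_stack.self() != on_heap->self());

  delete on_heap;
}
```

Nothing in the pointer value itself records that it once travelled through the "this" slot, which is why the analyzer cannot tell either.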

In your case, the method you're looking at is being analyzed "at top frame", which means that the analysis has "started from that method", as opposed to "started elsewhere in some 'foo()' but ended up within this method because this method is being called from that 'foo()'". Because your method is being analyzed at top frame, the value of CXXThisRegion has not changed since the beginning of the analysis - in fact, the language doesn't provide any safe way to overwrite this stack region: you are not allowed to compute &this or assign to this in C++.

It means that the value of CXXThisRegion is denoted by the special kind of symbol that we use to represent values that have been in their regions since the beginning of the analysis - it's SymbolRegionValue "reg_$0<this>", which contains the pointer to the current CXXThisRegion. The actual "this" object is therefore known to reside at the (symbolic) address "reg_$0<this>": it is pointed to by the this pointer and begins at this address. That's pretty much the only thing we know about this region - we're not even sure whether the type of the object is "struct X" or some structure derived from it. Hence the region is represented as a SymbolicRegion around reg_$0<this>. I've recently explained more about symbolic regions: http://lists.llvm.org/pipermail/cfe-dev/2017-June/054084.html
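The chain described above (SymbolicRegion, then its symbol, then the region that symbol came from) can be sketched as a toy model. Everything in the block below is a hypothetical stand-in written for illustration: ToyRegion, ToySymbol and pointsToTopFrameThis() are not analyzer classes or functions. In the analyzer itself the corresponding steps would presumably be SymbolicRegion::getSymbol(), a dyn_cast to SymbolRegionValue, SymbolRegionValue::getRegion() and finally isa<CXXThisRegion>(); please check those against the headers of your Clang version.

```cpp
#include <cassert>

// Toy stand-ins for illustration only; NOT the analyzer's real classes.
struct ToySymbol;

struct ToyRegion {
  enum Kind { CXXThis, Var, Symbolic };
  Kind kind;
  const ToySymbol *symbol;  // set only for Symbolic regions
};

struct ToySymbol {
  enum Kind { RegionValue, Conjured };
  Kind kind;
  const ToyRegion *origin;  // set only for RegionValue: the region whose
                            // initial value this symbol denotes
};

// True if r looks like SymRegion{reg_$N<this>}: a symbolic region whose
// symbol is the initial value of a CXXThis-kind region.
bool pointsToTopFrameThis(const ToyRegion *r) {
  if (!r || r->kind != ToyRegion::Symbolic)
    return false;
  const ToySymbol *s = r->symbol;
  if (!s || s->kind != ToySymbol::RegionValue)
    return false;
  return s->origin && s->origin->kind == ToyRegion::CXXThis;
}

void check_toy_model() {
  // Models &SymRegion{reg_$0<struct X * this>} from the dump in the question.
  ToyRegion thisRegion{ToyRegion::CXXThis, nullptr};
  ToySymbol regValue{ToySymbol::RegionValue, &thisRegion};
  ToyRegion symRegion{ToyRegion::Symbolic, &regValue};
  assert(pointsToTopFrameThis(&symRegion));

  // CXXThisRegion itself is the stack cell, not the object it points to.
  assert(!pointsToTopFrameThis(&thisRegion));
}
```

Note that such a check only recognizes the top-frame situation. When the constructor is reached through a call from another function, the value stored in CXXThisRegion is no longer a SymbolRegionValue, so the check would return false.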

However, if the analysis begins at that outside-ish function...

   void foo() {
     X y; // case 1: calls your constructor
     X *z = new X(); // case 2: calls your constructor again
   }

... then things get different. We have two different CXXThisRegions here. The only difference between them is that they have different parent regions, namely StackArgumentsSpaceRegions that correspond to different stack frames. One stack frame is for the call of the constructor in case 1, another stack frame is for the call of the constructor in case 2. They don't exist simultaneously. The top stack frame doesn't have its own CXXThisRegion because foo() is not a method.

Now, in case 1 the first CXXThisRegion contains a pointer to variable y. The relevant SVal is dumped as "&y". It's a stack variable within the StackLocalsSpaceRegion of the top frame. We know a lot about this region; we even know the exact type of the object. That is the region you're looking for.

In case 2 the second CXXThisRegion contains a pointer to the region constructed by operator new(). It may look like &element{SymRegion{conj_$0<X *>}, X, 0 S32b}, which means that the unknown return value of operator new() was denoted by a SymbolConjured "conj_$0<X *>", and the symbolic segment of heap memory within HeapSpaceRegion that begins at pointer conj_$0<X *> indeed contains an object of type X. This is the region you're looking for.
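The two cases can also be observed at run time in plain C++, which may help in seeing what the analyzer is modelling. The sketch below is ordinary program code, not analyzer code; the static member last_this and the three helper functions are made up for illustration. It records the value of "this" inside the constructor and compares it with the address the caller ends up with:

```cpp
#include <cassert>

struct X {
  // Hypothetical bookkeeping: the "this" value seen by the latest constructor call.
  static const X *last_this;

  X() {
    const X *x = this;  // same pattern as in the question
    last_this = x;
  }
};
const X *X::last_this = nullptr;

// Case 1: "this" inside the constructor is the address of the local variable
// (the analyzer dumps the corresponding value as &y).
bool this_is_stack_variable() {
  X y;
  return X::last_this == &y;
}

// Case 2: "this" inside the constructor is the pointer returned by operator new
// (the analyzer models it as a region based on a conjured symbol).
bool this_is_heap_object() {
  X *z = new X();
  bool same = (X::last_this == z);
  delete z;
  return same;
}

void check_both_cases() {
  assert(this_is_stack_variable());
  assert(this_is_heap_object());
}
```

In both cases the constructor body is identical; only the caller decides what "this" points to, which matches the two different values stored in the two CXXThisRegions.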