How can I get the MemRegion representing an index of an array in Clang Static Analyzer?

Hi all,

I want to get the MemRegion representing an index of an array in my checker. For example:

1 void func() {
2   int a, b, arr[10][10];
3   a = 2;
4   b = 3;
5   arr[a][1] = 3;
6 }

In the checkLocation() callback, when checking the store operation at line 3 ‘arr[a][1] = 3;’, I want to get the MemRegion and SVal of the variable a, which is used as the array index. Is there any way to get the MemRegion and SVal corresponding to a?

My previous post on a related problem is at http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-April/036205.html. Unfortunately, it received no reply. I have been stuck on this problem for several weeks, so I really need help.

Any help would be greatly appreciated.

Hi, Arthur. I don’t understand what you mean about the MemRegion for ‘a’. ‘a’ is an integer variable, and when it’s used in line 5 (not 3) you’ll just get its value back, which will be 2. The expression “arr[a][1]” should give you an ElementRegion for “&arr[2][1]”, but at that point ‘a’ isn’t involved any more. Once a value is loaded from a variable (by an LValueToRValue implicit conversion), the variable isn’t really interesting anymore. What are you actually trying to do?

Jordan

Hi Jordan,

Actually, I am using the Clang Static Analyzer to do some platform-dependent detection work by developing a checker. The Static Analyzer and my checker run on an x86-64 Linux platform. I have defined two platform specifications in my checker so that, during evaluation, it can compute platform-dependent results. As part of my design, given a MemRegion, I need to get its ‘Top Region’ and then calculate the offset between the two. For example:

……
int a = sizeof(long), arr[10][10];
arr[a][3] = 8;
……

For ‘arr[a][3]’, the MemRegion can be represented as ‘&element{element{arr,8 S32b,int [10]},3 S32b,int}’ on an x86-64 machine, and its Top Region is ‘&arr’. Now I want to calculate the offset as if the code were running on a 32-bit x86 machine. There, ‘&arr[a][3]’ should be ‘&element{element{arr,4 S32b,int [10]},3 S32b,int}’ rather than ‘&element{element{arr,8 S32b,int [10]},3 S32b,int}’. To do this, I need to know the SVal of the variable ‘a’.
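The arithmetic behind the two region dumps above can be sketched in plain C++. This is only an illustration of the offset calculation, not analyzer code: `DataModel`, `LP64`, `ILP32`, `elementOffset`, and `offsetOfExample` are hypothetical names, and the sketch assumes `int` is 4 bytes on both targets while `long` is 8 bytes on x86-64 Linux and 4 bytes on 32-bit x86.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a target's data model; not part of Clang.
struct DataModel {
    std::size_t sizeofInt;
    std::size_t sizeofLong;
};

// Assumed sizes: int is 4 bytes on both; long is 8 (LP64) or 4 (ILP32).
constexpr DataModel LP64{4, 8};
constexpr DataModel ILP32{4, 4};

// Byte offset of arr[row][col] from &arr for `int arr[rows][cols]`.
std::size_t elementOffset(const DataModel &dm, std::size_t row,
                          std::size_t col, std::size_t cols) {
    assert(col < cols && "column index out of bounds");
    return (row * cols + col) * dm.sizeofInt;
}

// Offset of arr[a][3] in `int arr[10][10]`, where `a = sizeof(long)`
// is re-evaluated for the given target rather than taken from the host.
std::size_t offsetOfExample(const DataModel &dm) {
    const std::size_t a = dm.sizeofLong;
    return elementOffset(dm, a, 3, 10);
}
```

Under these assumptions, `offsetOfExample(LP64)` is (8 * 10 + 3) * 4 = 332 bytes and `offsetOfExample(ILP32)` is (4 * 10 + 3) * 4 = 172 bytes. The difference comes entirely from the index, which is why the concrete index 8 stored in the element region is not enough on its own to recompute the offset for another platform.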

Another related problem in my previous mail post (http://lists.cs.uiuc.edu/pipermail/cfe-dev/2014-April/036205.html) is for pointers. For example:

0  /* example 2 */
1  struct st0 {
2    int i;
3  };
4  struct st1 {
5    int i;
6    struct st0 struct0;
7  };
8  struct st2 {
9    struct st1 *p;
10 };
11 int main() {
12   struct st1 s1;
13   struct st2 s2;
14   s2.p = &s1;
15   s2.p->struct0.i = 3;
16 }

In fact, the ‘s2.p->struct0.i’ in line 15 refers to ‘(&s1)->struct0.i’. I want to get the Top Region (&s1) and calculate the offset between ‘&field_i’ and its Top Region for different platforms. So I tried calling getSuperRegion() repeatedly to reach the Top Region, starting from the bottom MemRegion ‘&field_i’. However, there is a pointer dereference along this path, so if I only use getSuperRegion() all the way, the Top Region I end up with is ‘&s2’, whereas the one I want is ‘&s1’. During the upward walk, then, whenever the current MemRegion is a pointer, its pointee MemRegion (the MemRegion the pointer refers to) should be retrieved instead. I tried to get the pointee MemRegion referred to by ‘&s2.p’ via the Store (StoreManager.getBinding()), but I got an Undefined SVal, where I expected a MemRegionVal wrapping the MemRegion ‘&s1’. So how can I get the pointee MemRegion in such a situation?

I have been stuck on these problems for weeks. Any help would be greatly appreciated.

Thanks a lot.

Hi Jordan,

For the ‘array index problem’, I may have come up with a solution, which I have implemented and tested with small test cases. It is based mainly on the GDM of ProgramState and is compatible with my overall design. Apart from the potential storage cost, it seems to work.

For the ‘pointer problem’ described in my previous emails, however, I have no good idea at the moment. Could you suggest a solution or give me some hints? I think I could still use the GDM to track all of the relationships between pointers and their pointees, which would be very similar to the solution above.

The RegionStore already tracks relationships between pointers and their pointees. I’m concerned that your solution as written will fill up the GDM and use up much more memory than the analyzer otherwise would, but if it works for you I guess that’s what matters.

It doesn’t make sense to use getSuperRegion all the way (or getBaseRegion, which does the recursive work for you), because then you’ve lost which element you are accessing. I think you’d have to delay the getBaseRegion calls until you are actually doing your comparisons. To put it another way, you can always compare a particular ElementRegion or FieldRegion against its base region, rather than trying to look at the region that was referenced by a “similar-looking” expression.
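The idea of keeping the precise field region and comparing it against its base only when needed can be illustrated with a toy model. This is not analyzer code: `ToyRegion`, `baseOf`, and `offsetFromBase` are hypothetical stand-ins written for this sketch. It assumes a 4-byte `int` with no padding, and it assumes the region reached through the pointer in example 2 is rooted at the pointee `s1`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy stand-in for a region tree; NOT the analyzer's MemRegion classes.
struct ToyRegion {
    std::string name;
    const ToyRegion *super;      // nullptr marks a base (variable) region
    std::int64_t offsetInSuper;  // byte offset inside the super-region
};

// Walk up the super-region chain until the base region is reached.
const ToyRegion *baseOf(const ToyRegion *r) {
    assert(r && "region must not be null");
    while (r->super)
        r = r->super;
    return r;
}

// Sum the offsets along the chain without discarding the sub-region itself.
std::int64_t offsetFromBase(const ToyRegion *r) {
    assert(r && "region must not be null");
    std::int64_t total = 0;
    for (; r->super; r = r->super)
        total += r->offsetInSuper;
    return total;
}

// Regions for example 2, assuming `int i` occupies bytes 0..3 of st1,
// so `struct0` starts at byte 4 and its own `i` is at offset 0.
const ToyRegion s1{"s1", nullptr, 0};
const ToyRegion s1_struct0{"s1.struct0", &s1, 4};
const ToyRegion s1_struct0_i{"s1.struct0.i", &s1_struct0, 0};
```

Here `baseOf(&s1_struct0_i)` yields `&s1` and `offsetFromBase(&s1_struct0_i)` yields 4, while `s1_struct0_i` itself still records which field was accessed. In the analyzer, MemRegion::getBaseRegion() plays the role of `baseOf`; MemRegion::getAsOffset() is, to my knowledge, the closest counterpart of `offsetFromBase`, though it reports offsets for the host target only.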

Taking a step back, I don’t think ExprEngine is set up for this much platform independence, since C isn’t set up for this much platform independence. You might be better off with an AST-based check, or something using AST matchers, if you don’t actually need the path-sensitivity. (IIRC AST-based checks can also perform operations on the CFG, which could at least give you flow-sensitivity.)

I don’t feel like I’m being much help here, sorry. Your task stretches the analyzer a bit beyond what it’s well-equipped to do right now.
Jordan