How can I get and manipulate the Store and Environment in the current ProgramState in Clang Static Analyzer?

Hi all,

I want to get and manipulate the Store and Environment in the current ProgramState in checkPreStmt() and checkPostStmt(). How can I do that?

I’ve also dumped the ProgramState from my checker in checkPostStmt(), but I don’t understand the addresses that appear alongside the expressions. What do those addresses mean? Thanks a lot. Any help will be greatly appreciated.

Best regards,
Arthur Yoo

Hi, Arthur. The Store and Environment are not manipulated directly. Rather, you use the methods of ProgramState to access them.

…and that said, checkers should hardly ever need to modify the Store and Environment. Most of the time, you should be storing your state in the generic data map (the set<> and get<> methods of ProgramState). Modifying the Environment arbitrarily can break invariants about what does and doesn’t have a value. The Store is a little safer—performing simple bindings and providing default values is usually safe for checkers. Just be careful about types.

The keys in the environment are a tuple of the statement and the location context. The location context is what allows us to handle recursive functions without the callee’s bindings stomping on the caller’s. We don’t have a great way to print a location context, so we just use the pointer value. (Perhaps a stack depth would be helpful?)
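
To make the keying concrete, here is a toy model of that idea in plain C++. It is only a sketch: ToyStmt, ToyFrame and ToyEnvironment are hypothetical stand-ins invented for illustration, not the analyzer's real classes, and the real Environment is an immutable structure with a different interface. The only point shown is that the same statement evaluated in two different frames gets two independent bindings.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-ins, NOT Clang classes.
struct ToyStmt {};   // plays the role of a statement/expression node
struct ToyFrame {};  // plays the role of a location context (one per call)

// Toy environment: (statement, frame) -> value.
class ToyEnvironment {
  typedef std::pair<const ToyStmt *, const ToyFrame *> Key;
  std::map<Key, int> Bindings;

public:
  void bind(const ToyStmt *S, const ToyFrame *F, int V) {
    Bindings[std::make_pair(S, F)] = V;
  }

  // Returns a pointer to the bound value, or nullptr if there is none.
  const int *lookup(const ToyStmt *S, const ToyFrame *F) const {
    std::map<Key, int>::const_iterator It = Bindings.find(std::make_pair(S, F));
    return It == Bindings.end() ? nullptr : &It->second;
  }
};

// The same expression evaluated in a caller frame and in a recursive
// callee frame keeps two separate values.
inline bool framesStayIndependent() {
  ToyStmt Expr;
  ToyFrame Caller, Callee;
  ToyEnvironment Env;
  Env.bind(&Expr, &Caller, 1);
  Env.bind(&Expr, &Callee, 2);  // does not overwrite the caller's binding
  const int *InCaller = Env.lookup(&Expr, &Caller);
  const int *InCallee = Env.lookup(&Expr, &Callee);
  return InCaller && InCallee && *InCaller == 1 && *InCallee == 2;
}
```

In the toy, the frame's address is the only thing that tells the two bindings apart, which is why a dump that prints the pointer value is enough to distinguish them.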

Hope that helps,
Jordan.

P.S. I can’t remember if we’ve suggested this to you already, but there are some reference materials at http://clang-analyzer.llvm.org/checker_dev_manual.html. In the first paragraph there’s also a link to the talk Anna Zaks and I gave at last year’s LLVM Developers’ Meeting, which runs through how to build a simple checker and use the generic data map.

Jordan:

It’s so kind of you to answer my questions. If I have an SVal, how can I modify its value? For example, I have an SVal that dump() shows as ‘2 S32b’, and I want to change its value to 3. How can I do that? Thank you.

Arthur

SVals are values; you can’t change them. It’d be like saying x is 2, and now you want to change 2 into 3, when you really want to assign 3 to x.
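
A toy illustration of the distinction, in plain C++ rather than analyzer code: ToyState and withBinding are hypothetical names made up for this sketch, and the analyzer's real Store works differently. The idea shown is only that the value 2 never changes; what changes is which value x is bound to, and rebinding produces a new state while the old one stays as it was.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model, NOT Clang's Store: a state is a snapshot of variable bindings.
typedef std::map<std::string, int> ToyState;

// "Assigning" returns a new state; the old state is left untouched.
inline ToyState withBinding(const ToyState &Old, const std::string &Var,
                            int Val) {
  ToyState New = Old;  // copy the snapshot
  New[Var] = Val;      // rebind the variable in the copy only
  return New;
}

inline bool rebindingLeavesOldStateIntact() {
  const ToyState Before = withBinding(ToyState(), "x", 2);
  const ToyState After = withBinding(Before, "x", 3);  // assign 3 to x
  // x is still 2 in the old snapshot and 3 in the new one.
  return Before.at("x") == 2 && After.at("x") == 3;
}
```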

What are you actually trying to do here?
Jordan

Hi Jordan,

I created a new SVal with the SValBuilder and then replaced the old SVal with the new one.

Now I have another question about MemRegion. Can I get the offset of a SubRegion relative to its SuperRegion? For example, here is a struct object:
struct LayerOne {
  int a;
  char b;
  int c;
} obj;
Obviously the SuperRegion of obj.a is obj, and the same holds for obj.b and obj.c. More importantly, I know that relative to the SuperRegion (obj) the offset of obj.a is 0, the offset of obj.b is 4, and the offset of obj.c is 8. Does the Clang Static Analyzer provide any function or method that developers can use to calculate the offset of a SubRegion from its SuperRegion? In other words, if regionObjDotB is a MemRegion object that stands for obj.b, is there something like “regionObjDotB.getOffset()”? If not, how can I get these offsets?
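
For reference, the byte offsets quoted above can be checked outside the analyzer with the standard offsetof macro. This is a minimal sketch in plain C++; the helper function names are made up for illustration. On a platform where int is 4 bytes with 4-byte alignment, the three offsets come out as 0, 4 and 8, with the padding after b pushing c to 8.

```cpp
#include <cassert>
#include <cstddef>

struct LayerOne {
  int a;
  char b;
  int c;
};

// Byte offset of each field from the start of the enclosing struct.
inline std::size_t offsetOfA() { return offsetof(LayerOne, a); }
inline std::size_t offsetOfB() { return offsetof(LayerOne, b); }
inline std::size_t offsetOfC() { return offsetof(LayerOne, c); }
```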

In addition, I found a strange situation. Here is the code.

 1  struct LayerOne {
 2    int a;
 3    int b;
 4  };
 5
 6  void func() {
 7    struct LayerOne obj;
 8    obj.a = 1;
 9    obj.b = 2;
10    void *p = &obj;
11    p = p + 4;
12    *((int *)p) = 200;   // here p actually points to obj.b
13    p = p + 4;
14    *((int *)p) = 99999; // here the area pointed to by p is outside of obj!
15  }

After the analyzer processed the BinaryOperator (the assignment operator) on line 12, I got this dump:

BinaryOperator 0x4d1b0d0 'int' '='
|-UnaryOperator 0x4d1b090 'int' lvalue prefix '*'
| `-ParenExpr 0x4d1b070 'int *'
|   `-CStyleCastExpr 0x4d1b048 'int *'
|     `-ImplicitCastExpr 0x4d1b030 'void *' <LValueToRValue>
|       `-DeclRefExpr 0x4d1afc0 'void *' lvalue Var 0x4d1adc0 'p' 'void *'
`-IntegerLiteral 0x4d1b0b0 'int' 200
LHS_sval: &element{obj,0 S32b,int}
LHS_region: element{obj,0 S32b,int}
sval: 200 S32b
size: 4 S32b
space: StackLocalsSpaceRegion
Base_Region: obj
Super_Region: obj

From the dump above, I can see that obj.b has been assigned the value 200. Does S32b mean signed 32-bit? And since the offset of obj.b is 4, why is LHS_region element{obj,0 S32b,int} rather than element{obj,4 S32b,int}?

After the analyzer processed the BinaryOperator (the assignment operator) on line 14, I got this dump:

BinaryOperator 0x4d1b2b0 'int' '='
|-UnaryOperator 0x4d1b270 'int' lvalue prefix '*'
| `-ParenExpr 0x4d1b250 'int *'
|   `-CStyleCastExpr 0x4d1b228 'int *'
|     `-ImplicitCastExpr 0x4d1b210 'void *' <LValueToRValue>
|       `-DeclRefExpr 0x4d1b1d0 'void *' lvalue Var 0x4d1adc0 'p' 'void *'
`-IntegerLiteral 0x4d1b290 'int' 99999
lhs_sval: &element{obj,0 S32b,int}
lhs_region: element{obj,0 S32b,int}
sval: 99999 S32b
size: 4 S32b
space: StackLocalsSpaceRegion
Base_Region: obj
Super_Region: obj

My first question here is similar: since the offset of pointer p is 8 at that point, why is lhs_region element{obj,0 S32b,int} rather than element{obj,8 S32b,int}? What confuses me even more is that on line 14 the destination pointed to by p is outside of obj, yet the dump still shows “Super_Region: obj”.
Am I doing something stupid here?

Arthur

Hi, Arthur.

If you want to know the offset from the base region, you can use the MemRegion::getAsOffset method. However, I would guess this isn’t actually the best way to solve whatever you’re trying to do; most of the analyzer is set up deliberately so that you don’t have to think about offsets (and in some cases should not). What are you trying to do?

I would guess the analyzer doesn’t bother to handle pointer arithmetic on type void*, which is technically illegal. Clang-the-compiler follows GCC’s lead in treating it like char*, though, so we should probably do that. Please file a bug.

Citations for legality: C11 6.5.6p2 (“one operand shall be a pointer to a complete object type”) and 6.2.5p19 ("[void] is an incomplete object type").
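
For comparison, here is a sketch of the same store into obj.b written without arithmetic on void*. It steps a char pointer by the field's byte offset instead, which is well-defined. The function names storeIntoB and demo are made up for illustration, and the code is plain C++ rather than anything from the analyzer. Whether the analyzer then reports the expected region for this form is not something this thread confirms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct LayerOne {
  int a;
  int b;
};

// Writes Value into Obj.b by stepping a byte pointer, not a void pointer.
inline void storeIntoB(LayerOne &Obj, int Value) {
  char *P = reinterpret_cast<char *>(&Obj);  // byte-wise view of the object
  P += offsetof(LayerOne, b);                // step by the field's byte offset
  std::memcpy(P, &Value, sizeof Value);      // copy the int's bytes into place
}

// Mirrors the first half of func() from the question.
inline LayerOne demo() {
  LayerOne Obj;
  Obj.a = 1;
  Obj.b = 2;
  storeIntoB(Obj, 200);
  return Obj;
}
```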

(And yes, S32b means “signed 32-bit”.)

Am I doing something stupid here?

No, apparently we are. :wink: Thanks for finding this.

Jordan