While @steakhal is most likely correct about symbol reaper, I think you’re missing a bigger point here:
Even though value of variable res
cannot be -1
anymore, the symbol returned by myFunc()
may still be equal to -1
.
That’s what a symbol is. It’s a named value. When the variable is overwritten with a new value, the symbol still refers to the old value.
We’re basically approaching static analysis like a math problem for kids. Suppose the problem goes like “There were some apples on the plate, then Johnny ate 3 of them, and there were 4 left, how many apples there were initially?”. Then you start solving it by saying “Let X be the number of apples on the plate”. Then when Johnny eats 3 apples, X no longer represents the number of apples on the plate. The number of apples on the plate has changed to X - 3, but X itself didn’t change, it still refers to the original amount of apples. In this example “X” is what we call a symbol, whereas “number of apples on the plate” acts like a variable. This is what makes the symbol useful, as we can now have two names for the same value: X - 3 = 4, which gives us an equation that we can solve for X.
Static analyzer works the same way. It gives names to unknown values, then taking different branches in the program gives us equations over these values (“we took false branch, therefore the symbolic expression for the branch condition is equal to zero” - an equation!). If this system of equations has at least one solution, then the state is non-null and therefore the branch is worth exploring. Or, if we’re at division by symbolic value V, and adding an extra equation “V is non-zero” causes the system to have no solution, then there’s division by zero.
Now, to your specific problem. Even though your code makes sure that res
isn’t equal to -1
, this really isn’t important to your problem at all. You want to make sure that the user checks for errors returned from myFunc
before using the buffer. But overwriting the variable in which the error code is stored isn’t a valid way to check for error in the first place! Overwriting the variable doesn’t undo the harm of using the buffer, it’s just an ostrich burrowing his head in the sand and thinking he becomes invisible this way. The code doesn’t squeeze 42 back into myFunc()
, it simply throws away the actual error code and acts as if the error didn’t happen. So you really don’t care whether the code under analysis overwrites the variable or not. This is why symbols are useful and work exactly as intended!
Basically, in the following example:
char * buf, *buf2;
int res;
res = myFunc(&buf);
res = myFunc(&buf2);
if (res == -1)
exit(1);
buf[0] = 'x';
even though variable res
was checked for -1
, the check for -1
wasn’t checking the right symbol, so it’s not a valid check and the warning should still be emitted. By tracking the map from region to symbol, as opposed to a map from region to variable, you achieve this automagically.
Now, back to your initial example, it depends on specifics whether it deserves a warning or not. The code clearly checks for -1
, but it doesn’t accomplish anything to avoid the problem later on. Depending on what myFunc
does, this may or may not be problematic. Say, if myFunc
simply fills the buffer with garbage data, overwriting the buffer with some actual data isn’t bad at all, the code should be allowed to do that. But if myFunc
calls free()
on the buffer, then yeah, there’s no way to use it correctly ever.
So in the former case, where there are valid ways to use the buffer, you may want to settle on a weaker check that simply verifies that the error code was checked at all. This would be the check that doesn’t warn on your initial example. Then the question “Can that symbol be equal to -1?” becomes the wrong question to ask. The right question would be “Are there any equations over that symbol present in the state?”, regardless of their solutions. In order to answer such question, you can directly inspect the equations (the internal data structures of RangeConstraintManager), or you can subscribe to checkBranchCondition
to find relevant branch conditions. If you go this way, you’ll sometimes notice that the symbol disappears from the state completely without branch conditions. If this happens, this means that there’s no way for the program to check the symbol anymore, so you can clean it up from your state and instead mark any future use of the buffer as unchecked. In this case checkDeadSymbols
will helpfully inform you that the symbol is “garbage-collected” from the state, and you can refer to @steakhal’s answer for more insights on that.
In the latter case, you’re asking the right question with “Can that symbol be equal to -1?” and your initial example is an example of a valid warning that should not be removed. But you can still need to handle dead symbols correctly; if the symbol is “dead”, then all equations it was involved in may also be forgotten. In this case checkDeadSymbols()
is, again, the last moment of time where you need to make a decision about that symbol, and it’s also the best moment of time to do so, as it’s proven that you won’t get any more information about that symbol. So, again, you’ll need to erase it from your state, and instead warn on every use of the buffer from that point onward.