Iterator Checkers: Understanding Bindings


I think I am on the right track toward creating iterator checkers for the Clang Static Analyzer. I have already run some experiments successfully, but to do so I had to “hack” the infrastructure. The next step is to replace the hack with a proper solution, since the hack is unacceptable and would probably break the system.

My first iterator checker is a very simple one that tries to recognize cases where an iterator is dereferenced or incremented past its end. To do that, I first capture the symbolic value returned by calls to end() functions that return an iterator type. At this point (checkPostCall) the return value is symbolic, which is good. The next step is to post-check comparisons, but here I get a temporary instead of the symbol. The good news is that this temporary is default-bound to the same symbol I got from the return value of end(). The bad news is that this default binding is buried in the current Static Analyzer core and I cannot retrieve it.
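
To make the target concrete, here is a minimal sketch of the kind of analyzed code I have in mind. The function names are hypothetical and only illustrate the pattern. The assumption is that the temporary mentioned above is the iterator object returned by end() when it is used directly as a comparison operand.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical example: the result of std::find is dereferenced without
// being compared against end(). This is undefined behavior whenever
// `value` is not in the vector, so the checker should warn here.
int find_unchecked(const std::vector<int>& v, int value) {
    auto it = std::find(v.begin(), v.end(), value);
    return *it;  // `it` may be equal to v.end()
}

// Checked variant: the comparison against v.end() is what the checker
// needs to understand. Note that v.end() is used directly as an operand,
// so the compared object is a temporary, not a named variable.
int find_checked(const std::vector<int>& v, int value, int fallback) {
    auto it = std::find(v.begin(), v.end(), value);
    if (it == v.end())
        return fallback;
    return *it;
}

// Small usage example. find_unchecked is only called with a value that
// is present, so no undefined behavior is triggered here.
void find_usage() {
    std::vector<int> v{1, 2, 3};
    assert(find_unchecked(v, 2) == 2);
    assert(find_checked(v, 2, -1) == 2);
    assert(find_checked(v, 7, -1) == -1);
}
```

To reason about find_checked, the checker has to relate the operand of the comparison to the symbol recorded for the return value of end().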

The key function for retrieving a binding is RegionStoreManager::getBinding() in RegionStore.cpp. However, this function only retrieves direct bindings. I could not find much documentation about the two kinds of bindings (direct and default). All I could find out is that a default binding is used for structure members when the members are not directly bound but the structure itself has a default binding. I found nothing about what should happen when retrieving the binding for the structure itself when it has only a default binding.
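
To illustrate the distinction as I understand it, here is a toy model in plain C++17. It is not Clang's implementation: ToyStore, ToyBindings, and the string-based regions and symbols are all hypothetical names invented for this sketch. It only mirrors the behavior described above: members fall back to the default binding of their structure, while a direct-only lookup on the structure itself finds nothing.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy model only: these types are hypothetical and are not Clang classes.
struct ToyBindings {
    std::optional<std::string> direct;    // value bound to the region itself
    std::optional<std::string> fallback;  // "default" value covering the whole region
};

class ToyStore {
public:
    void bindDirect(const std::string& region, const std::string& symbol) {
        bindings_[region].direct = symbol;
    }
    void bindDefault(const std::string& region, const std::string& symbol) {
        bindings_[region].fallback = symbol;
    }

    // Direct-only lookup: a region that has only a default binding
    // yields nothing, which is the situation described above.
    std::optional<std::string> getDirect(const std::string& region) const {
        auto it = bindings_.find(region);
        if (it == bindings_.end())
            return std::nullopt;
        return it->second.direct;
    }

    // Member lookup: use the member's own direct binding if present,
    // otherwise fall back to the default binding of the enclosing structure.
    std::optional<std::string> getMember(const std::string& base,
                                         const std::string& member) const {
        if (auto direct = getDirect(base + "." + member))
            return direct;
        auto it = bindings_.find(base);
        if (it == bindings_.end())
            return std::nullopt;
        return it->second.fallback;
    }

private:
    std::map<std::string, ToyBindings> bindings_;
};

// Usage: the temporary has only a default binding.
void toy_store_usage() {
    ToyStore store;
    store.bindDefault("temp_iterator", "end_symbol");
    assert(!store.getDirect("temp_iterator").has_value());
    assert(store.getMember("temp_iterator", "ptr").value() == "end_symbol");
}
```

In this model the symbol is reachable through any member of the temporary, but not through the temporary itself, which is the gap I am asking about.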

The situation is even worse here. The name getBinding() is a bit misleading, because the function does not retrieve a direct binding even when one exists: for structure or class types it creates a lazy compound value instead. Why does it behave this way? The lazy compound value can then be used to get back the original structure/class region, that is, the temporary. This is just a round trip and does not help to retrieve the binding, even if it were a direct binding. After the return value is assigned to a variable, a symbol is of course bound to the variable as well when the variable is checked (e.g. as an operand in a pre- or post-call checker). In that case, however, the binding could be retrieved if it were a direct binding, because the region is no longer treated as a struct/class but as a variable region.
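
The two situations I am contrasting look roughly like this in the analyzed code. This is only a sketch with hypothetical function names. Both functions behave identically at run time; the difference is only in what kind of region the analyzer sees for the comparison operand.

```cpp
#include <cassert>
#include <vector>

using ConstIt = std::vector<int>::const_iterator;

// Form 1: the result of end() is compared directly, so the operand
// is a temporary iterator object.
bool at_end_temporary(const std::vector<int>& v, ConstIt it) {
    return it == v.end();
}

// Form 2: the result of end() is first assigned to a named variable,
// so the operand refers to a variable region.
bool at_end_named(const std::vector<int>& v, ConstIt it) {
    auto last = v.end();
    return it == last;
}

// Usage: both forms agree for every iterator position.
void at_end_usage() {
    std::vector<int> v{1, 2};
    assert(!at_end_temporary(v, v.begin()));
    assert(!at_end_named(v, v.begin()));
    assert(at_end_temporary(v, v.end()));
    assert(at_end_named(v, v.end()));
}
```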

Could somebody help me understand why bindings behave this way, and what to change so that these default bindings can be retrieved? They are essential for all iterator checkers, which are needed by many projects.

Thank you for your cooperation in advance!