Clang Static Analyzer follows the execution of the analyzed code by maintaining a mapping from symbols [1] to RangeSet
s that represent the potential values. (I know that we also have Z3, but if I understand correctly standalone Z3 does not work in practice.)
There are some checkers (e.g. ArrayBound checking) where it would be useful if the error message reported the inferred range of certain values (e.g. the index causing the overflow). However, the ConstraintManager
interface is really restrictive, and it doesnāt provide any public method that would expose something like RangeConstraintManager::getRange
. In addition to performing assumptions, the checkers can and do extract concrete values when some symbol happens to be constrained to a single value, but otherwise they are limited to awkwardly phrased, cryptic messages.
As a concrete example, after my recent fixes alpha.security.ArrayBoundV2
can emit messages like
Access of āarrayā at index 11, while it holds only 10 āintā elements
in cases when the concrete index is known, but if we only have an assumption like idx >= 12
then it cannot say more than
Access of āarrayā at an overflowing index, while it holds only 10 āintā elements
I would like to expose a getRange()
-like method in the ConstraintManager
interface to be able to produce better error messages like
Access of āarrayā at index >= 12, while it holds only 10 āintā elements
To remain compatible with the (mostly theoretical) non-range-based constraint managers, I could allow nullptr
/nullopt
as a ārange information not supportedā dummy return value from the public getRange()
method.
I know that the individual assumptions of the checkers are also reported as note tags, but the bug path is very noisy (it can contain dozens of irrelevant if
s) and itās common that the final inferred range is composed from several elementary assumptions, so I think itās important to report the details of the āfinal verdictā to the user (in addition to the bug path).
@NoQ @Xazax-hun @steakhal What do you think about this change? May I add a method that approximately looks like RangeSet getRange(ProgramStateRef State, SymbolRef Sym)
to the ConstraintManager
interface? Do you happen to know about any invariants, design goals, pitfalls, tricky details etc. in this area that I should pay attention to?
[1] More precisely: equivalence classes of symbols that are known to be equal at that point.