I’m trying to track down a bug in the modeling of struct copies, which involves a bunch of SymbolicRegions and LazyCompoundVals, along with understanding the RegionStore.
During my investigation, I found some interesting behavior relating to the handling of SymbolicRegions.
On the one hand, they are used as if they represent some untyped memory, but in other cases we dig up the wrapped symbol and use its type as the type of the SymbolicRegion.
That being said, from reading the code it feels like SymbolicRegions are somewhere between being typed and untyped.
My questions are: why is the SymbolicRegion not a TypedValueRegion?
Why do we pretend that it’s a typed region in some situations?
Is the following statement true? For all Sym of type T*: SymRegion{Sym} is equivalent to Element{SymRegion{Sym}, 0, T}.
If so, shouldn’t we canonicalize them?
Let’s gather some comments and code snippets regarding the typed-ness of SymbolicRegions:
- The last sentence of the doc comment of class SymbolicRegion:
/// SymbolicRegion - A special, “non-concrete” region. Unlike other region
/// classes, SymbolicRegion represents a region that serves as an alias for
/// either a real region, a NULL pointer, etc. It essentially is used to
/// map the concept of symbolic values into the domain of regions. Symbolic
/// regions do not need to be typed.
- This class inherits from SubRegion, so it is not an instance of TypedValueRegion, matching the comment above.
- The class always wraps a symbol (the address of the memory within the abstract machine’s memory), which is by nature always typed. The presence of this symbol is enforced by an assertion in the constructor: assert(s && isa<SymbolData>(s)), where s is a SymbolRef.
- There are a few cases already where we dig up the symbol wrapped by the SymbolicRegion and use the type of the symbol as an approximation for the type of the SymbolicRegion.
ExprInspectionChecker::analyzerDumpElementCount():
QualType ElementTy;
if (const auto *TVR = MR->getAs<TypedValueRegion>()) {
  ElementTy = TVR->getValueType();
} else {
  ElementTy =
      MR->castAs<SymbolicRegion>()->getSymbol()->getType()->getPointeeType();
}
In calculateOffset() in MemRegion.cpp, there is this snippet of code and a comment about handling SymbolicRegions:
if (const auto *TVR = dyn_cast<TypedValueRegion>(R)) {
  Ty = TVR->getDesugaredValueType(R->getContext());
} else if (const auto *SR = dyn_cast<SymbolicRegion>(R)) {
  // If our base region is symbolic, we don't know what type it really is.
  // Pretend the type of the symbol is the true dynamic type.
  // (This will at least be self-consistent for the life of the symbol.)
  Ty = SR->getSymbol()->getType()->getPointeeType();
  RootIsSymbolic = true;
}
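The two snippets above share the same fallback: use the region's own value type if it has one, otherwise pretend the pointee type of the wrapped symbol is the region's type. Below is a minimal, self-contained sketch of that pattern. It deliberately does not use the analyzer's classes; ToyType, ToySymbol, ToyTypedValueRegion, ToySymbolicRegion and approximateValueType are hypothetical stand-ins made up for illustration only.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Toy stand-ins, NOT the analyzer's real classes (assumption for illustration).
struct ToyType {
  std::string Name;    // e.g. "int" or "int *"
  std::string Pointee; // pointee type name; empty if this is not a pointer
};

struct ToySymbol {
  ToyType Ty; // a symbol is always typed
};

struct ToyTypedValueRegion {
  ToyType ValueTy; // the region itself knows its value type
};

struct ToySymbolicRegion {
  ToySymbol Sym; // the region has no type of its own, only a typed symbol
};

using ToyRegion = std::variant<ToyTypedValueRegion, ToySymbolicRegion>;

// Mirrors the fallback in the snippets above: prefer the region's value type,
// otherwise approximate it with the pointee type of the wrapped symbol.
std::string approximateValueType(const ToyRegion &R) {
  if (const auto *TVR = std::get_if<ToyTypedValueRegion>(&R))
    return TVR->ValueTy.Name;
  const auto &SR = std::get<ToySymbolicRegion>(R);
  return SR.Sym.Ty.Pointee;
}
```

With this toy model, a typed region of type int and a symbolic region wrapping a symbol of type int * both yield "int", which is exactly the sense in which the symbolic region is treated as if it were typed.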
RegionStoreManager::getBinding():
if (!isa<TypedValueRegion>(MR)) {
  if (T.isNull()) {
    if (const TypedRegion *TR = dyn_cast<TypedRegion>(MR))
      T = TR->getLocationType()->getPointeeType();
    else if (const SymbolicRegion *SR = dyn_cast<SymbolicRegion>(MR))
      T = SR->getSymbol()->getType()->getPointeeType();
  }
  assert(!T.isNull() && "Unable to auto-detect binding type!");
  assert(!T->isVoidType() && "Attempting to dereference a void pointer!");
  MR = GetElementZeroRegion(cast<SubRegion>(MR), T);
} else {
  T = cast<TypedValueRegion>(MR)->getValueType();
}
In this case, it seems like we are wrapping the SymbolicRegion into an ElementRegion just to ‘make it typed’.
That doesn’t feel like a good thing to do when we should be seeking a canonical representation of the keys in the RegionStore, so that lookups stay consistent with the binds.
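To make the concern concrete, here is a toy sketch of what a non-canonical key could cost. It is not the RegionStore: ToyKey, ToyStore, canonicalize and lookup are hypothetical names, the store is a plain std::map, and the direction of canonicalization (stripping the element-zero wrapper rather than adding it) is an arbitrary choice made for the example. It only assumes that the two spellings from the question above really denote the same memory.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <tuple>

// Toy key (hypothetical): a symbolic base, optionally wrapped in element zero.
struct ToyKey {
  std::string Sym;       // name of the wrapped symbol, e.g. "p"
  bool IsElementZero;    // true models Element{SymRegion{Sym}, 0, T}
  std::string ElementTy; // T; empty when IsElementZero is false

  bool operator<(const ToyKey &O) const {
    return std::tie(Sym, IsElementZero, ElementTy) <
           std::tie(O.Sym, O.IsElementZero, O.ElementTy);
  }
};

// Hypothetical canonicalization: drop an element-zero wrapper whose element
// type equals the pointee type of the symbol, so both spellings share a key.
ToyKey canonicalize(const ToyKey &K, const std::string &PointeeTy) {
  if (K.IsElementZero && K.ElementTy == PointeeTy)
    return ToyKey{K.Sym, false, ""};
  return K;
}

using ToyStore = std::map<ToyKey, int>;

// Plain lookup with no canonicalization of its own.
std::optional<int> lookup(const ToyStore &S, const ToyKey &K) {
  auto It = S.find(K);
  if (It == S.end())
    return std::nullopt;
  return It->second;
}
```

In this model, a value bound under the bare key {"p", false, ""} is not found when looked up through {"p", true, "int"}, but it is found once the lookup key goes through canonicalize with pointee type "int". A wrapper with a different element type (say "char") is left alone, since it would not be equivalent under the statement in question.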