[RFC] RegionStore++

I really like this. Extending the store with region extents makes sense to me.

I cherry-picked D132752 (diff 14) and observed that it fixes some test cases. However, I also saw some regressions. After investigating, it turned out that checker callbacks that accept certain pointers can be affected by this proposal.
In the past, if a pointer was cast to char*, checkers such as the CStringChecker would mimic a load through that pointer and possibly bind checker-specific metadata to the resulting symbol. In our case the resolution was to load from the base region instead, so we are not loading a single char but the whole entity, and we have a canonical symbol to which we can bind our metadata. So it turned out to be a “bug” in our checker, but I suspect that other checkers could suffer from the same problem, hence this comment. After we fixed this, we had no regressions in the test suite.

On real-world projects we also saw a considerable increase in uninitialized reads, which was expected. However, most of these new reports were false positives, which is not good and suggests problems with the change (D132752).


Let me move away from this proposal and introduce some terminology we came up with during internal discussions, to help explain how we think the store should be implemented.

Terms and definitions

First of all, it would be nice to have some helper methods for the bindings:

  • B contains B' (reflexive, transitive, antisymmetric)
  • B overlaps with B' (reflexive)

These relations help to describe how B relates to B'.
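To make these relations concrete, here is a minimal sketch of my own (not part of the proposal) that models a binding as a half-open byte range [start, end):

```python
# Hypothetical model: a binding is a half-open byte range [start, end).
from dataclasses import dataclass

@dataclass(frozen=True)
class Binding:
    start: int
    end: int  # exclusive

    def contains(self, other):
        # Reflexive, transitive, antisymmetric.
        return self.start <= other.start and other.end <= self.end

    def overlaps_with(self, other):
        # Reflexive for non-empty bindings.
        return self.start < other.end and other.start < self.end

outer, inner, disjoint = Binding(0, 13), Binding(2, 4), Binding(17, 19)
assert outer.contains(outer) and outer.contains(inner)
assert outer.overlaps_with(inner) and inner.overlaps_with(outer)
assert not inner.contains(outer) and not outer.overlaps_with(disjoint)
```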

In the discussions, it helped a lot to categorize bindings into these 4 sets.
Take each binding S in the cluster such that B.overlapsWith(S):

  • Exact: S == B
  • Parents: S.contains(B)
  • Children: B.contains(S)
  • Uglies: !S.contains(B) && !B.contains(S)

Note that these sets are important only for discussing the scenarios, we probably don’t need to materialize any of these as part of an implementation.
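Purely for illustration, here is a throwaway sketch (the tuple-based range model and names are mine, not the proposal's) that builds the four sets for a binding B, treating ranges as half-open (start, end) tuples:

```python
# Hypothetical sketch; a binding is a half-open (start, end) tuple.
def contains(a, b):
    return a[0] <= b[0] and b[1] <= a[1]

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def categorize(b, cluster):
    """Split the bindings overlapping B into the Exact/Parents/Children/Uglies sets."""
    sets = {"Exact": [], "Parents": [], "Children": [], "Uglies": []}
    for s in cluster:
        if not overlaps(b, s):
            continue
        if s == b:
            sets["Exact"].append(s)
        elif contains(s, b):
            sets["Parents"].append(s)
        elif contains(b, s):
            sets["Children"].append(s)
        else:
            sets["Uglies"].append(s)
    return sets

sets = categorize((2, 6), [(0, 14), (2, 6), (3, 5), (4, 8), (20, 24)])
# (0,14) is a Parent, (2,6) is Exact, (3,5) is a Child,
# (4,8) is Ugly, and (20,24) does not overlap at all.
```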

To keep the implementation simple, it makes sense to have a memory model where bindings “nest”. In formal terms: for every pair of bindings (B1, B2) in a cluster, if B1.overlapsWith(B2) then B1.contains(B2) || B2.contains(B1). In other words, the cluster never contains ugly bindings.

Example of a good store, i.e. one where bindings nest:

[.............]    [..]
  [..]  [.]

Example of a bad store:

[.............]    [..]
  [..]  [.]  [......] // <-- bad: overlaps but not contains

Note that the store we currently have nests, because default bindings don’t have an extent, so they are in effect from their start offset. We should preserve this invariant.
User code might want to add bindings violating it, by disrespecting the strict-aliasing rule or doing an unaligned store; but in practice we should be able to get away with this, and a potential checker catching those UB cases would justify the invariant anyway.
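The invariant is cheap to state as a predicate. A hypothetical sketch (tuple ranges, quadratic check; fine for tests, not how a real store would verify it):

```python
# Hypothetical check of the nesting invariant over half-open (start, end) ranges.
def contains(a, b):
    return a[0] <= b[0] and b[1] <= a[1]

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def nests(cluster):
    """True iff every overlapping pair is in a containment relation (no uglies)."""
    return all(
        contains(a, b) or contains(b, a)
        for i, a in enumerate(cluster)
        for b in cluster[i + 1:]
        if overlaps(a, b)
    )

good = [(0, 13), (2, 4), (6, 7), (17, 19)]   # like the "good store" picture above
bad = good + [(11, 18)]                       # overlaps (0, 13) but neither contains the other
assert nests(good)
assert not nests(bad)
```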

Now, let’s discuss loading and storing some values.

Store to binding B ([start, end]):

let UnknownStart := B.start
let UnknownEnd := B.end

for each binding S such that B.overlapsWith(S):
  if (B.contains(S)): // This includes S == B, i.e. overwriting the same binding
    remove(S)
    continue
  if (S.contains(B)):
    continue

  // We should not introduce "ugly" bindings, so we need to do something about them.
  // We could "split" existing bindings in a way that ensures the new binding "B" would still "nest", but in general that's probably not possible and would complicate things a lot for little value.
  // Alternatively, let's record what range we need to "invalidate", and just bind "Unknown" to that range before storing "B".
  UnknownStart := min(UnknownStart, S.start)
  UnknownEnd  := max(UnknownEnd,  S.end)

// If we have to invalidate something, bind "Unknown" there.
if (!B.contains([UnknownStart, UnknownEnd])):
  addDefaultBinding([UnknownStart, UnknownEnd], UnknownVal)

// Finally, just add binding "B".
addBinding(B)

Note that in the real implementation, we might want to “collapse” the introduced “unknown” ranges, but we don’t necessarily need to. To illustrate this, consider adding B3, then B4 in this example:

Let's add B3:
[     B1     ][   B2   ]
         [     B3      ] <-- adding this
=> result:
[       unknown        ]
         [     B3      ]

Let's add B4:
[       unknown        ]
         [     B3      ]
   [   B4   ] <-- adding this
=> result:
[       unknown        ]  <-- we should collapse these
   [      unknown      ]  <--
   [   B4   ]
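To make the store algorithm and the B3/B4 examples above concrete, here is a hypothetical sketch under assumptions of my own: ranges are half-open tuples, the cluster is a dict from range to value, and binding Unknown over the widened range drops the bindings it swallows (one plausible reading of “invalidate”). Note the two uncollapsed unknown ranges at the end, matching the picture:

```python
def contains(a, b):
    return a[0] <= b[0] and b[1] <= a[1]

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def store(cluster, rng, val):
    """Bind `val` to `rng`, preserving the nesting invariant (hypothetical sketch)."""
    u_lo, u_hi = rng
    for s in list(cluster):
        if not overlaps(rng, s):
            continue
        if contains(rng, s):            # includes s == rng: overwrite
            del cluster[s]
        elif contains(s, rng):          # a parent binding: keep it
            continue
        else:                           # would become "ugly": widen the range to invalidate
            u_lo, u_hi = min(u_lo, s[0]), max(u_hi, s[1])
    if (u_lo, u_hi) != rng:
        # Invalidate: drop the bindings the widened range swallows, bind Unknown there.
        for s in list(cluster):
            if contains((u_lo, u_hi), s):
                del cluster[s]
        cluster[(u_lo, u_hi)] = "Unknown"
    cluster[rng] = val
    return cluster

cluster = {(0, 13): "B1", (13, 24): "B2"}
store(cluster, (9, 24), "B3")       # B3 partially overlaps B1, fully covers B2
assert cluster == {(0, 24): "Unknown", (9, 24): "B3"}
store(cluster, (3, 12), "B4")       # B4 partially overlaps B3
assert cluster == {(0, 24): "Unknown", (3, 24): "Unknown", (3, 12): "B4"}
```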

Loading from a binding

Loading is slightly more tricky because in some cases we have LazyCompoundVals.

Loading from binding B where the type of B is non-primitive (array, struct, union, etc):

let NumOverlaps := the number of bindings S in the cluster such that B.overlapsWith(S)
if (NumOverlaps == 1):
  let C := the single binding S for which B.overlapsWith(S)
  if (C == B && the value of C is a LazyCompoundVal):
    return that LazyCompoundVal

return a new LazyCompoundVal for the corresponding region matching the offset of B. If there is no clear match, just return Unknown.
// By returning LCVs, we basically "defer" the load until we either want to load a direct binding of a field (subregion) or we want a default binding of a single value.

Loading from binding B where the type of B is a primitive type (e.g. int):

Let’s start with an evil example:

[    default   ]                [    default   ]
        [  D1  ]
        [    unaligned load of 'int'   ]

Let’s categorize our bindings into the 4 sets introduced in the first section, using our terms.
So, take the bindings S in the cluster such that B.overlapsWith(S), and form the “Exact”, “Parents”, “Children” and “Uglies” sets.
Note that if we have “Exact” or “Parents” then we cannot have any “Uglies” due to our invariant.

Let’s focus on the common cases first.

if (not Children and not Uglies):
  if (Exact): return the corresponding value
  if (Parents):
    let P := the smallest Parent binding
    if (P is a default binding):
      return the default value or a LazyCompoundVal for the subregion
    else: // if P is a direct binding
      return a derived symbol for the subregion, or just Unknown for the cases where we would need to split concrete values

// Otherwise, we have some "Child" or "Ugly" bindings partially covering
// the region we want to load from, so at best we can return "Undefined" or "Unknown".
// We should return "Undefined" if any of the bytes referred to by binding
// "B" is "Undefined", otherwise return "Unknown".
// This can be determined in a single pass, sweeping through the store,
// determining what bindings cover what parts of "B".
// We should also respect the memory space of the cluster, as in globals,
// if we don't have a binding to a region, we can assume it's initialized, and
// return "Unknown" in that case. This should be different for automatic storage
// duration objects, where it should resolve to "Undefined" instead.
// Let's call this the "sweep".
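The common-case branch above can be sketched like this. The value and symbol construction is stubbed with strings, and names such as load_primitive are mine, not the analyzer's:

```python
def contains(a, b):
    return a[0] <= b[0] and b[1] <= a[1]

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def load_primitive(b, cluster):
    """cluster maps a range to a (kind, value) pair; kind is 'default' or 'direct'."""
    exact = None
    smallest_parent = None
    has_partial = False               # any Child or Ugly binding
    for s, (kind, val) in cluster.items():
        if not overlaps(b, s):
            continue
        if s == b:
            exact = (kind, val)
        elif contains(s, b):
            if smallest_parent is None or contains(smallest_parent[0], s):
                smallest_parent = (s, kind, val)
        else:
            has_partial = True
    if not has_partial:
        if exact is not None:
            return exact[1]
        if smallest_parent is not None:
            s, kind, val = smallest_parent
            if kind == "default":
                return val                    # the default value (or an LCV for the subregion)
            return f"derived<{val}, {b}>"     # a derived symbol from the direct binding
    return "needs-sweep"                      # fall back to the "sweep" described above

cluster = {(0, 8): ("default", "zeroes"), (2, 4): ("direct", "sym")}
assert load_primitive((2, 4), cluster) == "sym"           # Exact
assert load_primitive((4, 6), cluster) == "zeroes"        # smallest Parent, default
assert load_primitive((3, 5), cluster) == "needs-sweep"   # (2, 4) only partially covers
```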

This seems really fuzzy, so we decided to sketch an example implementation of the “sweep” algorithm mentioned previously. The algorithm works in a single pass, so it’s O(N) w.r.t. the number of bindings in the cluster, and gives you the value for each subsection of the binding B. Example:

Cluster cluster{{{0, 4}, "T1"},                     {{5, 14}, "T2"},
                {{1, 4}, "C11"},                    {{6, 8}, "C21"},  {{8, 10}, "C22"},
                {{1, 3}, "C111"}, {{3, 4}, "C112"}, {{6, 7}, "C211"}};
0   1     3    4  5    6    7    8        10      14
[      T1      ]  [             T2                 ]
    [    C11   ]       [   C21  ][   C22   ]
    [C111][C112]       [C211]

sweep([1,10]) should return a list like the one below: each entry says, starting at a given offset, which value applies until the next listed offset. Using this list, one can reconstruct the exact values that cover the binding B, the [1,10] range in our case:

1 -> [1-3] C111
3 -> [3-4] C112
4 -> MemSpace
5 -> [5-14] T2
6 -> [6-7] C211
7 -> [6-8] C21
8 -> [8-10] C22

One could also use the result of sweep() to answer the common cases of the load, as the result already encodes them.
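Here is a naive runnable sketch of sweep() over the example cluster. It uses tuple ranges and a per-segment scan rather than the O(N) single pass described above, and “MemSpace” stands in for the memory-space fallback:

```python
def sweep(cluster, lo, hi):
    """For each subrange of [lo, hi), report the innermost covering binding."""
    points = sorted({lo, hi, *(p for s, e, _ in cluster for p in (s, e) if lo < p < hi)})
    result = []
    for p, q in zip(points, points[1:]):
        covering = [(s, e, n) for s, e, n in cluster if s <= p and q <= e]
        if covering:
            # The innermost binding has the largest start (and, on ties, the
            # smallest end); this is well-defined because the cluster nests.
            _, _, name = max(covering, key=lambda b: (b[0], -b[1]))
            result.append((p, name))
        else:
            result.append((p, "MemSpace"))
    return result

cluster = [(0, 4, "T1"), (5, 14, "T2"),
           (1, 4, "C11"), (6, 8, "C21"), (8, 10, "C22"),
           (1, 3, "C111"), (3, 4, "C112"), (6, 7, "C211")]
assert sweep(cluster, 1, 10) == [(1, "C111"), (3, "C112"), (4, "MemSpace"),
                                 (5, "T2"), (6, "C211"), (7, "C21"), (8, "C22")]
```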

Many thanks to @Tomasz for the hard work in discussing the RFC, coming up with the mental model, and the concept implementation of the sweep() demonstrated here.