Uniformization of bounds checking

This post documents my plans for improving the checkers that report out-of-bounds errors.

@NoQ @Xazax-hun @steakhal and anybody else:
If you have any high-level feedback (suggestions, concerns, plans etc.), please share it here. If this rough overview looks acceptable, I’ll dive into creating concrete commits and uploading them for review.

I also have one concrete question: What should the analyzer do if it sees pointer arithmetic that produces an out-of-bounds pointer (which isn’t the legitimate past-the-end pointer, but a “really” out-of-bounds one)? This is undefined behavior according to the standard, but intuitively isn’t as serious as actually accessing out-of-bounds memory.

Overview of relevant checkers

The checker security.ArrayBound (previously known as alpha.security.ArrayBoundV2) has sophisticated logic for determining whether the used memory location is out-of-bounds, but only applies this in the “basic” context where “use” means the subscript, dereference or arrow operator.

We also have several other out-of-bounds checkers:

  • alpha.security.ReturnPtrRange reports situations when a function returns an out-of-bounds pointer.
  • alpha.unix.cstring.OutOfBounds reports out-of-bounds access by C string functions (including raw memory handling like memcpy).
  • alpha.unix.cstring.BufferOverlap reports use of overlapping buffers in memcpy and similar functions. This is not exactly an out-of-bounds checker, but it needs to perform similar checks, so perhaps it’s worth thinking about sharing logic with it.
  • alpha.core.PointerArithm is currently NOT a bounds checker (it just reports non-array locations participating in pointer arithmetics) but does similar things and there may be plans to generalize it for some sort of bounds checking.
  • alpha.cplusplus.IteratorRange reports iterators used outside their valid ranges. (I’m not familiar with this checker, and AFAIK all the iterator checkers have serious technical debt.)
  • osx.coreFoundation.containers.OutOfBounds is presumably another checker that does bounds checking, but I know nothing about it.
  • Honourable mention goes to the function isArrayIndexOutOfBounds() within UndefResultChecker.cpp which is only used to add " due to array index out of bounds" to the “The left/right operand of ‘X’ is a garbage value” reports.

Currently (November 2025) none of these checkers use the logic of security.ArrayBound; and many of them use the “traditional” logic, which was used in the old alpha.security.ArrayBound (V1) before its removal. (See appendix 1 for a description of this “traditional” logic and its drawbacks.)

Anatomy of a bounds check

Bounds checking can happen in two kinds of situations:

  • Actual access where out-of-bounds offset triggers significant issues (e.g. pointer dereference, array subscripting, access through memcpy and similar functions).
  • Hypothetical access where the out-of-bounds offset is suspicious but doesn’t cause immediate problems (e.g. performing pointer arithmetic, returning a pointer, &array[idx] expressions).
    • In these situations it’s well-defined to create a past-the-end pointer.
    • Forming other out-of-bounds pointers is still undefined behavior (but is intuitively “less severe” than actually dereferencing them).

The bounds check can emit three kinds of output:

  • A. Bug report: when it finds an error, it should create a bug report:
    • the message should describe the accessed memory region, the index/offset (if concrete) and the extent of the memory region (if concrete and the error is an overflow)
    • the relevant values (index/offset, extent) should be marked as interesting if they are symbolic
    • if the memory area is not dynamically allocated, we should mark its declaration with a note tag (security.OutOfBounds doesn’t do this yet, but alpha.security.ReturnPtrRange has this feature and I think we should adapt it)
    • the error node may be fatal or non-fatal
      • actual out-of-bounds access is definitely fatal; and although hypothetical out-of-bounds access is also undefined behavior, perhaps it should be a non-fatal error node
  • B. Assumption: In some ambiguous situations (access may be legal, may be out of bounds), it is useful to assume that the access is legal and:
    • produce a new state where the index/offset is definitely in bounds;
    • attach a note tag that describes this assumption.
    • (see Appendix 2 for two code examples where the assumptions are helpful).
  • C. Nothing: In other cases (e.g. the access is definitely in bounds) the bounds check returns without doing anything.

To produce this output, the bounds checking algorithm consumes lots of input parameters:

  1. The ProgramState.
  2. The access kind: actual or hypothetical.
  3. Message fragment strings that describe the access operation (e.g. distinguish simple pointer dereference and a memcpy call).
  4. Strings that describe the accessed memory region (for message generation).
  5. The extent/size of the accessed memory region: either an SVal or a concrete number.
  6. The offset/index that is being accessed: either an SVal or a concrete number.

In many cases (e.g. dereferencing a pointer) parameters 4–6 (accessed region, its extent, the offset) are all calculated from a single location SVal (i.e. pointer) by decomposing it into a {base region, offset} pair. However, there are also situations (e.g. applying the array subscript operator on an array with a statically known size) where these parameters are known separately, as e.g. PR#159357 by Alejandro Alvarez proposes.

The bounds checking algorithm (as implemented in security.ArrayBound (V2)) is composed of two or three subroutine calls:

  • the first one checks the lower bound (index >= 0),
  • the second checks the upper bound (index < size),
  • finally – if needed – the third one checks whether the pointer is a past-the-end pointer (index == size).

This comparison subroutine is itself nontrivial – in fact, I’d say that it is the core innovation of the V2 array bounds checker. The main difficulty is that it needs to approximate a proper mathematical comparison with evalBinOp – and evalBinOp messes up everything with overflows and type conversions. The current implementation is “good enough” for use in production, but I have several long-term plans that would improve accuracy in a few corner cases.

The algorithm is complicated by several additional aspects:

  • The offset/index and the extent/size may be measured either in bytes (char units) or as element counts. The current implementation converts everything to bytes for calculations, then converts back to element counts when it generates the messages (when possible), but it would be better to do the calculations/comparisons on element counts when it’s possible.
    • In many cases it would be better to e.g. assume $x < y$ instead of $4 x < 4 y$ (where the ‘4’ is sizeof(int)). The analyzer cannot see the connection between these two relationships (because strictly speaking, they are not equivalent due to overflow) – and $x < y$ is the one that would probably occur e.g. in a conditional statement.
    • If we convert the byte offsets to element counts (instead of the other way around), it would be more natural to handle cases where the extent is not a multiple of the element size (round down in the division). By the way, currently the analyzer doesn’t report overflows if the beginning of the accessed object fits into the memory area.
    • Avoiding the back-and-forth conversion might also simplify the code.
  • When the offset or the extent is tainted, the analyzer should switch to a “pessimistic” mode where ambiguous situations are also reported. These reports need different messages and we are planning to reassign them to new separate checker(s) under optin.taint.*.
  • As the comparisons are made, the code must report the assumptions. This is needed for two things:
    • composing the message when the check finishes with an assumption (output B);
    • distinguishing “underflow or overflow” reports and “definitely overflow” reports.
  • There are also a few tricks that make the message easier to read.

Suggested architecture

As each bounds checker needs most of the complications (which are described in the previous section), it is important to reuse the same code in all of them. There are several “customization points” where the various checkers need to introduce different behavior, so the natural architecture would be a base class and several subclasses.

As a rough draft I propose the following interface:

struct PtrDiff {
  std::variant<NonLoc, int64_t> Amount; // or maybe just a NonLoc if we wrap known integers into nonloc::ConcreteInt
  std::optional<QualType> ElemType; // nullopt means byte offsets

  bool isTainted(/* suitable context */) const;
};

// When a checker performs a bounds check, it constructs an instance of a
// subclass of this class, then calls the "main" method
// `performCheckAndTransition()` on it.
class BoundsValidatorBase {
protected:
  BoundsValidatorBase(CheckerContext &C, ProgramStateRef State,
                      std::string AccessedRegionName, PtrDiff Extent,
                      PtrDiff Offset)
    : /* ... save everything into data members ... */ {}

  ////// Customization points -- defined by subclasses //////

  virtual bool isActualAccess() const = 0;

  // Probably there will be several methods of this kind
  virtual std::string describeAccessOperation() const = 0;

  virtual const BugType *getBugType(bool IsTaintReport) const = 0;

  // The constructor is also a customization point: subclasses should define
  // their own public constructors with different parametrizations.

  ////// Shared logic //////

public:
  // The "main" function: performs the bounds check, then emits the bug report
  // or assumption (if needed).
  void performCheckAndTransition(ExplodedNode *N);

protected:
  // The comparison subroutine which already exists in ArrayBoundChecker, but
  // should be generalized a bit to handle the PtrDiff type.
  static std::pair<ProgramStateRef, ProgramStateRef>
  compareValueToThreshold(ProgramStateRef State, PtrDiff Value,
                          PtrDiff Threshold, SValBuilder &SVB,
                          bool CheckEquality = false);

  StateUpdateReporter SUR;
};

I’m planning to introduce this gradually:

  1. Add a new source file and header in the Core (analyzer engine) directory and move the logic from ArrayBoundChecker.cpp to these new files. (NFC)
  2. Generalize and improve this new bounds checking logic to make it ready for use in other checkers (e.g. support doing the calculations with indices and not byte offsets).
    • These may slightly improve the quality of security.ArrayBound results as well.
  3. Reimplement the other bounds checkers as wrappers around the shared bounds checking library.

Appendix 1: The “traditional” logic

Several checkers (ReturnPtrRange, cstring.OutOfBounds, isArrayIndexOutOfBounds()) use the “traditional” logic, which is copypasted from the old alpha.security.ArrayBound (V1) checker. Note that this V1 checker produced too many false positives, so it was deleted from the codebase when security.ArrayBound (originally alpha.security.ArrayBoundV2) became stable.

These checkers use the method ProgramState::assumeInBoundDual() as the “backend” of bounds checking, which relies on a weird mathematical trick: to check 0 <= Idx < UpperBound it assume()s that Idx + MIN < UpperBound + MIN, where MIN is the minimal value of the array index type (which is always long long).

  • This exploits that the analyzer models signed integer overflow as if it was well-defined.
    • I had vague plans for e.g. including an --assume-no-signed-overflow flag for the analyzer (because that would get rid of lots of false positives), but this abuse of the overflow modeling prevents that.
  • The comparison may be a signed comparison (by default) or an unsigned comparison (when either Idx or UpperBound is an unsigned long long, which takes precedence).
  • The mathematical trick is logically correct with a signed comparison, but with the unsigned comparison it sometimes produces incorrect results: e.g. Idx == -3 and UpperBound == 5ULL would produce an “in bounds” result.
  • If one of Idx and UpperBound is concrete, while the other is symbolic, then the analyzer (hopefully) can follow the additions and deduce the correct range for the symbol.
  • However, if Idx and UpperBound are both symbolic, then I’m pretty sure that the analyzer won’t see the connection between e.g. Idx < UpperBound (which could appear e.g. in an if) and Idx + MIN < UpperBound + MIN (which is assumed during the bounds checking).

I think it’s important to switch to using the approach of security.ArrayBound (V2) with these checkers. We should also consider eventually eliminating ProgramState::assumeInBoundDual() from the codebase because its tricky approach causes too many problems.

Appendix 2. Why emit assumptions?

Assuming that the access is in bounds (when it may be in bounds and the index is not tainted) can improve the quality of results. The following code shows two concrete examples:

int TenElements[10];
int example1(int arg) {
  // We need to save our assumptions about the value of 'arg' to deduce that
  // these two can't be in bounds simultaneously:
  return TenElements[arg] + TenElements[arg + 10];
}

void example2(int arg, int arg2, bool flag) {
  if (flag) {
    TenElements[arg] = arg2;
  }
  // ...
  if (arg > 20) {
    // If an execution path where flag was true reaches this point, then the
    // program is already in an undefined corrupted state (after overflowing
    // TenElements) and this overflow is not modeled by the analyzer.
    // Results from these paths would be inaccurate, so it is better if
    // TenElements[arg] produces an assumption that prunes these paths.
  }
}

If I recall correctly, @steakhal said that the additional assumptions previously caused slowdowns in some cases (by slowing down the Z3 refutation in outlier cases), but if I understand correctly that issue has since been mitigated in other ways, so it isn’t relevant anymore.

Deduplicating and centralizing bounds checking sounds like a good idea to me. But is reusing a library in different checks the right way? Or could we keep all the logic in a single modeling checker that could potentially raise some events that other checks could respond to? Similar to how InnerPointerChecker and MallocChecker interact. E.g., a checker could emit “hypothetical” or “actual” accesses with all the necessary inputs that would be validated by a single modeling checker.

> Deduplicating and centralizing bounds checking sounds like a good idea to me. But is reusing a library in different checks the right way? Or could we keep all the logic in a single modeling checker that could potentially raise some events that other checks could respond to?

I’m confident that reusing a library is the most natural approach in this situation, because the common logic is in the “middle” of the bounds checking procedure:

    1. different bounds checkers react to different situations with different callbacks,
    2. then they call the same “is it in bounds?” logic,
    3. finally, if they find an error, the bug report creation procedure (and the bug type and the checker reporting the error) is again different.

However, if you highlight relevant advantages of some other approach, then I’m open to switching to it.

> Similar to how InnerPointerChecker and MallocChecker interact. E.g., a checker could emit “hypothetical” or “actual” accesses with all the necessary inputs that would be validated by a single modeling checker.

I’m familiar with the logic of InnerPointerChecker and I’d say that it is more complex than the library that I’m suggesting. The “trick” of InnerPointerChecker is that in fact there are two separate checker frontends that are both called InnerPointerChecker:

  • One is the standalone checker in InnerPointerChecker.cpp which registers some callbacks and (within those callbacks) invokes some functions defined in MallocChecker.cpp – but never creates any bug reports.
  • The second is a checker frontend within the checker family MallocChecker which does the actual error reporting.

If we copied this model with bounds checking, we would introduce:

  • one standalone checker class for each bounds checking checker (ArrayBound, ReturnPtrRange etc.) – these register callbacks to perform step 1, but never report errors
  • a central checker family that contains the shared code (step 2.)
  • and within this family one frontend for each bounds checking checker (to actually report the errors, step 3.).

For use-after-free errors (which include inner pointer invalidation), IIUC there are some checker callbacks that are used by multiple subcheckers and therefore naturally belong to the central checker family. (And it happens that except for InnerPointerChecker, the other use-after-free checkers are defined “inline” within the checker family and don’t have separate standalone parts.)

On the other hand, detecting out-of-bounds access doesn’t need custom state tracking (it only relies on numerical constraints and dynamic extent info), so each bounds checking checker relies on its own distinct callbacks and there wouldn’t be any shared callbacks that need to be in the central checker family of the bounds checking checkers.

This means that in the bounds-checking case the central checker family can be simplified into a plain library (like the one that I propose) if the error reporting simply uses the checkers that are already needed for their callbacks (instead of introducing the hacky “duplicated” checker frontends that share the name of the standalone checker).

IMO hoisting all of these implementations into something more generic makes sense.
Landing fixes would benefit all the rest.
I also liked that you want to provide some knobs to tune the behaviour.

Speaking of the convoluted behaviour of actual and hypothetical bounds checks, I don’t have much to say. I probably lack the context right now to form a sound opinion.
What I definitely agree with is that it’s messy.

I also favour the removal of assumeInBoundDual().

I think the emission of the assumptions also needs to be adjustable. We probably want to emit them, but I could see reasons not to: in some situations the use of a given construct (such as a subscript) shouldn’t be used to infer constraints, because it’s only a weak hint from the user. A tainted index might be an example: after the subscript, I’d prefer not to assume bounds on the tainted index.

That’s right. They are mitigated.

The main reason I have mixed feelings about the proposed architecture is because we do not really use inheritance for sharing implementation across checkers that much at the moment. We have a slightly different model where the checkers subscribe to interesting events and observe the program state for interesting patterns. I think it would be great if we could come up with an API that does not feel very different from how we deal with the existing APIs.

So my idea would be more along the lines of:

  • Some accesses might not be clear from the source code, so checkers need to be able to signal to the analyzer that an access at a certain offset is happening into some memory region (and whether it is actual or hypothetical in your terminology).
  • The analyzer has to be able to signal back whether the access is safe, unsafe, or provably wrong, and the checker needs to be able to act accordingly (e.g., generating sinks, reporting bugs).

For example, we do not have a base class for dealing with taints. There are some APIs for the checkers to query the taint information instead.

I just want to be sure that if we diverge from the existing patterns there is a strong reason to do so, and that we first considered trying to build something that feels at home in the style of the existing APIs.

Thank you for raising the initiative, Donat!
I agree with the premise. I have one concern and one opinion to share.

I believe that in general, if we can sensibly continue simulation after encountering a bug, we should. This comes from many of our customers who are surprised to see the first bug detected and the second one “missed”.

In this case, I agree with you: even if it is strictly speaking UB, continuing with the invalid pointer until it is actually de-referenced sounds sensible and should not lead to FPs. In my opinion, we should emit a non-fatal bug report and continue analysis with the invalid pointer.

Now my concern is with the architecture you propose. Admittedly, I did not spend enough time deeply considering your proposal, but I can’t easily see the need for inheritance here. On the other hand, inheritance makes coupling tighter and execution less intuitive. Why would a set of library functions for sharing the bounds-checking logic not suffice?
Thinking superficially, I imagine we could have:

  • the bounds checking function, similar to assumeInBounds but with the better implementation as found in ArrayBoundChecker now.
  • something like “StateRef introduce-in-bounds-assumptions(State, …)” function that interested checkers could call when relevant.
  • a bug-report visitor that adds the notes for the bounds-related assumptions.

My intuition is that if nothing is tainted, then for a memory access event X the following two logical properties are equivalent:

  • if X is definitely out of bounds, then we want to produce a bug report;
  • if X is ambiguous (may be in bounds, may be out of bounds), then we want to emit an assumption that X is in bounds.

I do not know about any clear counterexample X where one of these is true but the other is not (if we assume that nothing is tainted), so I don’t see a reason to adjust the emission of the assumptions independently of the emission of bug reports.

However, if we find concrete use-cases where it is useful to adjust the emission of assumptions, I’m open to adding customization logic (and it wouldn’t be too difficult to do so).

Taint is a completely separate aspect, which I didn’t detail in the original write-up. I’m sorry if this caused confusion.

If the index is tainted, the ambiguous bounds checks (may be in bounds, may be out of bounds) will produce “tainted index may be out of bounds” reports instead of emitting “assuming index is in bounds” assumptions – so there is “no place” for producing assumptions. (This behavior is already present in security.ArrayBound and I don’t plan to change it.)

<offtopic>
There are some corner cases, e.g. a situation where the extent of the memory area is tainted, so we want to emit an assumption that the (non-tainted) index is non-negative, then emit an “index may be out of bounds and the extent is tainted” warning (which will be a non-fatal error node, because taint warnings are customarily non-fatal). This is not yet implemented, but seems to be the logical continuation of the existing logic.
</offtopic>

Thanks for the update!

Perhaps I wasn’t clear enough, but I’m not proposing inheritance between checker classes (which have singleton instances and are derived from Checker and related classes); instead, I’m planning to introduce small utility classes and inheritance between them.

I think what I’m proposing is roughly analogous to the existing patterns. Perhaps the most clear analogy is e.g. the BugReporterVisitor API: if a checker needs to use it, it defines a custom subclass of a utility class, implements some virtual methods to describe the custom behavior, then instantiates this subclass to use its functionality.

I think going in this direction naturally leads to my proposal (or something very similar to it):

  • As you say, we want to let the checkers call some “is this access out of bounds?” API that – very roughly – looks like validateBounds(MemRegion *Region, NonLoc Offset, bool IsActualAccess, /*some other arguments*/).
    • In fact, for the sake of uniformity, this API should handle all bounds checking, not just those that are “not clear from the source code”.
    • The name of this function is just a placeholder, I’m not proposing it for actual use!
  • The various bounds checkers will need to emit the assumptions and bug reports in an almost identical fashion – the only difference is that each uses its own bug type and some parts of the messages (which name the access operation) are specific to each checker.
    • Steps that need to be done in an identical fashion include: composing the messages, marking SVals as interesting, generating the error node or the node with the new assumption, etc.
  • So instead of “signaling back” and “the checker needs to be able to act accordingly”, I propose that the checker just passes its BugTypes and message fragments to the library function validateBoundsForMemoryAccess, which will arrange everything.
  • The logic of validateBoundsForMemoryAccess is fairly complex, so it would be inelegant to put it into a single monolithic function; instead, I’m planning to split its logic into cooperating methods of a utility class.
    • I suggest methods instead of standalone functions because I expect that it would be cumbersome to pass context and state between functions.
    • Instances of this utility class are lightweight and ephemeral: a fresh one is created for each memory access.
  • There will be some logic (e.g. the “is this subscript expression wrapped in an address-of like &arr[idx]” check) which is only relevant for one particular bounds checking checker, and I think it will be natural to place these into methods of subclasses of the bounds validator utility class. (There are other approaches, e.g. passing lambdas to the constructor of the utility class, but I expect that this will be simpler than those.)

The tight coupling between the steps is intentional. After identifying the memory access, all the bounds checking checkers need to follow the same rigid workflow until the very end (up to and including the creation of the bug report). I usually don’t like inheritance, but I think it is still the most idiomatic tool for representing this rigid skeleton that has a few customization spots (represented by virtual methods that are overridden in subclasses).

However, the workflow is so rigid that the inheritance is not absolutely necessary: the API could consist of one big fat function with roughly a dozen parameters. (Half of those parameters would be “real” ones that change from call to call, while the rest are callbacks or message fragment strings that encode the “static” differences between the checkers.)

Right now I expect that the solution with inheritance would be more elegant, but the “one big function” is also acceptable, and it may turn out to be the more appealing approach when I dig into the details of the implementation.

This is completely infeasible, because these results (“is it in bounds?”, the updated State, the note that describes the bounds-related assumption) are all outputs of the same monolithic algorithm, and even if you ask for just one of them, you need to run the full algorithm.

(Running the algorithm multiple times is not feasible for performance reasons. While profiling for other reasons, I accidentally saw an example translation unit where security.ArrayBound was responsible for more than 1% of the full analysis runtime. That is similar to other checkers that check commonly used language features, but too much to be tripled.)