[RFC] Overflow Idiom Exclusion

Summary

I’ve been working on a feature set to disable sanitizer instrumentation for common overflow idioms. For a wide selection of projects, proper overflow sanitization could help catch bugs and solve security vulnerabilities. Unfortunately, in some cases the integer overflow sanitizers are too noisy for their users and are often left disabled. Providing users with a method to disable sanitizer instrumentation of common patterns could mean more projects actually utilize the sanitizers in the first place.

Background

One such project that has opted to not use integer overflow (or truncation) sanitizers is the Linux Kernel. There has been some discussion recently concerning mitigation strategies for unexpected arithmetic overflow. This discussion is still ongoing and a succinct article accurately sums up the discussion. In summary, many Kernel developers do not want to introduce more arithmetic wrappers when most developers understand the code patterns as they are.

Patterns like:

if (base + offset < base) { ... }

or

while (i--) { ... }

or

#define SOME -1UL

… are extremely common in a codebase like the Linux Kernel. It is perhaps too much to ask of kernel developers to use arithmetic wrappers in these cases. For example:

while (wrapping_post_dec(i)) { ... }

… which wraps some builtin would not fly. This would incur too many changes to existing code; the code churn would be too much, at least too much to justify turning on overflow sanitizers.

User Cyberax (from lwn.net) probably shares a similar question to you:

I don’t understand why Linux developers won’t just annotate the expected wraparounds. They are used somewhat frequently, but not so much as to affect the readability. Most source code files in the tree won’t have any annotations at all.

but then realizes:

“I think, the kernel developers just like to be able to read the overflow-dependent code idioms.”

Which beautifuly summarizes the problem. Anyways, the reality is: it’d be nice if we could turn on overflow sanitizers for the linux kernel but there would be far too much noise caused by overflow-dependent code idioms. We need to disable instrumentation of these idioms – and leave the source code intact.

Examples

Currently my feature set tackles three common idioms:

  1. if (a + b < a) or some logically-equivalent re-ordering like if (a > b + a)

  2. while (i--) (for unsigned) a post-decrement always overflows here

  3. -1UL negation of unsigned constants will always overflow

All of which are disabled when -fno-sanitize-overflow-idioms is ticked on. Check out the tests from the WIP implementation for more info (section below this one).

Implementation Details

Check out what I’ve got here: GitHub Branch

The first type of idiom if (a + b < a) is implemented as an IR optimization pass. I tried to get my pass to run earlier than most so I can find IR patterns before they are altered too greatly. Once the pattern is identified, I alter the control flow graph by removing the edge to the overflow handler. The hope is that future optimization passes can eliminate the dead code.

The second and third type of idiom while (i--) and -1UL are implemented during clang CodeGen within: clang/lib/CodeGen/CGExprScalar.cpp.

For what it’s worth, @kees and I have done some integration testing with the Linux Kernel and things seem to be working smoothly, idioms are being excluded!

Please let me know if there are better ways to achieve what we’re trying to achieve, I am especially new to the new PassManager and IR optimizations in general. Do note that this is heavily WIP and depending on when you take a look there may be some unused code – I’ve tried to make it as reviewer-friendly as possible.

2 Likes

The general idea makes sense.

Why is the a+b<a check an IR optimization? I’d prefer to keep it in clang if possible… once optimizations start running, everything becomes a lot weirder to deal with. Trying to prove the two arbitrary expressions are equivalent is hard in general, but simple cases are easy: for example, it’s trivial to check for two DeclRefExpr referring to the same variable. Users shouldn’t be writing complicated expressions for “a” in that idiom anyway.

If you really can’t make it work at the AST level, the IR pass needs to be significantly modified. You can’t assume all calls to uadd_with_overflow are actually generated by ubsan. You’d need a dedicated intrinsic for ubsan-generated overflow checks.

1 Like

I really want to keep it in Clang. I could not find a universal way to do this in the AST alone. At the time the overflow-checked binary operator is emitted I don’t think I have enough context to match the pattern against. If you think it is definitely possible then I will certainly revisit it (and change this RFC’s topic to clang only).

This seems like an entirely reasonable idea, especially if it can be done in the frontend only.

Having it be purely-Clang-frontend-based would have another extremely nice side-benefit: we could also teach it to emit a diagnostic with a fixit attached for what the user should have written, if they wanted to write their code to be standards-conforming instead of intentionally depending on nonstandard behaviors.

…although unsigned overflow checking with -fsanitize=unsigned-integer-overflow (which I guess the kernel wants to use) makes that a bit weird, since unsigned overflow is standards-conforming, unlike the if (a + b < a) “overflow checking” silliness.

What would we even want to recommend that someone replace unsigned long x = -1UL; with?

At the time the overflow-checked binary operator is emitted I don’t think I have enough context to match the pattern against

You should be able to add handling to the emission of BO_LT; if the LHS is an add in the specific form you’re looking for, use an alternate codepath to emit the add, instead of just recursively visiting the BO_Add.

~0UL?

2 Likes

The kernel uses all kinds of magic negative constants. I did some passes trying to change these a few months ago and it got really cumbersome. E.g. ULONG_MAX - 1 being -2UL, etc. There would be (fairly reasonable) push back saying, “but the old way works and is more readable”. :slight_smile:

The way I’ve been framing all of this in my mind is about whether a given code pattern is expecting to handle the overflow (if so, the sanitizer needs to avoid instrumenting it). The a + b < a case shows this well: the operation is literally checking for the overflow. while (var--) explicitly wants to hit the bottom of the range, and the overflow is an intended outcome. -3UL was deliberately trying to offset from the type’s max value, etc.

Anyway, I’m obviously a fan of this series. It’s in line with what has been requested from the Linux dev community after working through our earlier proposal (detailed in the LWN article Justin linked to).

1 Like

@efriedma-quic @jyknight

I think I am on the right track with frontend-only if (a + b < a) approach (i.e: the tests that passed with my previous IR-based optimization approach still pass). I’ve ripped out the non-clang code now too :slight_smile:

I’ve managed to get the implementation down to about 150 lines too. I’m opening a PR later this week. I’ll ping here again when it’s up.

Here’s the PR. I would appreciate some review!