Hi all,
This RFC is about how we can make LLVM’s optimizations preserve more potential bugs that can be caught with the sanitizers that are part of LLVM (especially ASan, MSan and UBSan).
TL;DR: The optimizer (correctly) transforms code without considering the potential sanitizer checks that get removed by the transformation. Given that sanitizers are often combined with the optimizer for good reasons, should we have an automatic pipeline/mode that selectively disables these few ‘dangerous’ transformations.
Background
Clang/LLVM comes with several different sanitizers that automatically inject checks into compiled programs. These checks catch undefined behavior or other kinds of bugs during execution and then report them in some way (usually by printing an error and then aborting the execution). Sanitizers are successfully used in various ways from hardening software in production to finding bugs by enabling them when doing manual testing or fuzzing.
Sanitizers are also frequently used in combination with LLVM’s own optimization pipeline. For example, the speed at which fuzzers can find bugs is limited by the execution time of the program, so usually fuzzers test a sanitized and heavily optimized version of the target program. But manual testing is also usually done with sanitizers and optimizations enabled because it simply reduces the amount of computing resources spent on running tests. One example in the wild is Chrome’s documentation which recommends turning on the different sanitizers as well as compiling in release mode (which enables optimizations).
Mixing Optimizations and Sanitizers
The general assumption that users have when turning on both sanitizers and optimizations is that they end up with a fast program that triggers the same sanitizer reports as the slow unoptimized program [1]. In reality however the optimized program might end up producing fewer or no sanitizer reports[2].
There are two main reasons for why an unoptimized program with a sanitizer failure will actually pass after optimizations:
- A sanitizer check that is probabilistic now just passes by accident. For example, an load from an out-of-bounds pointer with ASan might point to a different but unpoisoned memory region in the optimized version, while in the unoptimized version it accesses poisoned memory.
- Code containing a sanitizer check and ends up ‘being’ fixed because an optimization transforms code in a way that is only equivalent in a program that is free of undefined behavior. For example, SimplifyLIbCalls currently replaces the pattern
strlen(x) == 0
with*x == ‘\0’
which is a correct transformation in a well-formed program. But ifx
has no null terminator this does an out-of-bounds read, so the code is not equivalent when considering potential sanitizer failures (in this case ASan).
This RFC is only about missing sanitizer reports that fall into the second category.
[1] Or at least passes/fails in the same way as the unoptimized sanitized binary, so getting a different sanitizer failure should be fine.
[2] It might also produce more, but that’s a different issue out of scope.
Current situation
So I looked at UBSan, MSan and ASan and tried to make various failing programs pass by simply optimizing them. The end result was about 15 cases where we end up with an optimizer pass ‘fixing’ a failing program even with the respective sanitizer enabled. I’m about to file the respective bugs on GitHub within the next week(s) and then list them below as I file them. I’ll link to an existing dupe if I can find one:
- UBSan check optimized away in function marked with __attribute__((const)) · Issue #55225 · llvm/llvm-project · GitHub
- MSan miss on O1 optimization level (for CVE-2016-20014) · Issue #55595 · llvm/llvm-project · GitHub
- InstCombine strlen(x)==0 folding removes ASan heap-buffer-overflow failure · Issue #55525 · llvm/llvm-project · GitHub
- Double-free not caught by ASan unless -fno-builtin is specified or optimizations disabled · Issue #55590 · llvm/llvm-project · GitHub
- asan/lsan may fail to catch "obvious" bugs due to malloc/new being optimized away · Issue #374 · google/sanitizers · GitHub (already filed)
- memcpy -> load/store transformation removes overlapping memory area checks. · Issue #55678 · llvm/llvm-project · GitHub
Most bugs can be grouped into the following categories:
- Generic optimizations that just remove ‘dead’ code if the only possible side effect is UB (e.g., dead stores/reads, etc.).
- InstCombine/SimplifyLibCalls simplifying libc/builtin calls in a way that doesn’t preserve UB (removing two sequential free calls on a pointer).
- Several passes transforming code related to undef values (this mostly affects MSan).
Also most of the bugs only affect MSan and ASan (as they run late in the optimizer pipeline). UBSan meanwhile is relatively resistant to optimizations as its checks are inserted by the frontend.
There are also a few cases where we don’t actually sanitize certain UB but otherwise have an observable crash/hang being legalized. Those also seem relevant for this RFC so I’ll also list them above (with a note) when I file them.
Why does it matter?
So the question is whether these bugs actually affect anyone or it’s just weird edge cases. I ended up scanning the Ubuntu/Debian package repositories to see what kind of sanitizer reports I can trigger after fixing the found optimizer bugs. This was mostly done by static analysis to make the approach scale to all the software shipped by Ubuntu.
The results are that I ended up with after about 3 weeks of scanning software are:
- about 40 bugs found&fixed in actively maintained software that were missed by the optimized sanitized program (8 of them were fixed by someone else between release and when I scanned the package).
- ~100 bugs in unmaintained software in Ubuntu (I didn’t end up filing them as there was no maintainer to contact).
There is also a large backlog of several thousand reports that I still need to verify and report so the numbers above are subject to change.
Potential solutions
From what I can see we have a several ways to fix the issues:
- Tell users to not use optimizations if they care about these issues. Given the severe performance impact of disabling all optimizations this is probably not a viable solution for most users (especially when they are doing fuzzing).
- A partial solution for some of the MSan/ASan issues is to move the sanitizer passes to the beginning of the pipeline. That effectively disables a lot of optimizations and also adds a rather large overhead because of all the additional inserted checks.
- We have a custom optimizer pipeline/mode that tries to preserve sanitizer checks where possible.
An optimizer mode that tries to preserve UB/sanitizer checks
The last option is what I would propose in this RFC. We have a mode for the optimizer that gets automatically activated alongside sanitizers that tries to preserve UB where possible.
There is already a precedent for doing custom passes in combination with sanitizers (MSan is for example adds another EarlyCSEPass instance to the pipeline, the ‘fuzzing’ sanitizer mode adds a custom ‘optforfuzzing’ function attribute used by SimplifyCFG).
I prototyped this idea by selectively disabling the faulty passes (or parts of passes) in a downstream LLVM fork and the performance overhead from doing so seems reasonably low (about 5%, but the numbers might change depending on how we decide to fix certain issues). My downstream fork has some very intrusive changes to LLVM, so I am not sure if it’s feasible to upstream it in its current form.
However, I think we could fix most of the upstream issues via the following changes:
- We automatically disable some passes that extensively ignore/exploit UB that any (activated) sanitizer is supposed to catch. It’s not clear yet what is the full list, but currently I have DSE, ACDE, (IP)SCCP disabled in my LLVM fork .
- We disable some optimization features in Clang/LLVM (e.g., -fno-builtin would fix several issues related to transforming calls to builtins.)
That still misses a few corner cases (e.g., unused loads to potentially out-of-bounds memory are trivially dead right now and removed everywhere, InstCombine sometimes transforms undef to other values), but from what I can see this would get us most of the way to have more reliable sanitizers with optimizations.