[GSOC] Bug Validation in the Clang Static Analyzer using the Z3 SMT Solver

Bug Validation in the Clang Static Analyzer using the Z3 SMT Solver

Mikhail R. Gadelha
University of Southampton

The full list of commits pushed during this project can be found in:

A detailed version of this report, including a detailed timeline of the
development can be found in:
Summary. The goal of this project is reduce the number of false bugs reported
to the user by the clang static analyzer (CSA), without introducing too much
overhead to the analysis. A new option was added to the CSA,
--crosscheck-with-z3, to refute (or validate) bugs, using the Z3 SMT solver.
The new option works by adding an extra step in the program analysis after the
bug is found by the CSA built-in solver but before reporting it to the user; the
path and the constraints that trigger the bug are encoded in SMT and checked
for feasibility. The new option was evaluated in twelve C/C++ open-source
projects and the empirical results show that, if there are refuted bugs the
analysis is 2.5% faster while only 4.6% slower when no bug is refuted. The bug
refutation is committed into the clang 7.0 source code.

1. Introduction

The Clang project provides a set of tools and infrastructures for  languages in
the C language family (C, C++, Objective C/C++, OpenCL, CUDA, and RenderScript),
including an industrial-grade compiler and a static analyzer.

The analyzer transforms potential execution paths within the program into
systems of equations over unknown values it encounters within the program, and
decides whether a path is feasible by figuring out if the respective system of
equations has at least one solution, using a built-in solver called
RangedConstraintManager. It was designed to be fast, so that it can provide
results for common mistakes (e.g., division by zero or use of uninitialized
variables) even in complex programs. However, the speed comes at the expense of
precision; it cannot handle some arithmetic operations (e.g, remainders) or
bitwise expressions. In these cases the analyzer discards the constraints and
might report false bugs. Consider the following program:

void foo(unsigned width)
  int base;
  int i = 0;

  if (i % width == 0)
    base = 1;

  assert(base == 1);

The program is safe, i.e., the assertion at the end of the function foo always
holds, because the remainder expression that guards the initialization of base
is always false, since i = 0. However, If the analyzer is used to check the
program, the following (wrong) report is presented:

$ clang --analyze main.c
main.c:9:15: warning: The left operand of '==' is a garbage value
  assert(base == 1);
         ~~~~ ^
1 warning generated.

Previous attempts to integrate an SMT solver into the CSA caused the analysis
time to increase up to 20 times compared to the CSA built-in solver. We propose
a new approach, to use the fast and imprecise CSA built-in solver to analyze
the project and find bugs first, then use the slower and more precise SMT
solver to refute (or validate) them; a bug is only reported if both solvers
agree that the bug is feasible.

The new bug refutation option (--crosscheck-with-z3) was implemented and
evaluated by analyzing twelve C/C++ open-source projects of various size (tmux,
redis, openssl, twin, git, postgresql, sqlite3, curl, libWebM, memcached,
xerces-c, and XNU). When analyzing these projects with bug refutation enabled,
the slowdown is negligible in cases where no bug is refuted (average 4.6%
slowdown) and, when it finds false bugs, the bug validation gives a small speed
up (average 2.5% speed up). On average, 11.71 bugs per program were refuted,
ranging from 1 bug refuted in redis to 51 being refuted in XNU.

2. Bug Refutation in the Static Analyzer

The bug refutation using SMT in the static analyzer was implemented in a series
of patches but the main ones were:


We chose to follow an incremental approach for the project by creating a
prototype first and improving the performance later. The first patch, D45517,
implemented the bug refutation as a visitor, executed before reporting a bug.
The bug refutation visitor was executed along with a series of visitor in the
bug report equivalence class (which holds bug reports with similar bugs and
paths), and would simply walk the bug path, encode all constraints and check
for feasibility. This approach was successful in refuting bugs but it was slow;
the analysis time was increased in all the projects and the worst case was
SQLite3, which went up from 1379s to 3264s (2.36x increase).

After the initial prototype, we started to work on optimizations. The first
patch to tackle the problem was D48227, which changed how constraint ranges
were encoded if the lower bound was equal to the upper bound (i.e., it was not a
range but a concrete value). This change improved the solving time by 3% per
query (on average), and decreased the SQLite3 analysis time from 3264s to 2200s
but it was still 1.59x the analysis time without refutation (1379s).

The second optimization patch was D48565, and it changed how the constraints
were collected from the bug path; instead of naively collecting all constraints
in a path (which would include several duplicated constraints), only unique
constraints would be collected and any duplicated constraint would be ignored.
After this patch, the analysis time with bug refutation increased by less than
20% in all projects.

The next patch, D48650, aimed to increase the precision of the bug refutation
visitor. When writing the proposal for this project, we noticed that a number
of complex constraints were being dropped by the static analyzer and an
incomplete set of constraints were reaching the bug refutation visitor. This
patch removed the code that was discarting constraints and adjusted threshold
values for symbolic expression complexity in the symbolic execution engine;
since this changed a core aspect of the static analyzer, several tests were
performed to find the best value for the threshold where the static analyzer
would still report the same bugs in the same amount of time as before. After
this patch, the bug refutation visitor was able to refute even more bugs.

However, as a consequence of having the full set of constraints being encoded,
the bug refutation started to slowdown the analysis time considerably in cases
where the bug report equivalence class had several bug reports and, in
particular, analysis time of redis went from 349s to 1447s (4.14x increase).
Patch D49693 improved that by changing how the bug refutation visitor was
applied to a bug report. Instead of trying to refute the bug reports before
every other visitor could run, the fast visitors would be executed first and,
only if the other visitors marked the bug as valid would the bug refutation
analyze it. This final improvement fixed the slowdown in redis and, to our
surprise, actually made the static analysis faster in a number of cases: when
bugs are removed, the static analyzer completes the analysis faster because it
generates fewer reports.

2.1 Other improvements

During this project, several other improvements were submitted to the clang
static analyzer, including fixes to the Z3 backend (D48324, D49305, D49430,
D49818 and D49769), improvements to the symbolic expression computation (D49093
and D48650) and the development of a generic SMT API in clang, independent of
the static analyzer (D49233, D49236, D49550, D49551 and D49495). In particular,
the generic SMT API makes implementing a new SMT solver in clang much simpler.

3. Results

We evaluated the new bug refutation option in twelve open-source C/C++

1. tmux (v2.7): a terminal multiplexer.
2. redis (v4.0.9): an in-memory database.
3. openSSL (v1.1.1-pre6): a software library for secure communication.
4. twin (v0.8.0): a windowing environment.
5. git (v2.17.0): a version control system.
6. postgreSQL (v10.4): an object-relational database management system.
7. SQLite3 (v3230100): a relational database management system.
8. curl (v7.61.0): command-line tool for transferring data.
9. libWebM (v1.0.0.27): a WebM container library.
10. memcached (v1.5.9): a general-purpose distributed memory caching system.
11. xerces-c++ (v3.2.1): a validating XML parser.
12. XNU (v4570.41.2): an operating system kernel used in Apple products.

Instructions on how to reproduce our results are available in:

We analysed the analysis time and the number of bugs removed for each one of
them, as shown in the following table.

Project    |  Analysis time  | Analysis time  | Number of bugs  |  Number of   |
           > (no refutation) | (+ refutation) | (no refutation) | refuted bugs |
tmux       |           90.35 |          96.28 |              19 |            0 |
redis      |          328.85 |         316.22 |              86 |            1 |
openSSL    |          128.67 |         119.12 |              36 |            2 |
twin       |          205.86 |         207.87 |              52 |            0 |
git        |          384.00 |         356.75 |              69 |           11 |
postgreSQL |         1024.68 |        1074.73 |             184 |            5 |
SQLite3    |         1016.38 |        1057.99 |              82 |           14 |
curl       |           80.68 |          81.56 |              39 |            0 |
libWebM    |           41.15 |          42.10 |              6  |            0 |
memCached  |           89.41 |         100.22 |              25 |            0 |
xerces-C++ |          408.91 |         387.17 |              58 |            2 |
XNU        |         3564.91 |        3464.46 |             441 |           49 |

Out of the 12 analyzed projects, there were refuted bugs in 7 of them: redis,
openssl, git, postgresql, sqlite3, xerces and XNU. On average, 12 bugs were
refuted when analyzing these projects, ranging from 1 bug refuted in redis to
49 refuted in XNU. The average time spent analyzing these projects was 11.42
seconds faster, which represents a 2.5% speed up. This means that, on average,
each removed refuted bug speeds up the analysis time by 2.83 seconds, as the
static analyzer does not need to generate the report for these bugs. On the 5
projects where no bug was refuted (tmux, twin, curl, libWebM and memcached),
the analysis was 4.12 slower on average, which represents a 4.6% slowdown.

In summary, the bug refutation implemented in the clang static analyzer
introduces a minimal overhead in the analysis and in most cases it makes the
analysis faster.

3.1. In Depth Analysis: git

We chose one of the projects and analyzed all the bugs. On git, the Analyzer
finds 69 reports, most of which seem valid but not very useful because
exploiting them requires forging contents of the .git directory. On top of that,
however, it also finds 11 false warnings caused by improper constraint solving,
which we managed to refute.

4. Usage

The bug refutation in the static analyzer is disabled by default and it’s hidden
behind the flag --crosscheck-with-z3. In order to use the flag, an user needs to
build clang with Z3 since the solver is not shipped with the compiler due to
licensing issues.

In order to build clang with Z3, one needs to set four flags when configuring
the project:

* Z3_EXECUTABLE: should point to the Z3 binary (it’s used to check the version).
* Z3_INCLUDE_DIR: should point to the Z3 include directory.
* Z3_LIBRARIES: should point to the Z3 lib (libz3.so).
* CLANG_ANALYZER_BUILD_Z3: to enable the Z3 compilation.

Once the user has a version of clang built with Z3, the bug refutation can be
enabled by passing -analyzer-config crosscheck-with-z3=true when calling the
clang static analyzer.

5. Future Works

The development of a generic SMT API opens the door for new SMT solvers in
clang. The API is minimal and only requires the SMT solver to support
quantifier-free bitvector theories (QF_BV), which includes all the major SMT
solver currently available.

Another future work is a small tool that I developed to reduce programs.
Although there are programs like C-Reduce which considerably reduce programs,
the reducing process can be extremely slow on preprocessed files (3MB+ file is
not uncommon).

The tool I’ve developed can considerably reduce preprocessed files by targeting
unused types and functions in a file. When analyzing a bug report, we are
interested only in the path that triggers the bug, so the tool uses an
attribute already present in the clang AST (isUsed) to, given a target
function, remove any unrelated code. The resulting file is smaller than the
preprocessed file and C-Reduce is able to reduce much faster.

The tool is available in [https://github.com/mikhailramalho/reduce](https://github.com/mikhailramalho/reduce).

6. Acknowledgement

I would like to thank my mentors George Karpenkov and Artem Dergachev for the
guidance during the development of this project and, in particular, George
Karpenkov for the idea for the project. Furthermore, I’d like to thank Devin
Coughlin and Anna Zaks for introducing me to George and Artem, Réka Kovács for
the initial prototype of the bug refutation visitor and, Dominic Chen and Gábor
Horváth for the comments on the patches.