RFC: Clang Automatic Bug Reporting

Hi all,

It would be great if Clang provided better support for users who want to file
bugs. We frequently have to go through multiple iterations to get all the data
we need, and we also sometimes run into cases where it is very hard / impossible
to reproduce a problem locally. The latter problems tend to show up frequently
with precompiled headers or supplemental files like header maps, where they
depend very much on the build system and the layout of source on the user's
system -- such tests can be a real pain to analyze right now.

The following proposal is for new feature work to let Clang automatically
generate bug reports.

Goals

Frontend / Single-File Focused:

The goal of this work is to support generating bug reports for parse failures,
crashes, and trivial miscompiles. It is not designed to support generating
test cases where a large application is miscompiled, for example. Generally,
it is designed to support the case where the user runs a single Clang command
on their system, it doesn't work (crashes, produces obviously invalid output,
etc.), and they want a Clang developer to be able to reproduce the problem.

Easy-to-use:

We want people to use it, so it has to be simple and it has to work almost all
the time.

Near-perfect Bug Reproduction:

We want it to be almost guaranteed that the generated bug report reproduces
the problem. It isn't possible to be perfect, but we would like to get very
close.

Report Non-Compiler API Bugs:

Currently, bugs in the compiler are usually easy to reproduce for users who
know how to generate preprocessed or LLVM IR files. However, bugs in
other areas of Clang like the libclang interfaces are much harder to
reproduce. Any solution should address (or help address) this problem.

Support auto-minimizing / anonymizing test cases:

This won't happen soon, but I would like any solution to support this in some
reasonable fashion. This is primarily a nice-to-have, but it is also important
because it makes it more likely users will actually bother to submit a test
case in situations where they are worried about disclosing their source code.

User Interface

The Clang driver will get two new options:

'--create-test-case PATH'

  This will cause the driver to create a self-contained test case at PATH,
  which contains as much of the information needed to reproduce the compiler's
  actions as possible.

'--replay-test-case PATH'

  This will cause the driver to replay the test case as best as possible. The
  driver will still support additional command line options, so the usual use
  model would be to run '--replay-test-case' to verify the problem reproduces,
  then either fix the problem directly or use additional command line options
  (-E, -###, -emit-llvm, etc.) to isolate / minimize the problem.

At some point, we'll want to support automatic creation of test cases when Clang crashes (e.g., by installing the appropriate signal handler). However, the --create-test-case command-line option will be a huge improvement even before then.

Implementation

Conceptually, what we want to capture in the test case is as much of the user's
environment as is required to reproduce the problem. The environment consists of
a lot of things which might change the compiler's behavior: the OS, the hardware,
the file system, the environment variables, the command line options, the
behavior of external programs, etc. We obviously cannot package up all of these
things, but Clang is portable and always a cross compiler, and most bugs can be
reproduced on different hardware or a different OS (with the right options).

The implementation is to try and capture each piece of the environment as best
we can:

- For the OS and hardware, we will just record the OS and CPU information, and
  when replaying the test case we will use that information instead of the host
  information. This will require a few additional hooks, but should be
  straightforward.

This is mainly encoded in the -cc1 command-line options, no?

- Command line arguments and environment variables can just be saved to the
  test case and restored on replay.

- For external programs the driver calls like 'as' and 'ld', all we can expect
  to do in general is store the version information for the program, so that
  developers can at least try to replicate the host environment if necessary
  (and if the failure actually depends on the particular version of one of
  those tools, which it usually doesn't).

- The file system is the main piece we cannot currently deal with. Usually we
  have users give us preprocessed files to avoid depending on the user's file
  system, but this does not always suffice to reproduce problems.

  My plan here is to rework parts of Clang to add support for a "virtual file
  system" which would live under the FileManager API layer. When the driver is
  generating test cases, it would use this interface to keep track of all the
  directories and files that are accessed as part of the compilation, and it
  would serialize all this information (the file metadata and contents) into
  the bug report. When the driver is replaying a test case, it would construct
  a new virtual file system (like a private chroot, essentially) from the bug
  report. This is the main implementation work, described below.
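
To make the record / replay idea concrete, here is a minimal sketch in Python (not Clang code). The names RecordingFS, ReplayFS and demo are hypothetical, as is the JSON bundle format. It only captures file contents, the command line and environment variables; file metadata, directory listings and failed lookups, which a real implementation under FileManager would also need, are left out.

```python
import json
import os
import tempfile


class RecordingFS:
    """Hypothetical recorder: remembers the contents of every file read through it."""

    def __init__(self):
        self.files = {}  # path -> file contents (text)

    def read(self, path):
        with open(path, "r") as f:
            data = f.read()
        self.files[path] = data
        return data

    def serialize(self, argv, env):
        # Bundle the command line, environment, and captured files together.
        return json.dumps({"argv": argv, "env": env, "files": self.files})


class ReplayFS:
    """Hypothetical replayer: serves reads only from a captured bundle."""

    def __init__(self, bundle):
        report = json.loads(bundle)
        self.argv = report["argv"]
        self.env = report["env"]
        self.files = report["files"]

    def read(self, path):
        # Anything outside the bundle looks like a missing file, as in a chroot.
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


def demo():
    with tempfile.TemporaryDirectory() as d:
        header = os.path.join(d, "t.h")
        with open(header, "w") as f:
            f.write("#define X 1\n")
        rec = RecordingFS()
        rec.read(header)
        bundle = rec.serialize(["clang", "-c", "t.c"], {"CPATH": d})
    # The temporary directory is gone, but the replay still sees the header.
    replay = ReplayFS(bundle)
    return replay.read(header), replay.argv
```

The point of the sketch is that the replay side never touches the real file system, so the developer's machine layout cannot leak into the reproduction.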

I love the idea of handling this through the virtual file system.

There will be lots more details to be sorted out, but I wanted to give a heads
up on the basic approach I am planning on taking, assuming I can find the time
to work on this. Comments appreciated!

Looks great to me. I'll likely have more comments as the implementation details start getting ironed out.

  - Doug

Sounds great to me!

This interface sounds like it might also be useful for a distcc-like
implementation.

This is something I am very interested in working on and have thought
of before. The basic plan was to first figure out which part of clang
was crashing, and then use the information in the layers above that to
guide the delta algorithm. This part could actually be implemented now
as either part of the driver, or as a separate tool and be useful
right away.

As for the rest of the proposal; I love it.

- Michael Spencer

and eventually could use a Clang based delta tool to minimize the input source.

Yes, definitely, this is a very worthy project. I have an
implementation of the delta algorithm ready and waiting in llvm/ADT,
and have bits and pieces of a Clang based delta lying around on my
hard drive, but nothing concrete enough to be worth checking in. If
someone else wanted to work on this it would make me very happy. :)

- Daniel

Cool, could you post patches of the work you have so far? Or put it up
on your github account against earl's mirror?

- Michael Spencer

I have done a fair amount of work on generating and reducing clang bugs. As time goes on, the bugs found involve increasingly complex code, so I think a system like the one you describe would be valuable, and the VFS seems like a good approach. I will just say a few words on automatic reduction. The short version is "I wouldn't try it, at least as a first pass", although anonymisation is a good idea.

Automatic reduction can be very expensive. A bug involving boost can often lead to megabytes of pre-processed code, and > 10 second compile times. In this situation, any kind of testcase reduction by bisection can take a day or longer. We should certainly warn users before we start running such a process!

I don't know how much better a clang-based system could do. I would suggest concentrating on anonymizing rather than auto-minimizing test cases, although anonymization would probably involve some degree of reduction. I think the important thing for users is that it is fast and removes as many details of their code as possible, rather than that the result is particularly small.

I imagine that quite a lot of progress could be made on anonymisation by a certain degree of random shuffling, for example randomly swapping < for > or == (allowing for overloaded operators of course), adding and removing constants, changing names. This wouldn't always work, but would be quick, and likely make the code extremely hard to reconstruct.
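
As a rough illustration of just the "changing names" part, here is a minimal Python sketch. The anonymize function and its tiny KEYWORDS set are hypothetical. It works on raw text with a regular expression, so it also renames standard library names and words inside strings and comments; a real tool would sit on top of the Clang lexer and would have to re-check that the bug still reproduces afterwards.

```python
import re

# A small, deliberately incomplete keyword set (assumption for this sketch).
KEYWORDS = {"int", "return", "if", "else", "for", "while", "void",
            "struct", "class", "template", "typename"}

# Whole words starting with a letter or underscore; numeric literals are skipped.
IDENT = re.compile(r"\b[A-Za-z_]\w*\b")


def anonymize(source):
    """Rename every non-keyword identifier to a consistent opaque name."""
    names = {}

    def rename(match):
        word = match.group(0)
        if word in KEYWORDS:
            return word
        if word not in names:
            names[word] = "id%d" % len(names)
        return names[word]

    return IDENT.sub(rename, source)
```

For example, anonymize("int secret(int key) { return key; }") gives "int id0(int id1) { return id1; }". Each name is mapped consistently, so the structure of the code survives while the original names do not.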

For reduction I use a cobbled-together bunch of scripts. These are fairly successful, if fairly expensive. One problem with automatic testcase reduction is keeping the code valid; if it was valid originally, preserving that is usually a good idea. I do this by using g++, which at least produces an approximation of "valid" ;)

The main things my code cannot reduce well (as it does not understand C++) are:

1) Constructions involving things like enable_if,
2) Removing parameters to functions and templates which are not used (which can be a big help if those parameters are some very complex type), as they have to be removed at both the definition and call sites at the same time.

This of course ignores one big class of bugs, "wrong result" bugs (as opposed to the compiler misbehaving). Reducing such bugs is extremely difficult. The best I have done is simultaneously running the reduced code through g++ and clang++, and running the resulting executables under valgrind (to try to avoid the code happening to produce the correct result while reading from uninitialized memory). Even with this, such reductions usually end up in an incorrect local minimum if attempted automatically.

Hey Daniel,

Hi Collin,

As I said, there really isn't anything coherent worth checking in that
isn't already there. I can check in a basic skeleton, but we are
talking about a trivial amount of code.

I'll assume you have seen:
1. include/llvm/ADT/DeltaAlgorithm.h and include/llvm/ADT/DAGDeltaAlgorithm.h
2. clang/utils/token-delta.py

The first cut at a clang based delta I was planning on was:
1. Run the clang lexer to dice the input source(s) into raw tokens.
All minimization is done based on tokens.
    - This is already what token-delta.py is doing, although it is a hack.
    - By itself, this will be unusably slow, compared to the standard
'multidelta' tool.
2. Build an approximate location based DAG:
    - Use the libclang APIs to traverse the AST and extract all range
information.
    - Construct the DAG by treating any token/range as depending on
the other ranges it overlaps. Ignoring overlapped ranges, this will
give us a tree structure over all the input tokens.
3. Minimize using DAGDeltaAlgorithm.
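
For step 2, a minimal Python sketch of turning source ranges into a tree might look like the following. The function build_range_tree is hypothetical, ranges are assumed to be half-open token offsets, and they are assumed to nest properly; partially overlapping ranges, which the real thing would have to deal with, are not handled here.

```python
def build_range_tree(ranges):
    """Map each (start, end) range to its smallest enclosing range, or None.

    Ranges are half-open token offsets and are assumed to nest properly
    (partially overlapping ranges are not handled in this sketch).
    """
    # Sort by start, with longer ranges first so parents precede children.
    ordered = sorted(set(ranges), key=lambda r: (r[0], -r[1]))
    parent = {}
    stack = []  # chain of ranges that may still enclose later ones
    for r in ordered:
        # Drop ranges that end before this one ends; they cannot contain it.
        while stack and stack[-1][1] < r[1]:
            stack.pop()
        parent[r] = stack[-1] if stack else None
        stack.append(r)
    return parent
```

The parent links are the dependency edges: a range can only be kept if its parent is kept, so removing a function body removes everything nested inside it in one step.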

The basic idea here is that this subsumes multidelta by generating
minimization items ("changes", in the parlance of the original paper)
for any "interesting structure" which has an AST element with a correct
range. Multidelta effectively only could make minimization items for
compound blocks.
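
For readers who have not seen the delta algorithm, here is a minimal Python sketch of a plain (non-DAG) delta-debugging loop over a flat list of changes. It is not the code in DeltaAlgorithm.h, and the ddmin name and the test callback are assumptions made for this sketch. In practice test would write the candidate out and run the compiler on it, which is where all the time goes.

```python
def ddmin(items, test):
    """Return a subset of items for which test() still returns True.

    test(candidate) must return True when the candidate still shows the bug.
    """
    assert test(items)
    n = 2  # current number of chunks to split into
    while len(items) >= 2:
        chunk = max(len(items) // n, 1)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if test(subset):
                # One chunk alone reproduces the bug: keep only that chunk.
                items, n, reduced = subset, 2, True
                break
            if test(complement):
                # The bug survives without this chunk: drop it.
                items, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(items):
                break  # already at single-item granularity
            n = min(n * 2, len(items))
    return items
```

With a flat token list every item is independent, which is why so many candidates are invalid code. Grouping items by AST range, as described above, cuts the number of candidates the loop has to try.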

All in all this shouldn't be too hard, it is just various bits of
gluing. The current code I have doesn't do anything interesting, the
token-delta.py is already more sophisticated than it.

The other "piece" I have lying around is a CIL-based minimization
tool (in OCaml) which constructed a dependency graph using actual
source dependencies. For example, functions would get edges for the
types they depended on. This has the potential to minimize much more
efficiently, but it is also more fragile. The code is horrible, and I
don't feel like sharing it, I'd rather reimplement it on top of Clang
one day. :)

- Daniel