[RFC] llvm-bisectd: a bisection daemon for supporting bisection with parallel builds

Hi all,

I’d just like to draw attention to some patch reviews for a new tool I’m proposing, llvm-bisectd: https://reviews.llvm.org/D113030

There’s documentation in the Bisection.md file in the patch, which I’ll just paste here for convenience.

# Bisection with llvm-bisectd

## Introduction

The `llvm-bisectd` tool allows LLVM developers to rapidly bisect miscompiles in
clang or other tools running in parallel. This document explains how the tool
works and how to leverage it for bisecting your own specific issues.

Bisection as a general debugging technique can be done in multiple ways. We can
bisect across the *time* dimension, which usually means that we're bisecting
commits made to LLVM. We could instead bisect across the dimension of the LLVM
codebase itself, disabling some optimizations and leaving others enabled, to
narrow down the configuration that reproduces the issue. We can also bisect in
the dimension of the target program being compiled, e.g. compiling some parts
with a known good configuration to narrow down the problematic location in the
program. The `llvm-bisectd` tool is intended to help with this last approach to
debugging: finding the place where a bug is introduced. It does so with the aim
of being minimally intrusive to the build system of the target program.

## High level design

The bisection process with `llvm-bisectd` uses a client/server model, where all
the state about the bisection is maintained by the `llvm-bisectd` daemon. The
compilation tools (e.g. clang) send requests and get responses back telling
them what to do. As a developer, debugging using this methodology is intended
to be simple, with the daemon taking care of most of the complexity.

### Bisection keys

This process relies on a user-defined key that's used to represent a particular
action being done at a unique place in the target program's build. The key is a
string to allow the most flexibility of data representation. `llvm-bisectd`
doesn't care what the meaning of the key is, as long as has the following
1. The key maps onto a specific place in the source program in a stable manner.
    Even if the software is being built with multiple compilers running
    concurrently, the key should not be affected.
2. Between one build of the target software and the next (clean) build, the
    same set of keys should be generated exactly.

For our example of bisecting a novel optimization pass, a good choice of key
would be the module + function name of the target program being compiled. The
function name meets requirement 1. because each module + function string refers
to a unique place in the target program. (A module may not have two functions
with the same symbol name). The inclusion of the module name in the key helps
to disambiguate two local linkage functions with the same name in two different
translation units. The key also satisfies requirement 2. because the function
names are static between one build and the next (e.g. no random auto-generation
going on).

## Bisection workflow

The bisection process has two stages. The first is called the *learning* stage,
and the second is the main *bisection* stage. The purpose of the learning
stage is for the bisection daemon to *learn* about all the keys that will be
bisected through during each bisection round.

The first thing that needs to be done is that `llvm-bisectd` needs to be
started as a daemon.

$ llvm-bisectd
bisectd > _

On start, `llvm-bisectd` initializes into the learning phase, so nothing else
needs to be done.

Then, the software project being debugged is built with the client tools like
clang having the bisection mode enabled. This can be a compiler flag or some
other mechanism. For example, to bisect GlobalISel across target functions,
we can pass `-mllvm -gisel-bisect-selection` to clang.

During the first build of the project, the client tools are sending a bisection
request to `llvm-bisectd` for each key. `llvm-bisectd` in the learning phase
just replies to the clients with the answer "YES". In the background, it's
storing each unique key it receives into a vector for later.

### Bisection phase

After the first build is done, the learning phase is over, and `llvm-bisectd`
should know about all the keys that will be requested in future builds.
We can start the bisection phase now by using the `start-bisection` command in
the `llvm-bisectd` command interpreter.

bisectd > start-bisect
Starting bisection with 17306 total keys to search
bisectd > _

We're now in the bisection phase. Now, we perform the following actions in a
repeatedly until `llvm-bisectd` terminates with an answer.
1. Do a clean build of the project (with the bisection flags as before)
2. Test the resulting build to see if it still exhibits the bug.
3. If the bug remains, then we type the command `bad` into the `llvm-bisectd`
    interpreter. If the bug has disappeared, we type the `good` command instead.

And that's it! Eventually the bisection will finish and `llvm-bisectd` will
print the *key* that, when enabled, triggers the bug.

Bisection completed. Failing key was: /work/testing/llvm-test-suite/CTMark/tramp3d-v4/tramp3d-v4.cpp _ZN17MultiArgEvaluatorI16MainEvaluatorTagE13createIterateI9MultiArg3I5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10BrickViewUESC_SC_EN4Adv51Z13MomentumfluxZILi3EEELi3E15EvaluateLocLoopISH_Li3EEEEvRKT_RKT0_RK8IntervalIXT1_EER14ScalarCodeInfoIXT1_EXsrSK_4sizeEERKT2_

## Adding bisection support in clients

Adding support for bisecting a new type of action is simple. The client only
needs to generate a key at the point where bisection is needed, and then use
client utilities in `lib/Support/RemoteBisectorClient.cpp` to talk to the
daemon. For example, if the bisection is to done for a `FunctionPass`
optimization, then one place to add the code would be to the `runOnFunction()`
method, using the function name as a key.

bool runOnFunction(Function &F) {
  // ...
  if (EnableBisectForNewOptimization) {
    std::string Key = F.getParent()->getSourceFileName() + " "
                        + F.getName().str();
    RemoteBisectClient BisectClient;
    if (!BisectClient.shouldPerformAction(Key))
      return false; // Bisector daemon told us to skip this action.
  // Continue with the optimization
  // ...


Nifty! Look forward to seeing how that shakes out/is built upon/etc.

For the new pass manager, see StandardInstrumentations.cpp, which already does pass skipping based on opt-bisect and optnone.