Instrumentation for unit testing

Hi,

I am a PhD student and as part of my research I'd like to ease testing
of legacy code.

There are many legacy enterprise applications that were written
without automated unit tests. It is often very difficult to maintain
and modify such code, since we cannot verify the changes. A frequently
used approach in this situation is to write additional tests without
modifying the original source code. There are several techniques to do
this [1]. However, these techniques have their own limitations and
disadvantages [1,2].

My aim is to provide an alternative technique without those
limitations. I used Clang's sanitizer infrastructure to implement a
prototype of testing-specific instrumentation. The instrumentation
makes it possible to replace any C/C++ function with a corresponding
test double.

[1] martong/finstrument_mock, README.md (GitHub, master branch)
[2] https://accu.org/journals/overload/20/108/ruegg_1927/

Here are a few motivating examples:

### Replace template functions
    // unit_under_test.hpp
    template <typename T>
    T FunTemp(T t) {
        return t;
    }
    inline int foo(int p) {
        return FunTemp(p);
    }

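    // test.cpp
    // Test double; assumed to triple its argument, matching EXPECT_EQ below.
    int fake_FunTemp(int p) { return p * 3; }
    TEST_F(FooFixture, CallFunT) {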
        SUBSTITUTE(&FunTemp<int>, &fake_FunTemp);
        int p = 13;
        auto res = foo(p);
        EXPECT_EQ(res, 39);
    }

### Replace functions in class templates
    // unit_under_test.hpp
    template <typename T>
    struct TemplateS {
        int foo(int p) { return bar(p); }
        int bar(int p) { return p; }
    };

    // test.cpp
    int fake_bar_mem_fun(TemplateS<int>* self, int p) { return p * 3; }
    TEST_F(FooFixture, ClassT) {
        SUBSTITUTE(&TemplateS<int>::bar, &fake_bar_mem_fun);
        TemplateS<int> t;
        auto res = t.foo(13);
        EXPECT_EQ(res, 39);
    }

### Replace (always inline) functions in STL
Consider the following concurrent `Entity`:
    // Entity.hpp
    #include <mutex>
    #include <vector>

    class Entity {
    public:
        int process(int i) const;
        void add(int i);
    private:
        std::vector<int> v;
        mutable std::mutex m;
    };

    // Entity.cpp
    #include "Entity.hpp"
    #include <numeric>

    int Entity::process(int i) const {
        // Non-blocking attempt to take the lock.
        std::unique_lock<std::mutex> lock{m, std::try_to_lock};
        if (lock.owns_lock()) {
            return std::accumulate(v.begin(), v.end(), i);
        }
        // Another thread holds the lock.
        return -1;
    }
    void Entity::add(int i) {
        std::lock_guard<std::mutex> lock{m};
        v.push_back(i);
    }
We can test the behaviour based on whether the lock is already owned
by another thread or not:
    // test.cpp
    #include "Entity.hpp"

    bool owns_lock_result;
    using Lock = std::unique_lock<std::mutex>;
    bool fake_owns_lock(Lock*) { return owns_lock_result; }

    TEST_F(FooFixture, MutexTest) {
        SUBSTITUTE(&Lock::owns_lock, &fake_owns_lock);
        Entity e;
        owns_lock_result = false;
        EXPECT_EQ(e.process(1), -1);
        owns_lock_result = true;
        EXPECT_EQ(e.process(1), 1);
    }

You can find other motivating examples in the finstrument_mock README
(martong/finstrument_mock on GitHub), covering:
- Replacing calls in libc: fopen(), fread()
- Replacing system calls: time()
- Eliminating death tests by replacing `[[noreturn]]` functions

### How does it work?
The idea is to replace each and every function call expression with
the following pseudo code (suppose the callee is `foo`):
    char* funptr = __fake_hook(&foo);
    auto ret = result_of(&foo);
    if (funptr) {
        ret = funptr(args...);
    } else {
        ret = foo(args...);
    }
    return ret;
`__fake_hook` is defined in a runtime library, which must be linked in
once this compiler option is switched on. The result of `__fake_hook`
can be set at run time with the `SUBSTITUTE` macro.
I had to refactor `CodeGenFunction::EmitCall` in order to be able to
modify the generated IR for the call expressions.
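
To make the pseudo code above more concrete, here is a minimal,
hand-written sketch of the same idea in plain C++. It is not the
prototype's implementation: the names `sketch::registry`,
`sketch::substitute`, `sketch::fake_hook`, `sketch::reset` and
`call_identity` are hypothetical stand-ins, and the call site is
rewritten by hand here, whereas the prototype does this in the
compiler for every call expression.

```cpp
#include <cassert>
#include <unordered_map>

namespace sketch {

// Type-erased function pointer, used only as key/value in the table.
using AnyFn = void (*)();

// Hypothetical stand-in for the runtime library's substitution table.
inline std::unordered_map<AnyFn, AnyFn>& registry() {
    static std::unordered_map<AnyFn, AnyFn> table;
    return table;
}

// Rough analogue of SUBSTITUTE: register `replacement` for `original`.
template <typename F>
void substitute(F* original, F* replacement) {
    registry()[reinterpret_cast<AnyFn>(original)] =
        reinterpret_cast<AnyFn>(replacement);
}

// Rough analogue of __fake_hook: the replacement, or nullptr if none is set.
template <typename F>
F* fake_hook(F* original) {
    auto it = registry().find(reinterpret_cast<AnyFn>(original));
    return it == registry().end() ? nullptr
                                  : reinterpret_cast<F*>(it->second);
}

// Drop all substitutions, e.g. between tests.
inline void reset() { registry().clear(); }

}  // namespace sketch

int identity(int p) { return p; }
int fake_identity(int p) { return p * 3; }

// What an instrumented call `identity(p)` would conceptually expand to.
int call_identity(int p) {
    if (auto* replacement = sketch::fake_hook(&identity)) {
        return replacement(p);
    }
    return identity(p);
}

// Small usage example: the same call site changes behaviour at run time.
void demo() {
    sketch::reset();
    assert(call_identity(13) == 13);   // no substitution registered
    sketch::substitute(&identity, &fake_identity);
    assert(call_identity(13) == 39);   // the fake is called instead
    sketch::reset();
}
```

The table is keyed on the original function's address, so the test
can switch a substitution on and off without touching the code under
test. The extra lookup on every call is also where the run-time cost
of this kind of instrumentation comes from.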
You can find more implementation details (difficulties with virtuals,
performance, future work, etc.) here:

Also the diff of the patched compiler is here:

I'd like to ask the Clang community:
- Do you find this a useful contribution?
- Would you like to integrate such experimental instrumentation into
the Clang tree?
If the answer is yes, what are the next steps (review, open questions, etc.)?

Thanks,
Gabor