Hey LLVM-dev,
We propose to build the CSI framework to provide a comprehensive suite of compiler-inserted instrumentation hooks that dynamic-analysis tools can use to observe and investigate program runtime behavior. Traditionally, tools based on compiler instrumentation would each separately modify the compiler to insert their own instrumentation. In contrast, CSI inserts a standard collection of instrumentation hooks into the program-under-test. Each CSI-tool is implemented as a library that defines relevant hooks, and the remaining hooks are “nulled” out and elided during link-time optimization (LTO), resulting in instrumented runtimes on par with custom instrumentation. CSI allows many compiler-based tools to be written as simple libraries without modifying the compiler, greatly lowering the bar for
developing dynamic-analysis tools.
================
Motivation
Key to understanding and improving the behavior of any system is visibility – the ability to know what is going on inside the system. Various dynamic-analysis tools, such as race detectors, memory checkers, cache simulators, call-graph generators, code-coverage analyzers, and performance profilers, rely on compiler instrumentation to gain visibility into the program behaviors during execution. With this approach, the tool writer modifies the compiler to insert instrumentation code into the program-under-test so that it can execute behind the scene while the program-under-test runs. This approach, however, means that the development of new tools requires compiler work, which many potential tool writers are ill equipped to do, and thus raises the bar for building new and innovative tools.
The goal of the CSI framework is to provide comprehensive static instrumentation through the compiler, in order to simplify the task of building efficient and effective platform-independent tools. The CSI framework allows the tool writer to easily develop analysis tools that require
compiler instrumentation without needing to understand the compiler internals or modifying the compiler, which greatly lowers the bar for developing dynamic-analysis tools.
================
Approach
The CSI framework inserts instrumentation hooks at salient locations throughout the compiled code of a program-under-test, such as function entry and exit points, basic-block entry and exit points, before and after each memory operation, etc. Tool writers can instrument a program-under-test simply by first writing a library that defines the semantics of relevant hooks
and then statically linking their compiled library with the program-under-test.
At first glance, this brute-force method of inserting hooks at every salient location in the program-under-test seems to be replete with overheads. CSI overcomes these overheads through the use of link-time-optimization (LTO), which is now readily available in most major compilers, including GCC and LLVM. Using LTO, instrumentation hooks that are not used by a particular tool can be elided, allowing the overheads of these hooks to be avoided when the
I don’t understand this flow: the front-end emits all the possible instrumentation but the useless calls to the runtime will be removed during the link?
It means that the final binary is specialized for a given tool right? What is the advantage of generating this useless instrumentation in the first place then? I’m missing a piece here…
instrumented program-under-test is run. Furthermore, LTO can optimize a tool’s instrumentation within a program using traditional compiler optimizations. Our initial study indicates that the use of LTO does not unduly slow down the build time
This is a false claim: LTO has a very large overhead, and especially is not parallel, so the more core you have the more the difference will be. We frequently observes builds that are 3 times slower. Moreover, LTO is not incremental friendly and during debug (which is very relevant with sanitizer) rebuilding involves waiting for the full link to occur again.