Manipulating/transforming ASTs for instrumentation


I need to instrument C++ programs to keep track of some variables of interest and the branches taken. I tried to write a Clang plugin that modifies the AST to insert calls, but the AST I generate seems to break code generation later on. I want to work at the AST level because some of the relationships to source-code constructs are lost at later stages of the compilation pipeline, e.g. LLVM bitcode.

What is the most reliable way to inspect the AST and modify the program (e.g. to remove, add, or change some statements)? The approaches I found are below but I couldn’t figure out which would be the best option for implementation:

  • Using LibTooling and Rewriter to rewrite the source code. This approach injects raw text into the program, which seems error-prone to me.

  • Writing a TreeTransform in Sema to recursively build a new AST with the intended changes. This seems like the best option right now, but it requires modifying the compiler rather than “just writing a plugin”. Is that correct?

  • Modifying CodeGen to emit extra/different code at program points of interest. This seems similar to the Sema approach, except that it creates LLVM instructions rather than AST nodes.
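
To make the concern about the first option concrete, here is a minimal sketch of what I mean by injecting raw text. It does not use Clang at all: `Insertion`, `applyInsertions`, `instrumentExample`, and the `trace_branch` call are hypothetical names I made up for illustration, and the offsets are found with a plain string search instead of source locations from the AST. It only shows the general idea of offset-based text insertion as I understand it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type: text to inject at a byte offset of the
// ORIGINAL source buffer (not of the partially edited one).
struct Insertion {
    std::size_t offset;
    std::string text;
};

// Apply all insertions to a copy of `source`. Because offsets refer to the
// original text, they are applied from the highest offset down, so the
// lower offsets stay valid while the tail of the string grows.
// Note: insertions sharing an offset end up in reverse input order.
std::string applyInsertions(const std::string& source,
                            std::vector<Insertion> edits) {
    std::stable_sort(edits.begin(), edits.end(),
                     [](const Insertion& a, const Insertion& b) {
                         return a.offset > b.offset;
                     });
    std::string out = source;
    for (const Insertion& e : edits) {
        assert(e.offset <= source.size());  // offset must lie in the original
        out.insert(e.offset, e.text);
    }
    return out;
}

// Instrument the then-branch of a brace-less `if`. The statement has to be
// wrapped in braces; otherwise the injected call would become the whole
// branch and `y++;` would run unconditionally.
std::string instrumentExample() {
    const std::string source = "if (x > 0) y++;";
    const std::string stmt = "y++;";
    const std::size_t begin = source.find(stmt);
    assert(begin != std::string::npos);
    const std::size_t end = begin + stmt.size();
    return applyInsertions(source, {
        {begin, "{ trace_branch(1); "},  // trace_branch is a made-up hook
        {end, " }"},
    });
}
```

Even in this tiny case, forgetting the braces would silently change the program's behaviour, which is the kind of mistake that makes me wary of the text-based approach.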

A code snippet showing how I am trying to manipulate the AST is below, in case it helps or there is a way to do this with the plugin infrastructure: