AST transformations in a plugin

Hi,

I am new to Clang, so perhaps this is a beginner's question.

I am trying to add some instrumentation code, and I want to use a plugin
for that. I use Rewriter to build new code, which consists of
pretty-printed ASTs combined with my own instrumentation code. Is this the
right way to do it? The first problem I encountered is that the new code is
not passed further down the compilation chain; it only gets stored in a
RewriteBuffer. Should I somehow re-parse this code?

Is tooling::RefactoringTool a better method to use than Rewriter? If so,
are there examples available?

Finally, a question about plugins: are they compatible between Clang
versions? (I am afraid that they are not.)

Regards,
John
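
The behaviour described above (edits land in a buffer, while compilation carries on with the code that was already parsed) can be pictured with a toy model. The sketch below is NOT Clang's Rewriter or RewriteBuffer API; `ToyRewriteBuffer`, `insertBefore`, and `instrument_example` are hypothetical names used only to illustrate that rewriting yields new text, which would have to be written out and compiled again before it has any effect.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for a rewrite buffer (hypothetical, not Clang's API).
// Insertions are recorded against offsets in the ORIGINAL text, and the
// original text itself is never modified.
class ToyRewriteBuffer {
public:
    explicit ToyRewriteBuffer(std::string original)
        : original_(std::move(original)) {}

    // Record `text` to be inserted before the character at `offset`
    // (an offset into the original, unmodified text).
    void insertBefore(std::size_t offset, const std::string &text) {
        inserts_[offset] += text;
    }

    // The text the "compiler" already parsed; unaffected by any edits.
    const std::string &original() const { return original_; }

    // Build the rewritten text by replaying the insertions in offset order.
    std::string rewritten() const {
        std::string out;
        std::size_t pos = 0;
        for (const auto &entry : inserts_) {
            std::size_t off = std::min(entry.first, original_.size());
            out.append(original_, pos, off - pos);  // copy untouched text
            out += entry.second;                    // then the inserted text
            pos = off;
        }
        out.append(original_, pos, std::string::npos);
        return out;
    }

private:
    std::string original_;
    std::map<std::size_t, std::string> inserts_;
};

// Insert a (hypothetical) hook call in front of a for loop.
std::string instrument_example() {
    ToyRewriteBuffer buf("for (;;) ;");
    buf.insertBefore(0, "here_comes_a_for_loop(); ");
    return buf.rewritten();
}
```

The result of `rewritten()` is only a string. In this model, as in the situation described in the question, nothing downstream sees the change unless that string is saved and fed to a compiler again.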

If you don't want to rewrite code (i.e., if you have no need for source code to come out of this process), don't. If you want to instrument code to track runtime properties (adding extra checks, profiling, etc.), do that in IR. See, for example, how the sanitizers (ubsan, msan, asan) are implemented: either as pure IR transformations or as extra IR generated by Clang.

Hi,

For my instrumentation I would like to stay close to the source code so that
I can report with accurate line / column numbers and refer to source level
language constructs. So is this feasible to do in a Clang plugin, or is
rewriting the AST a very complex thing to do?

Examples of what I would like to do are adding a function call before every
loop and adding function calls for every array access.

   for(...)
     ;

becomes:

   here_comes_a_for_loop(...);
   for(...)
      ;

and

  a[b[i]]

becomes:

  a[array_index(..., b[array_index(..., i)])]

Regards,
John
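
To make the intended transformation concrete, here is a minimal hand-written sketch of what the instrumented code and its runtime hooks might look like. The hook names come from the example above, but their parameter lists (file, line, column), the counters, and the literal source locations are assumptions made for illustration; the `...` in the original message is not specified.

```cpp
#include <cstddef>
#include <cstdio>

// Counters so the effect of the hooks can be observed (illustrative only).
static int g_loops_seen = 0;
static int g_index_calls = 0;

// Hypothetical hook called before every for loop.
void here_comes_a_for_loop(const char *file, unsigned line, unsigned col) {
    ++g_loops_seen;
    std::fprintf(stderr, "%s:%u:%u: entering for loop\n", file, line, col);
}

// Hypothetical hook wrapped around every array subscript. It logs the access
// and returns the index unchanged, so it can be dropped in where the original
// index expression stood.
std::size_t array_index(const char *file, unsigned line, unsigned col,
                        std::size_t idx) {
    ++g_index_calls;
    std::fprintf(stderr, "%s:%u:%u: array index %zu\n", file, line, col, idx);
    return idx;
}

// Hand-instrumented version of:
//     for (i = 0; i < 4; ++i) sum += a[b[i]];
// The file name and line/column literals are made up.
int instrumented_sum() {
    int a[4] = {10, 20, 30, 40};
    std::size_t b[4] = {3, 2, 1, 0};
    int sum = 0;

    here_comes_a_for_loop("example.c", 5, 5);
    for (std::size_t i = 0; i < 4; ++i)
        sum += a[array_index("example.c", 6, 18,
                             b[array_index("example.c", 6, 20, i)])];
    return sum;
}
```

Because `array_index` returns its argument, the instrumented expression evaluates to the same element as `a[b[i]]`; a checking variant could compare `idx` against a bound before returning it.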

> Hi,
>
> For my instrumentation I would like to stay close to the source code so that
> I can report with accurate line / column numbers and refer to source level
> language constructs.

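ubsan does this: its checks are inserted at the IR level, yet its reports
still carry source line and column numbers.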

> So is this feasible to do in a Clang plugin, or is rewriting the AST a very
> complex thing to do?

It's just going to be more complicated, I think, than modifying IRGen
for your task. The extra indirection of producing source code (which
you have to get right: finding unique names for any newly introduced
variables, etc.) when all you want is new machine code is probably not
going to do you any favors.
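
As one small illustration of the bookkeeping that source-level rewriting brings with it, the sketch below picks a name for a newly introduced variable that does not collide with identifiers already in use. `fresh_name` and the suffix scheme are hypothetical; a real rewriter would also have to account for scopes, macros, and names declared later in the file, none of which this toy handles.

```cpp
#include <set>
#include <string>

// Return `base` if it is unused, otherwise the first of base_0, base_1, ...
// that does not appear in `used`. (Hypothetical helper for illustration.)
std::string fresh_name(const std::set<std::string> &used,
                       const std::string &base) {
    if (used.count(base) == 0)
        return base;
    for (unsigned n = 0;; ++n) {
        std::string candidate = base + "_" + std::to_string(n);
        if (used.count(candidate) == 0)
            return candidate;
    }
}
```

When instrumentation is emitted as IR instead, this particular problem does not come up, since the inserted values never need a source-level name.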

> Examples of what I would like to do are adding a function call before
> every loop and adding function calls for every array access.
>
>    for(...)
>      ;
>
> becomes:
>
>    here_comes_a_for_loop(...);
>    for(...)
>       ;
>
> and
>
>   a[b[i]]
>
> becomes:
>
>   a[array_index(..., b[array_index(..., i)])]

Sure, these are certainly things that ubsan does already. Well, the
latter certainly, since ubsan does some simple array bounds checking.
(I believe this was already in Clang as -fcatch-undefined-behavior
prior to the recent ubsan efforts.)