Clang Plugin: Is it possible to insert instrumentation code in the Clang AST?

Hi All,
Quick disclaimer: I am a newbie to the Clang project, so this question might seem stupid. It is also similar to the following thread, which didn’t really answer the question and is also a few years old at this point.

I am currently in the process of writing a small clang plugin with the purpose of inserting some instrumentation code into annotated functions. I loosely followed the clang plugin guide and the tutorials on AST Matchers, tools and transformers.

So far, I have managed to make the modifications I want and write them back to the original source file using a Rewriter. However, this is not quite what I want, as the changes are not applied in the compiler invocation in which my plugin is running.

As far as I can tell, there are two ways to achieve what I want: either modifying the AST in my plugin, or using a Rewriter and then reparsing the source code (if possible without actually persisting the changes).

I think the AST is supposed to be pretty much immutable, and all of the discussions I have read so far confirm this, so I think it would not be a good idea to modify the AST myself. However, the thread linked above mentions the TreeTransform class. So far I have not managed to get that to work, and I’m also not sure whether it is possible or wise to do this from a plugin. There is, however, this presentation which uses it. My main problem is that I didn’t really find any documentation on how to use those tools, or on whether this is even possible from a plugin. If anybody thinks this is possible, I would be very grateful for some guidance on how to get it to work.

The other approach is mentioned in this Stack Overflow post: adding the modified sources in the buffer to the remapped files of the PreprocessorOptions. However, the post mentions that this would also not be possible from a plugin.

Does anybody know of any way to get either of these approaches, or something which achieves the same goal, to work using the current plugin system?

Thank you very much!

The answer is, as usual, that it depends on what you want to do in terms of performance, maintainability and project goals. Using the rewriting system will perhaps get you going quicker than anything else.

The Clang AST is meant to be immutable, but it can be mutated if you know your way around it. That approach requires a lot of engineering skill; it is painful, and it is more challenging to maintain and develop. If your project is intended to be long-lived, with fine-grained requirements about what must be done in cases that are subtle to distinguish, the AST approach is the way to go.

You can’t easily modify the AST at arbitrary points, but you can relatively easily get Clang to create new top-level declarations from generated AST nodes that have no source representation. This is pretty much how the C++ template instantiation system works. If you want to insert annotation code in existing systems, it depends on what the code looks like, but generally you can make your plugin “see” the AST before CodeGen sees it and implement the “HandleTopLevelDecl” interface in the plugin, adding a few extra statements where necessary…

Hey, Thanks!

Unfortunately I can’t quite follow…
My current approach would have been (and for source rewriting this also works) to replace the RHS of certain BinaryOperator nodes.

Could you maybe tell me a bit more, or point me towards some documentation on how the approach you sketched would work? In particular, how would I “run” my AST matchers in HandleTopLevelDecl?

Assuming your plugin follows the structure of the PrintFunctionNames demo (clang/examples/PrintFunctionNames/PrintFunctionNames.cpp in llvm/llvm-project at commit 7e2eeb5753dee9719054a0a9c2315a82a2afbf32), you can add your logic in that function…