Custom Clang Code Transformation Addon


I am attempting to write a code transformation tool for clang and was hoping for some help and advice.

The general idea is to take C code marked up with custom pragma statements, similar to OpenMP markup, and extract the basic block immediately following each pragma. A new C++ class that extends an already existing class will then be generated inside the compiler, with the extracted block added as a method of that class, along with some administration code etc.

My current plan for doing this is as follows:

  1. In the preprocessor create a new PragmaHandler that inserts an annotation token marking the start location of the block to be extracted (and a few other tokens for parameters/config etc.)

  2. During parsing and AST construction, extract the block following any marker encountered and generate the new class directly into the AST.

  3. Insert instances of this class globally and then replace the original block with calls to use the new objects.

  4. Emit everything as LLVM bitcode so that I can do further modifications and optimisations in an LLVM pass.
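
To make steps 1 to 3 concrete, here is a rough source-level sketch of the before and after that I have in mind. The names SpecTask, Extracted_0 and the pragma spelling "#pragma spec extract" are all made up for illustration, and the real tool would build the equivalent AST nodes rather than source text. The pragma is shown as a comment so the sketch compiles cleanly with an unmodified compiler.

```cpp
#include <cstddef>

// Hypothetical pre-existing base class that every generated class extends.
class SpecTask {
public:
    virtual ~SpecTask() {}
    virtual void run(int *data, std::size_t n) = 0;
};

// What the user writes: a block marked with the (made-up) custom pragma.
void scale_original(int *data, std::size_t n) {
    // #pragma spec extract
    {
        for (std::size_t i = 0; i < n; ++i)
            data[i] *= 2;
    }
}

// Step 2: the class the tool would conceptually generate. The extracted
// block becomes the body of a method; variables the block uses become
// parameters (assumption: they could equally be captured as members).
class Extracted_0 : public SpecTask {
public:
    void run(int *data, std::size_t n) override {
        for (std::size_t i = 0; i < n; ++i)
            data[i] *= 2;
    }
};

// Step 3: a global instance, and the original block replaced by a call.
static Extracted_0 extracted_0;

void scale_transformed(int *data, std::size_t n) {
    extracted_0.run(data, n);
}
```

Calling scale_original and scale_transformed on copies of the same array should leave both copies identical, which is the property the transformation has to preserve.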

My first question is, does this sound sane?

I have no problems implementing the first stage, but I am running into trouble with the second. Using the existing handling of "#pragma unused" as a base, I have the parser recognising my new pragma tokens and calling my HandlePragma method inside the Parser object. This consumes my custom tokens and calls my ActOnPragma method inside the Sema object. At this stage I am unsure how to proceed. I could attempt to generate my new class inside this action (using CXXRecordDecls, I believe). Alternatively, I could insert another marker here, such as a custom statement, and then generate the new class inside an ASTConsumer or elsewhere. Or should I be using a BuildXXX method to generate my new class, as suggested by the Clang Internals document? I am struggling to find examples of doing so. Do you have any suggestions or advice on this part?

Finally, I am having difficulties with the actual generation of the new class, and I am struggling to find documentation for it (unsurprising, given that the whole C++ side of Clang appears to still be in flux). I am able to create a new CXXRecordDecl object inside my ActOnPragma method and print it out for analysis, so my class exists but has not yet been inserted into the AST. However, when I attempt to make my class extend an already existing base class, or add methods, variables etc. to it, nothing appears to happen. Do you have any advice, documentation or examples for generating additional AST nodes on the fly like this? So far I have been rifling through the doxygen, the source code and whatever else I can find.

Many thanks in advance for your help.

For those interested, I am trying to make an automated loop-level paralleliser that uses speculative execution where necessary. It is based around a simplified version of [1], applied to loops extracted, potentially unsafely, using [2].

Thanks again,


[1] Software thread-level speculation: an optimistic library implementation. C. Oancea and A. Mycroft.
[2] Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping. G. Tournavitis, et al.

It sounds like the closest existing thing in clang to what you are
trying to do is LambdaExpr (which represents C++11 lambda
expressions); reading the relevant code for that would be a decent
place to start.
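
As a rough illustration of why lambdas are a close match: a C++11 lambda is a block of code for which the compiler synthesises an unnamed class with a call operator, which is much the same shape as "extract a block into a method of a generated class". The sketch below shows a lambda next to a hand-written class that does the same job. SumBody and both function names are made up for the example, and the hand-written class is only an approximation of what the compiler generates.

```cpp
#include <vector>

// Lambda form: the compiler synthesises an unnamed closure class whose
// operator() holds the body, and whose members hold the captures.
int sum_with_lambda(const std::vector<int> &v) {
    int total = 0;
    auto body = [&total](int x) { total += x; };
    for (int x : v)
        body(x);
    return total;
}

// Hand-written approximation of that closure class (hypothetical name).
class SumBody {
public:
    explicit SumBody(int &total) : total_(total) {}
    // Like a non-mutable lambda, the call operator is const; it can still
    // update the referenced variable because the capture is a reference.
    void operator()(int x) const { total_ += x; }

private:
    int &total_;  // corresponds to the [&total] capture
};

int sum_with_class(const std::vector<int> &v) {
    int total = 0;
    SumBody body(total);
    for (int x : v)
        body(x);
    return total;
}
```

Both functions should return the same result for any input. The code that builds the closure class and its call operator for a lambda is likely the most useful thing to read when working out how to create a class and attach a method to it.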