Is there a way to write a custom IR code generator as a plugin or something similar?

Hello,

My group is currently working on enhancing the fault-tolerance scheme of Clang’s IR code generator. From what I understand, a typical Clang front-end plugin runs alongside the existing front-end routines. In our case, we need to add additional IR instructions on top of the existing code generator’s output.

Ideally, we would like to perform our modifications as a separate pass after the code generation phase. However, our changes are currently tightly integrated with parts of the existing code generator, and it seems quite difficult to turn them into a separate pass because the locations for the additional instructions are only discovered while code generation is happening. Most of our changes are within the methods of the CodeGenFunction class.

To ensure future maintainability, we want to separate our modifications from the main code generation logic. One possible approach that comes to mind is replacing the original methods of the CodeGenFunction class with our own methods, or even replacing the entire CodeGenFunction class with our custom implementation. However, both approaches involve modifying the mainline source code directly, which amounts to hacking the source.

Considering this, we are wondering if there is a way to utilize Clang’s plugin mechanism to achieve our goal. We are looking for a cleaner and more modular solution that aligns with Clang’s architecture and maintains compatibility with future updates.

Thank you for your assistance.

David

If you need modified versions of individual routines in CodeGenFunction, there isn’t really any way around modifying the code. CodeGen is not designed to provide hooks for individual expressions. And I’m not sure what such an interface would look like if we wanted one; there’s no central place that every expression/decl/etc. goes through.

If you can run your code as a pure LLVM IR to IR transform, you can add that as a plugin, but from your description it sounds like that isn’t sufficient?

Thank you for your prompt response.

In our case, when generating additional IR instructions, it is essential for the code generator to have access to specific expression information, such as variable names, types, and whether the variables are on the left-hand side (LHS) or right-hand side (RHS).

Based on my understanding, during an IR-to-IR transformation, this level of detailed information is not preserved. Therefore, I believe that an IR-to-IR transformation may not be suitable for achieving our objectives in this scenario.

Could there be a better solution than direct hacking on CodeGenFunction source code?

David

No, I don’t have any other ideas.

If you have some suggestion for what such an interface could look like, we could consider adding something.
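
To make the question concrete, one shape such an interface might take is an observer that the code generator notifies after each variable access, passing along the source-level facts mentioned above (name, type, LHS or RHS). The sketch below is a self-contained toy model, not Clang code: `ExprInfo`, `CodeGenHooks`, `ToyCodeGen`, `StoreChecker`, and the pseudo-IR strings are all hypothetical names invented for illustration, and nothing like this hook exists in Clang today.

```cpp
#include <string>
#include <vector>

// Hypothetical: the source-level facts a hook would need to see.
struct ExprInfo {
    std::string name;  // variable name
    std::string type;  // type, spelled as in the pseudo-IR
    bool isLHS;        // true if the variable is being written
};

// Hypothetical hook interface; this is an assumption, not a Clang API.
class CodeGenHooks {
public:
    virtual ~CodeGenHooks() = default;
    // Called right after the generator emits the access to `info`.
    // An implementation may append extra instructions to `out`.
    virtual void afterVarAccess(const ExprInfo &info,
                                std::vector<std::string> &out) = 0;
};

// Toy stand-in for a code generator. It emits strings instead of real IR.
class ToyCodeGen {
public:
    explicit ToyCodeGen(CodeGenHooks *hooks = nullptr) : hooks_(hooks) {}

    // Emits pseudo-IR for the assignment `dst = src`.
    std::vector<std::string> emitAssign(const ExprInfo &dst,
                                        const ExprInfo &src) {
        std::vector<std::string> out;
        out.push_back("load " + src.type + " " + src.name);
        if (hooks_) hooks_->afterVarAccess(src, out);
        out.push_back("store " + dst.type + " " + dst.name);
        if (hooks_) hooks_->afterVarAccess(dst, out);
        return out;
    }

private:
    CodeGenHooks *hooks_;  // not owned; may be null
};

// Example client: appends a made-up "check" instruction after every store.
class StoreChecker : public CodeGenHooks {
public:
    void afterVarAccess(const ExprInfo &info,
                        std::vector<std::string> &out) override {
        if (info.isLHS) out.push_back("check " + info.name);
    }
};
```

In this model the extra instructions live entirely in `StoreChecker`, so the generator itself only gains the two hook calls. The hard part pointed out earlier in the thread remains: the real CodeGen has no single place that every expression passes through, so a real version of this would need a call site in each relevant emit routine.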

Thank you for your opinion.
I’ll think about whether there is an interface that could achieve the goal.

David