Proposal for a code gen plugin interface

Hi list,

We would like to propose adding a plugin interface to clang's CodeGen library.

This interface would allow, e.g., attaching custom metadata from clang to the
generated LLVM IR, as well as limited modifications of the generated bitcode.

We have already implemented the interface described at the following link and
are using it successfully in our own project (http://llbmc.org).

http://baldur.iti.kit.edu/~fmerz/codegen/proposal.html

We expect that other projects might also profit from this interface and thus
suggest including it in clang. If there is interest, we'll prepare a clean
patch (currently we have a patch for 3.0) and send it to the list for
review.

Regards,
Florian

LLBMC looks very similar to AddressSanitizer, at least as far as the memory checking goes.

Do you know of ASan, and if so, what are the differences?

– Matthieu

Have you measured the performance impact of your proposed patch in the
simple cases of (1) no plugin being registered or (2) a no-op plugin being
registered? The On{Start,Finish}{Function,Module} functions don't worry
me, but I'm concerned about the plugin being invoked for every single
instruction insertion.
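
To make the two cases concrete, here is a minimal, self-contained sketch of
the dispatch pattern in question. It is not the proposed patch and uses no
clang or LLVM classes: Instruction, Emitter, CountingPlugin and the hook name
OnInstructionInserted are all hypothetical; only the
On{Start,Finish}{Function,Module} names come from this thread. With this
shape, case (1) costs a null check per inserted instruction and case (2)
costs a virtual call per inserted instruction; whether either is measurable
in real IR generation would still have to be benchmarked.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR instruction; not an LLVM class.
struct Instruction { std::string opcode; };

// Hypothetical plugin interface. OnInstructionInserted is an assumed name
// for the per-instruction hook; every default implementation is a no-op.
class CodeGenPlugin {
public:
  virtual ~CodeGenPlugin() = default;
  virtual void OnStartModule() {}
  virtual void OnFinishModule() {}
  virtual void OnStartFunction(const std::string &) {}
  virtual void OnFinishFunction(const std::string &) {}
  virtual void OnInstructionInserted(Instruction &) {}
};

// Plugin that only counts how often each kind of hook fires.
class CountingPlugin : public CodeGenPlugin {
public:
  std::size_t coarse = 0;         // module- and function-level callbacks
  std::size_t perInstruction = 0; // per-instruction callbacks
  void OnStartModule() override { ++coarse; }
  void OnFinishModule() override { ++coarse; }
  void OnStartFunction(const std::string &) override { ++coarse; }
  void OnFinishFunction(const std::string &) override { ++coarse; }
  void OnInstructionInserted(Instruction &) override { ++perInstruction; }
};

// Toy emitter: every insertion goes through insert(), which is where the
// per-instruction hook would sit.
class Emitter {
public:
  explicit Emitter(CodeGenPlugin *plugin = nullptr) : plugin_(plugin) {}

  void emitModule(std::size_t functions, std::size_t instsPerFunction) {
    if (plugin_) plugin_->OnStartModule();
    for (std::size_t f = 0; f < functions; ++f) {
      const std::string name = "f" + std::to_string(f);
      if (plugin_) plugin_->OnStartFunction(name);
      for (std::size_t i = 0; i < instsPerFunction; ++i)
        insert(Instruction{"load"});
      if (plugin_) plugin_->OnFinishFunction(name);
    }
    if (plugin_) plugin_->OnFinishModule();
  }

  std::size_t emitted() const { return insts_.size(); }

private:
  void insert(Instruction inst) {
    insts_.push_back(std::move(inst));
    // Case (1), no plugin: only this null check runs.
    // Case (2), no-op plugin: one virtual call per instruction.
    if (plugin_) plugin_->OnInstructionInserted(insts_.back());
  }

  CodeGenPlugin *plugin_;
  std::vector<Instruction> insts_;
};

// Drives the toy emitter once and returns the per-instruction callback count.
std::size_t runSketch(std::size_t functions, std::size_t instsPerFunction) {
  CountingPlugin plugin;
  Emitter emitter(&plugin);
  emitter.emitModule(functions, instsPerFunction);
  // Two module-level callbacks plus two per function.
  assert(plugin.coarse == 2 + 2 * functions);
  return plugin.perInstruction;
}
```

In this toy setup the coarse hooks fire a handful of times per module, while
the per-instruction hook fires once for every inserted instruction, which is
why the latter is the one worth measuring.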

More fundamentally, however, I'm concerned that customizing IR
generation might be an inherently invasive task ill-suited to being
implemented in a plugin, outside of very simple goals like adding
metadata to declarations. For example, your project is probably
customizing loads and stores based on semantic information about
the operation being emitted, but because it's a plugin, it has no idea
why the load or store is being performed, just that it's happening during
the emission of a specific statement.

Hooking instruction insertion seems like something you did because it
was easy to do rather than because it was a good solution.

I would be amenable to a proposal which was focused on making it easy
to do common things like adding metadata to an llvm::GlobalValue based
on the declaration it came from. Otherwise, I would prefer to make
IR-generation easier to directly modify and extend than to support
increasingly baroque methods of external customization, particularly
since every week there seems to be a new project doing memory-safety
instrumentation and they all want to hack IR-gen in their own way.

John.