[preprocessor] How to customize preprocessor directive handling

I am writing a Clang-based preprocessor tool. Given a source file and the preprocessor state (i.e. input macros and include paths), its goal is to find the set of headers needed to compile that source file successfully. I need this tool to be as fast as possible. (It is similar to distcc's include server.)

I got this working fairly easily using PPCallbacks. However, even though Clang's preprocessor is the fastest I have found, I could use more speed. Since my tool will generally process multiple source files at once, I started building a cache to short-circuit an #include directive when the header has already been processed. So far so good.

The problem is that there is no way to customize how the preprocessor handles an #include directive, so I am unable to use the cached results. On a cache hit I would like to update the preprocessor with the macro definitions from the cache, so that it does not have to lex the entire header again.
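To make the cache idea concrete, here is a minimal sketch in plain C++ that does not use any Clang types. Every name in it (MacroState, ObservedMacro, CachedHeader, IncludeCache, applyHit) is hypothetical and is not taken from my tool or from Clang. It also assumes something not stated above: that an entry records which macros the header looked at, so a hit is only reported when those macros have the same values as when the entry was stored.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of preprocessor macro state: macro name -> replacement text.
using MacroState = std::map<std::string, std::string>;

// Value of one macro the header inspected while it was being processed.
struct ObservedMacro {
    bool defined;      // false means the macro was not defined at that point
    std::string body;  // replacement text; ignored when defined == false
};

// Everything remembered about one processing of one header.
struct CachedHeader {
    std::map<std::string, ObservedMacro> observed;  // macros the result depends on
    std::vector<std::string> headers;               // headers pulled in, transitively
    MacroState defines;                             // macros the header left defined
};

class IncludeCache {
public:
    void store(const std::string& path, CachedHeader entry) {
        assert(!path.empty());
        entries_[path].push_back(std::move(entry));
    }

    // Returns a matching entry or nullptr. The pointer is only valid until the
    // next call to store().
    const CachedHeader* lookup(const std::string& path,
                               const MacroState& current) const {
        auto it = entries_.find(path);
        if (it == entries_.end()) return nullptr;
        for (const CachedHeader& entry : it->second)
            if (matches(entry, current)) return &entry;
        return nullptr;
    }

private:
    // An entry is reusable only if every macro it depended on is unchanged.
    static bool matches(const CachedHeader& entry, const MacroState& current) {
        for (const auto& kv : entry.observed) {
            auto it = current.find(kv.first);
            const bool definedNow = it != current.end();
            if (definedNow != kv.second.defined) return false;
            if (definedNow && it->second != kv.second.body) return false;
        }
        return true;
    }

    std::map<std::string, std::vector<CachedHeader>> entries_;
};

// Replays a cache hit instead of lexing the header: records the headers it
// would have pulled in and applies its macro definitions.
void applyHit(const CachedHeader& hit, MacroState& macros,
              std::set<std::string>& neededHeaders) {
    for (const std::string& header : hit.headers) neededHeaders.insert(header);
    for (const auto& kv : hit.defines) macros[kv.first] = kv.second;
}
```

The sketch only tracks definitions; a real cache would presumably also have to replay #undef directives and anything else the header does to the preprocessor state.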

I tried various approaches and concluded that this currently cannot be done.

I ended up adding another event to clang::PPCallbacks. It is called before PPCallbacks::InclusionDirective and lets me manually set the clang::FileEntry to use when an #include directive is found. This way I can generate a virtual FileEntry with custom contents on a cache hit, or fall back to the default behavior. With this I got everything working, but unfortunately I had to patch Clang.
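The control flow of such a hook can be modeled in plain C++ roughly as follows. This is not the code from my patch and not Clang's API: SourceFile, IncludeResolver and OverrideHook are hypothetical stand-ins, and the "disk" is just an in-memory map. It only illustrates the idea that the client is asked first and the default lookup is used as a fallback.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the file the preprocessor is about to enter.
struct SourceFile {
    std::string name;
    std::string contents;
};

class IncludeResolver {
public:
    // The hook returns true and fills `replacement` to override the include,
    // or returns false to request the default behavior.
    using OverrideHook =
        std::function<bool(const std::string& name, SourceFile& replacement)>;

    void setHook(OverrideHook hook) { hook_ = std::move(hook); }

    // Registers a file in the in-memory model of the file system.
    void addFile(const std::string& name, const std::string& contents) {
        assert(!name.empty());
        disk_[name] = contents;
    }

    // Resolves an #include: the hook is consulted first, then the default
    // lookup. Returns false if neither can supply the file.
    bool resolve(const std::string& name, SourceFile& out) const {
        if (hook_ && hook_(name, out)) return true;  // e.g. a cache hit
        auto it = disk_.find(name);
        if (it == disk_.end()) return false;
        out.name = name;
        out.contents = it->second;
        return true;
    }

private:
    OverrideHook hook_;
    std::map<std::string, std::string> disk_;
};
```

On a cache hit the hook would hand back a small virtual file containing only the cached macro definitions, so the full header is never lexed; on a miss it returns false and the include is processed as usual.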

I would like to ask: is it possible to make Clang support this scenario out of the box? I have attached a patch which does the job for me. I am pretty sure the patch will raise eyebrows among Clang developers; I include it merely as an illustration of what I need.

Thank you in advance for any help on this issue.

ppcallbacks.patch (2.49 KB)