Borland-style PCHs in clang

I'm trying to figure out the best way to support Borland-style PCHs in
clang. Suppose we have headers x.h and y.h, and a file x.c that begins
as follows:

    #include "x.h"
    #include "y.h"
    #pragma hdrstop // everything above this line goes into the PCH

In bcc, we would use the command:
    bcc -H -H=x.pch -c x.c

which would have the effect:
    first compile:
        x.pch is created from x.h and y.h
    subsequent compiles:
        x.pch is verified (compile options and defines checked,
        included files and order checked, etc.), then
        parsing begins after #pragma hdrstop
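
To make the "verified" step concrete, here is a minimal sketch of the kind of
comparison it implies. The PchSignature struct and canReusePch function are
hypothetical names for illustration only; they are not taken from bcc or clang,
and whatever the "etc." above covers is deliberately left out.

```cpp
#include <string>
#include <vector>

// Hypothetical record of what a PCH was built with.
struct PchSignature {
  std::vector<std::string> options;   // compile options
  std::vector<std::string> defines;   // command-line defines
  std::vector<std::string> includes;  // included files, in inclusion order
};

// The PCH is reused only if the options, the defines, and the included
// files (compared in order) all match what was recorded at build time.
bool canReusePch(const PchSignature &stored, const PchSignature &current) {
  return stored.options == current.options &&
         stored.defines == current.defines &&
         stored.includes == current.includes;
}
```

Because the include list is compared as a sequence, swapping x.h and y.h
would count as a mismatch, which matches the "order checked" behavior above.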

The easiest way to do this using clang would be to
create and invoke the makefile:
    xtmp.h: x.c x.h y.h
        # header file created from all lines of x.c up to #pragma hdrstop
    xtmp.c: x.c
        # source file created from all lines after #pragma hdrstop
    xtmp.pch: xtmp.h
        clang -cc1 xtmp.h -emit-pch -o xtmp.pch
    x.o: xtmp.pch xtmp.c
        clang -cc1 -include-pch xtmp.pch xtmp.c
But that has problems:
1. The file names in the debug info are wrong (they refer to the generated xtmp files rather than x.c).
2. If file y.c includes x.h but not y.h, y.c can't share the PCH info built into xtmp.pch.
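
For reference, the "created from all lines up to / after #pragma hdrstop"
steps could be sketched roughly as below. This is only an illustration:
splitAtHdrStop, SplitSource and isHdrStop are made-up names, and the matching
is purely line-based, so a pragma inside a comment, inside an #if 0 block, or
split across a line continuation would be handled incorrectly.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// True if `line` is a `#pragma hdrstop` directive; whitespace is allowed
// around '#' and between tokens, and trailing text (a comment) is ignored.
static bool isHdrStop(std::string_view line) {
  auto skipSpace = [&line] {
    while (!line.empty() && (line.front() == ' ' || line.front() == '\t'))
      line.remove_prefix(1);
  };
  auto consume = [&line](std::string_view word) {
    if (line.substr(0, word.size()) != word) return false;
    line.remove_prefix(word.size());
    return true;
  };
  skipSpace();
  if (!consume("#")) return false;
  skipSpace();
  if (!consume("pragma")) return false;
  if (line.empty() || (line.front() != ' ' && line.front() != '\t'))
    return false;
  skipSpace();
  if (!consume("hdrstop")) return false;
  // Reject longer identifiers such as `hdrstopper`.
  if (line.empty()) return true;
  unsigned char c = static_cast<unsigned char>(line.front());
  return !(std::isalnum(c) || c == '_');
}

// Hypothetical result type: text for xtmp.h and text for xtmp.c.
struct SplitSource {
  std::string header;  // all lines before the #pragma hdrstop line
  std::string rest;    // all lines after the #pragma hdrstop line
};

// Splits `source` at the first #pragma hdrstop line; returns nullopt if
// the file contains no such line.
std::optional<SplitSource> splitAtHdrStop(std::string_view source) {
  std::size_t pos = 0;
  while (pos < source.size()) {
    std::size_t eol = source.find('\n', pos);
    bool last = (eol == std::string_view::npos);
    std::size_t lineEnd = last ? source.size() : eol;
    std::size_t next = last ? source.size() : eol + 1;
    if (isHdrStop(source.substr(pos, lineEnd - pos)))
      return SplitSource{std::string(source.substr(0, pos)),
                         std::string(source.substr(next))};
    pos = next;
  }
  return std::nullopt;
}
```

Note that the pragma line itself ends up in neither half, so line numbers in
xtmp.c no longer match x.c, which is part of problem 1.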

To fix #2, we could use the makefile:
    xtmp.c: x.c
        # source file created from all lines after #pragma hdrstop
    x.pch: x.h
        clang -cc1 x.h -emit-pch -o x.pch
    xy.pch: x.pch y.h
        clang -cc1 -include-pch x.pch -chained-pch y.h -emit-pch -o xy.pch
    x.o: xy.pch xtmp.c
        clang -cc1 -include-pch xy.pch xtmp.c
But this doesn't allow for PCH verification of the first part of x.c (up to the
#pragma hdrstop - the part that had been built into xtmp.h).

... and both of these have the problem of requiring us to do some makefile magic on the fly.

What's the best way to add this support to clang?


I think your best bet is to piggy-back on our preamble support. The main difference between that and #pragma hdrstop is that it finds the end of the includes automatically.


Sounds great. What's your "preamble support"?
Maybe you can point me to some code in clang that I can poke around in?


Meh, preamble support is all over the place. I think a key function is Lexer::ComputePreamble, since it's the function whose functionality you have to duplicate and modify for a Borland-style PCH. In general, though, I recommend you do a global case-sensitive search for "Preamble" on the source base and look for interesting names.


> Sounds great. What's your "preamble support"?

It's an optimization that we use when we end up repeatedly parsing the same file (e.g., during code completion), which builds a precompiled header up to (and including) the last preprocessor directive at the top of the main file, e.g., for

  #include "MyClass.h"
  #include <vector>
  #include <map>

  void foo() { }

we would build a "precompiled preamble" containing everything up to and including the #include <map>. When we then use that precompiled preamble, we load the precompiled header and then instruct the lexer to start processing at the end of the preamble.
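
As a rough illustration of "up to and including the last preprocessor
directive at the top of the main file", here is a heavily simplified,
line-based sketch. It is not how Lexer::ComputePreamble is implemented:
simplePreambleEnd is a hypothetical name, and block comments, line
continuations and other lexer-level details are ignored.

```cpp
#include <cstddef>
#include <string_view>

// Returns the byte offset just past the last directive line in the leading
// run of directives. Blank lines and `//` comment lines are skipped; the
// scan stops at the first line that is anything else.
std::size_t simplePreambleEnd(std::string_view source) {
  std::size_t preambleEnd = 0;
  std::size_t pos = 0;
  while (pos < source.size()) {
    std::size_t eol = source.find('\n', pos);
    bool last = (eol == std::string_view::npos);
    std::size_t lineEnd = last ? source.size() : eol;
    std::size_t next = last ? source.size() : eol + 1;
    std::string_view line = source.substr(pos, lineEnd - pos);

    std::size_t first = line.find_first_not_of(" \t\r");
    bool blank = (first == std::string_view::npos);
    if (!blank && line.substr(first, 2) != "//") {
      if (line[first] != '#')
        break;             // first real code: the preamble is over
      preambleEnd = next;  // directive: extend the preamble past it
    }
    pos = next;
  }
  return preambleEnd;
}
```

For the example above, the returned offset points just past the
#include <map> line, so the blank line and void foo() { } are what remain
to be lexed.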

> Maybe you can point me to some code in clang that I can poke around in?

Lexer::ComputePreamble figures out where the preamble ends; it could be taught to recognize #pragma hdrstop.

Preprocessor::setSkipMainFilePreamble() tells the preprocessor to skip some number of bytes at the beginning of the main file, so that it can resume at the proper place after loading a precompiled preamble.
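
The "skip some number of bytes" idea can be pictured with the small sketch
below. It is only a model of the concept, not the Preprocessor API:
resumeAfterPreamble is a hypothetical helper, and the byte-for-byte comparison
against the stored preamble text is an assumption about one simple way to
notice that a preamble no longer matches the main file.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// If `mainFile` still begins with the exact text the precompiled preamble
// was built from, returns the remainder of the file (the part that still
// has to be lexed). Returns nullopt if the preamble text has changed, in
// which case the precompiled preamble must not be used as-is.
std::optional<std::string_view>
resumeAfterPreamble(std::string_view mainFile, std::string_view storedPreamble) {
  // substr clamps the count, so a main file shorter than the stored
  // preamble simply compares unequal.
  if (mainFile.substr(0, storedPreamble.size()) != storedPreamble)
    return std::nullopt;
  return mainFile.substr(storedPreamble.size());
}
```

With a Borland-style PCH, the stored text would be everything up to and
including the #pragma hdrstop line rather than an automatically detected end.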

For most of the rest of the system, a precompiled preamble is just a normal precompiled header. Nothing fancy here.

Setup for the precompiled preamble is a little messy. See lib/Frontend/ASTUnit.cpp, wherever it talks about preambles.

  - Doug

What happens when the PCH preamble is found to be invalid or
out-of-date? Is there any automatic regeneration?