For one thing, I want to perform syntax highlighting on the header - including it would make that vastly more complicated because I'd have to create two translation units per header and track them independently.
I'm not certain why this would be vastly more complicated, and why you would need two translation units per header, etc. I'm not interested in arguing with you on this point; you likely have your own design requirements that I'm not aware of.
I'm not sure what you mean by 'less meaning' either. There is no semantic difference between a header and a one-line source file that just includes that header, as a compilation unit. Neither C, C++, nor Objective-C differentiates between source files and header files; the distinction is purely a programmer convention, as is the file extension given to source files, so libclang should not really be treating them differently. If someone chooses to name their source files .cplusplus or .objectivec then this should work as well, as long as a language is specified (although, going back to my original point, libclang apparently refuses to accept -x flags, and I don't know why).
Sorry David, my point was too terse. I am fully aware that there is no semantic difference from the parser's perspective. My point was rather that a header can take on different semantics depending on the context in which it is used.
For example, consider the header, "iostream":
How do we know how to interpret this header? Since it is included in a C++ translation unit, the compiler interprets the text as C++ code, but in the absence of this context the compiler has no knowledge of why this file should be interpreted in this way. Thus context is critical to establishing the semantics of the header file.
C++ aside, consider a header file that looks like this:
Suppose FOO is defined by the including source file, or by *another header* that includes that header. Sometimes headers aren't fully self-contained; they are intended to be used in the context of other headers. One can argue whether that is good or bad practice, but it does happen. Thus context matters here as well; in this case, it can change what is actually valid source code defined by the header, and what isn't.
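To make this concrete, here is a minimal sketch of a header whose contents depend on whether FOO is defined. Everything except FOO is hypothetical and made up for illustration: the header name widget.h, the widget type, the WIDGET_HAS_EXTRA macro, and the helper function. The header text is pasted inline so that the example fits in one file, which is effectively what the preprocessor does with #include anyway.

```c
/* Hypothetical including file: it defines FOO before the header text is seen.
 * Remove this line and the same header text declares a different type. */
#define FOO 1

/* ---- begin contents of hypothetical "widget.h" ---- */
#ifdef FOO
/* Declaration that exists only when the includer defined FOO. */
typedef struct { int id; int extra; } widget;
#define WIDGET_HAS_EXTRA 1
#else
/* Declaration seen by an includer that did not define FOO. */
typedef struct { int id; } widget;
#define WIDGET_HAS_EXTRA 0
#endif
/* ---- end contents of hypothetical "widget.h" ---- */

/* Reports which variant of the header this translation unit actually saw. */
int widget_has_extra(void) {
    return WIDGET_HAS_EXTRA;
}
```

Parsed on its own, with no including file to define FOO, the header yields only the second declaration, so any tool that analyzes it in isolation sees a different type than the compiler sees in the real project.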
Another example is a header that can be included by either an Objective-C or an Objective-C++ source file. In one case it is Objective-C code, and in the other case it is Objective-C++ code. The difference can really matter in some cases.
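As a rough sketch of how the same header text can resolve differently by language, the fragment below branches on the predefined macros __OBJC__ and __cplusplus, which the compiler sets according to the language of the including translation unit. The HEADER_DIALECT macro and the header_seen_as helper are hypothetical names used only for illustration.

```c
#include <string.h>

/* Header-style fragment: the compiler predefines __OBJC__ for Objective-C
 * and __cplusplus for C++, so both are set for Objective-C++. */
#if defined(__OBJC__) && defined(__cplusplus)
#define HEADER_DIALECT "Objective-C++"
#elif defined(__OBJC__)
#define HEADER_DIALECT "Objective-C"
#elif defined(__cplusplus)
#define HEADER_DIALECT "C++"
#else
#define HEADER_DIALECT "C"
#endif

/* Returns nonzero if this translation unit saw the header as `dialect`. */
int header_seen_as(const char *dialect) {
    return strcmp(HEADER_DIALECT, dialect) == 0;
}
```

Compiled as plain C, header_seen_as("C") is nonzero. The same text included from a .mm file would take the first branch instead, which is why the header cannot be classified without knowing who includes it.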
Ultimately, no source file, whether it is a header or a vanilla .c file, has any intrinsic semantics until it is parsed, in its full proper context, by the compiler. That context includes all the -I and -D flags, etc. Since headers are inherently tied to the preprocessor, they can be used in all sorts of ways, and so ultimately their semantics are determined by the file that includes the header. That said, you can often skirt the issue with approximations, but if you want to replicate the semantics of a header as the developer actually sees it in their project, the header needs to be analyzed in the context in which it is actually used.