Preprocessing and whitespace collapsing

Hi

We're using clang to make an indexer daemon which can be queried from emacs
(or other editors if so desired). In doing this we, for various
reasons, preprocess the file first using internal APIs and then invoke
clang_parseTranslationUnit on the preprocessed content. The code for
this is here:

https://github.com/Andersbakken/rtags/blob/multi-process/src/RTagsClang.cpp

Starting at line 450.

The problem we're facing is that clang seems to collapse things like this:

struct A
{
    void foo();
};

into this:

struct A
{
    void foo();
};

int main()
{
    A a;
    a.foo();
}

I realize the standard (at least according to GCC) permits this and
that it of course would speed up parsing of the resulting code ever so
slightly. My question is, is it possible to:

A) Turn this behavior off with an option?
or
B) Somehow query which lines had what amounts of space removed?

I assume something can be done since if I let
clang_parseTranslationUnit index it without preprocessing myself
everything works as expected.

From (http://gcc.gnu.org/onlinedocs/gcc-3.3/cpp/Preprocessor-Output.html)

The ISO standard specifies that it is implementation defined whether a
preprocessor preserves whitespace between tokens, or replaces it with
e.g. a single space. In GNU CPP, whitespace between tokens is
collapsed to become a single space, with the exception that the first
token on a non-directive line is preceded with sufficient spaces that
it appears in the same column in the preprocessed output that it
appeared in the original source file. This is so the output is easy to
read. See Differences from previous versions. CPP does not insert any
whitespace where there was none in the original source, except where
necessary to prevent an accidental token paste.
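
To make the rule quoted above concrete, here is a toy model of it for a single line: leading indentation is kept so the first token stays in its column, and every later run of blanks becomes one space. This is only an illustration of the documented behavior, not GNU CPP's or clang's code. It works on characters rather than real tokens, copies string and character literals verbatim, drops trailing blanks, and ignores comments and C++14 digit separators. The function name is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Collapses runs of blanks after the first token to a single space,
// leaving the leading indentation and quoted literals untouched.
std::string collapseInteriorWhitespace(const std::string &line)
{
    auto isBlank = [](char c) { return c == ' ' || c == '\t'; };

    // Keep the indentation so the first token stays in its column.
    std::size_t i = 0;
    while (i < line.size() && isBlank(line[i]))
        ++i;
    std::string out = line.substr(0, i);

    bool pendingSpace = false; // a blank run was seen, not yet emitted
    char quote = '\0';         // active quote character, or '\0' if none
    for (; i < line.size(); ++i) {
        const char c = line[i];
        if (quote != '\0') {
            // Inside a literal: copy verbatim, honoring backslash escapes.
            out += c;
            if (c == '\\' && i + 1 < line.size())
                out += line[++i];
            else if (c == quote)
                quote = '\0';
            continue;
        }
        if (isBlank(c)) {
            pendingSpace = true;
            continue;
        }
        if (pendingSpace) {
            out += ' ';
            pendingSpace = false;
        }
        if (c == '"' || c == '\'')
            quote = c;
        out += c;
    }
    return out;
}
```

Note that this model only looks at horizontal whitespace within a line; it says nothing about what happens to blank lines between declarations.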

Thanks and kudos for an awesome project.

regards

Anders

Anders~

YouCompleteMe (the vim plugin) is built with a parser and completion engine server and a client process that sits inside vim. Not sure how far along your implementation is, but you may wish to consider sharing work with it.

https://github.com/Valloric/YouCompleteMe

Matt

A) Turn this behavior off with an option?

Yes, with "-traditional-cpp"

-Argyrios

Awesome. Thanks. I'll try it out.

Anders

We're aware of the project. I think his project focuses mostly on
completion whereas ours focuses mostly on indexing. There's certainly a
lot of duplicated effort between the two, but both are quite mature and
fully functional. I kinda doubt that either project would want to
replace itself with the other one tbh. One could maybe imagine a world
where both projects work for either editor, but our IPC mechanisms are
different enough that I'm not convinced there would be much savings in
terms of workload.

Anders

+val, the author of ycm

Please note that this mode is pretty brittle. -traditional-cpp has a lot of baggage, such as not treating indented “#something” as directives. It’s also not fully compatible with GCC’s -traditional-cpp, and continuing to work on it is not something we’re really interested in since it can slow down regular preprocessing. (I had to fight with Richard Smith to get a compatibility fix in several months ago, and honestly I tend to agree with him.)
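
The indented-directive caveat can be illustrated with two small predicates. This is a sketch of the rule as described in this message (and in GCC's traditional-mode documentation, where the '#' must be in the first column), not clang's actual lexer logic. Both function names are hypothetical, and line continuations, trigraphs and comments before the '#' are ignored.

```cpp
#include <cstddef>
#include <string>

// Standard rule: '#' is a directive introducer when it is the first
// non-whitespace character on the line.
bool isDirectiveStandard(const std::string &line)
{
    const std::size_t pos = line.find_first_not_of(" \t");
    return pos != std::string::npos && line[pos] == '#';
}

// Traditional rule as described above: '#' only counts in column one,
// so an indented "#something" is passed through as ordinary text.
bool isDirectiveTraditional(const std::string &line)
{
    return !line.empty() && line[0] == '#';
}
```

Any line where the two predicates disagree, such as an indented `#define` inside an `#if` block, is one whose meaning could change when -traditional-cpp is turned on.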

I imagine we’d accept patches to split the whitespace-preserving behavior out from the rest of -traditional-cpp as long as it doesn’t slow down the preprocessor.

Jordan

That would be ideal for us. I'll see if I can manage to whip something up.

Anders