Getting preprocessed text between two source locations.

Dear cfe list,

I need to get the preprocessed text between two source locations so that I
can lex it again and check code properties (e.g., whether a const int
declaration was actually written as `int const' or `const int').

This was my idea:

1- Using the source manager, I get the spelling locations of the begin
and end source locations.

2- I get the character data for both spelling locations.

3- With begin and end being the two resulting pointers, I have the text
in the [begin, end[ range.

But it does not work. In the presence of macros the result seems
unpredictable, and sometimes I even get segfaults.
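
A likely reason for the failure, sketched here with a toy model rather than the real Clang API: the spelling location of a token that comes from a macro can live in a different memory buffer than the surrounding file text, so the two pointers need not delimit one contiguous range. The `Buffer` type and the `textBetween` helper below are hypothetical names used only for illustration; the sketch refuses to build a string unless both pointers fall inside the same buffer and are correctly ordered.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for one memory buffer owned by a source manager.
struct Buffer {
    const char *start;
    std::size_t size;

    // True if p points into the buffer (or one past its end).
    bool contains(const char *p) const {
        // std::less_equal yields a total order even for unrelated pointers.
        std::less_equal<const char *> le;
        return le(start, p) && le(p, start + size);
    }
};

// Copies the text in [begin, end) into out and returns true, but only when
// both pointers lie inside buf and begin does not come after end.
// Otherwise returns false without reading any memory.
bool textBetween(const Buffer &buf, const char *begin, const char *end,
                 std::string &out) {
    if (!buf.contains(begin) || !buf.contains(end))
        return false;  // pointers come from different buffers
    if (std::less<const char *>()(end, begin))
        return false;  // reversed range
    out.assign(begin, end);
    return true;
}
```

With a guard of this kind, a range whose end pointer belongs to a macro definition's buffer is rejected instead of being read blindly, which is the situation where the naive approach can crash.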


I am really stuck on this issue. Is there really no way to get the
unadorned text between two source locations?


Take a look at the token rewriter: lib/Rewrite/TokenRewriter.cpp


Here is some code that may help you:

ASTContext* ctx = ...;
// pp is the Preprocessor and out is an output stream, both set up elsewhere.

const LangOptions &LangOpts = pp.getLangOptions();
SourceManager &SM = ctx->getSourceManager();
TokenRewriter Rewriter(SM.getMainFileID(), SM, LangOpts);

// Print the spelling of every token in the main file.
for (TokenRewriter::token_iterator I = Rewriter.token_begin(), E = Rewriter.token_end(); I != E; ++I)
    out << pp.getSpelling(*I);

This piece of code returns the text of the whole source file, but you can easily check whether the location of the current token is inside or outside the region you are interested in.

Just rewrite the loop this way:
for (TokenRewriter::token_iterator I = Rewriter.token_begin(), E = Rewriter.token_end(); I != E; ++I)
    // Replace this condition with whatever check you need.
    if (ctx->getSourceManager().getInstantiationLineNumber(I->getLocation()) > location)
        out << pp.getSpelling(*I);

With that it should work; it works for me! But beware that the TokenRewriter prints the source file in its original form: if you modify the AST, the TokenRewriter will still show the original source. To print the modified AST you have to use the printPretty() method of the Stmt class.
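
The filtering idea above can also be sketched in a self-contained way, without the Clang classes. This is only a toy model: `Tok` and `spellingsInRange` are hypothetical names, and a plain byte offset stands in for a real source location. It keeps the spellings of the tokens that start inside a half-open range, which is enough to tell `int const' from `const int'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token: its spelling and its offset in the file.
struct Tok {
    std::size_t offset;
    std::string spelling;
};

// Joins, with single spaces, the spellings of all tokens whose offset
// lies in the half-open range [begin, end).
std::string spellingsInRange(const std::vector<Tok> &toks,
                             std::size_t begin, std::size_t end) {
    std::string out;
    for (const Tok &t : toks) {
        if (t.offset < begin || t.offset >= end)
            continue;  // token starts outside the region of interest
        if (!out.empty())
            out += ' ';
        out += t.spelling;
    }
    return out;
}
```

In the real code the comparison would be made on source locations from the source manager rather than on raw offsets; the offsets here are an assumption made to keep the example standalone.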

cheers, Simone

Paolo Bolzoni wrote: