Preprocessor/Parser interaction

For my project (related to source code analysis of C++ programs), I need to traverse the clang AST to find all the places
where preprocessor macros are used in the program.

I am fairly new to clang to be honest.

1. What I want to know is whether the preprocessor gets called before the parser, or whether it gets called from the parser as each token is seen by it.
2. How can I obtain the AST of a C++ program that has been generated without any preprocessing?
Even if this produces some syntax errors, that is okay.

Thanks.

1. What I want to know is whether the preprocessor gets called before the
parser, or whether it gets called from the parser as each token is seen by it.

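The Parser library calls into the Lex library, which turns the input
file into a stream of tokens. In other words, preprocessing happens on
demand as the parser asks for each token, not as a separate earlier pass.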
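
To make the "parser calls into the lexer" answer concrete, here is a minimal toy sketch of a pull-based design. It is not Clang code: ToyLexer, ToyParser, lexedSoFar and demoLazyLexing are hypothetical names invented for this illustration, and tokens are simply whitespace-separated words. It only shows that the lexer does work when the parser asks for a token, and not before.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the Lex library (hypothetical, not Clang's API):
// produces one token per call, only when asked.
class ToyLexer {
public:
    explicit ToyLexer(std::string text) : text_(std::move(text)) {}

    // Returns the next whitespace-separated token, or "" at end of input.
    std::string next() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_])))
            ++pos_;
        std::size_t start = pos_;
        while (pos_ < text_.size() &&
               !std::isspace(static_cast<unsigned char>(text_[pos_])))
            ++pos_;
        if (start == pos_) return "";
        ++lexed_;
        return text_.substr(start, pos_ - start);
    }

    // Number of tokens produced so far.
    std::size_t lexedSoFar() const { return lexed_; }

private:
    std::string text_;
    std::size_t pos_ = 0;
    std::size_t lexed_ = 0;
};

// Toy stand-in for the Parser library: pulls tokens as it needs them.
class ToyParser {
public:
    explicit ToyParser(ToyLexer &lexer) : lexer_(lexer) {}

    // Consumes tokens up to and including the next ";" and returns them.
    std::vector<std::string> parseStatement() {
        std::vector<std::string> stmt;
        for (std::string tok = lexer_.next(); !tok.empty(); tok = lexer_.next()) {
            stmt.push_back(tok);
            if (tok == ";") break;
        }
        return stmt;
    }

private:
    ToyLexer &lexer_;
};

// After parsing one statement, only that statement's tokens have been lexed.
void demoLazyLexing() {
    ToyLexer lexer("int x ; int y ;");
    ToyParser parser(lexer);
    std::vector<std::string> first = parser.parseStatement();
    assert(first.size() == 3);
    assert(lexer.lexedSoFar() == 3);
    std::cout << "parsed " << first.size() << " tokens, lexed "
              << lexer.lexedSoFar() << "\n";
}
```

Calling demoLazyLexing() from a main function exercises the sketch. The second statement is not tokenized until the parser asks for it, which is the same shape as the Parser/Lex relationship described above.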

2. How can I obtain the AST of a C++ program that has been generated
without any preprocessing?
Even if this produces some syntax errors, that is okay.

This won't work; don't try it. C++ parsing is way too complicated for
that sort of substitution to result in anything sane.

Depending on what you are doing, you might be able to use source
locations instead; clang very precisely tracks where every expression
is written in the source code, including complete information about
macro expansions.

-Eli

If you only need information about where macros are used then there’s no need to traverse the AST at all. The PPCallbacks mechanism allows clients to observe the interesting aspects of preprocessing. In particular the MacroExpands callback is invoked every time a macro is used.

Jason
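
The callback idea Jason describes can be illustrated with a toy observer, sketched below. This is deliberately not Clang's API: ToyCallbacks, ExpansionRecorder, ToyPreprocessor and demoMacroObserver are hypothetical names, the toy handles only object-like macros, and tokens are separated by single spaces. The real PPCallbacks::MacroExpands signature differs between Clang versions, so check the headers of the version you build against.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical observer interface, loosely modeled on the idea behind
// PPCallbacks. This is an illustration only, not Clang's API.
struct ToyCallbacks {
    virtual ~ToyCallbacks() = default;
    virtual void macroExpands(const std::string &name,
                              unsigned line, unsigned column) = 0;
};

// Records every macro use it is told about.
struct ExpansionRecorder : ToyCallbacks {
    struct Use { std::string name; unsigned line; unsigned column; };
    std::vector<Use> uses;
    void macroExpands(const std::string &name,
                      unsigned line, unsigned column) override {
        uses.push_back({name, line, column});
    }
};

// Toy preprocessor: object-like macros only, space-separated tokens.
class ToyPreprocessor {
public:
    void define(const std::string &name, const std::string &body) {
        macros_[name] = body;
    }
    void addCallbacks(ToyCallbacks *cb) { callbacks_.push_back(cb); }

    // Expands macros in `source`, notifying observers at each use site.
    // Lines and columns are 1-based.
    std::string run(const std::string &source) {
        std::istringstream lines(source);
        std::string lineText, out;
        unsigned lineNo = 0;
        while (std::getline(lines, lineText)) {
            ++lineNo;
            std::size_t pos = 0;
            while (pos < lineText.size()) {
                if (lineText[pos] == ' ') { ++pos; continue; }
                std::size_t end = lineText.find(' ', pos);
                if (end == std::string::npos) end = lineText.size();
                std::string tok = lineText.substr(pos, end - pos);
                auto it = macros_.find(tok);
                if (it != macros_.end()) {
                    for (ToyCallbacks *cb : callbacks_)
                        cb->macroExpands(tok, lineNo,
                                         static_cast<unsigned>(pos) + 1);
                    tok = it->second;  // substitute the macro body
                }
                if (!out.empty() && out.back() != '\n') out += ' ';
                out += tok;
                pos = end;
            }
            out += '\n';
        }
        return out;
    }

private:
    std::map<std::string, std::string> macros_;
    std::vector<ToyCallbacks *> callbacks_;
};

// Prints each recorded use as name@line:column.
void demoMacroObserver() {
    ToyPreprocessor pp;
    ExpansionRecorder rec;
    pp.define("MAX", "100");
    pp.addCallbacks(&rec);
    pp.run("int a = MAX ;\nint b = MAX + MAX ;");
    assert(rec.uses.size() == 3);
    for (const auto &u : rec.uses)
        std::cout << u.name << "@" << u.line << ":" << u.column << "\n";
}
```

The point of the design is that the client never walks an AST: it registers an observer, and the preprocessor reports each use site as it encounters it.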

I want to use a PPCallbacks method (MacroExpands) to track all the locations where
a macro has been invoked in the program.
In the main function there are a few things I want to implement; the details are
given as a comment inside main.

Here is what I have so far:

namespace clang {

class MyASTAction : public FrontendAction
{
public:
  ASTConsumer *CreateASTConsumer(CompilerInstance &CI,
                                 llvm::StringRef InFile)
  { return 0; }

  void ExecuteAction() { }

  bool usesPreprocessorOnly() const
  { return false; }
};

class TrackMacro : public PPCallbacks
{
public:
  void MacroExpands(const Token &MacroNameTok, const MacroInfo* MI,
                    SourceRange Range)
  {
    std::cout << "Macro expands to" << MacroNameTok.getRawIdentifierData();
  }

  ~TrackMacro() {}
};

}

using namespace clang;
int main(int argc, const char** argv)
{
  MyASTAction action;
  CompilerInstance Clang;

  /*****************************************

  How to create an instance of the preprocessor class? It seems like
  it takes too many (9) arguments, which I was not able to figure out.
  Or is there a way to just create a default preprocessor and add my
  own callback function to it, and then pass the preprocessor object
  to the Parser and invoke the parser just by supplying a filename?

  *****************************************/
  TrackMacro track_macro;
  PP.addPPCallbacks(&track_macro);

  Clang.setPreprocessor(PP);

  MyASTAction* Act = new MyASTAction();
  Clang.ExecuteAction(*Act);

  return 0;
}

You don't create the preprocessor instance yourself; you grab it using
getCompilerInstance().getPreprocessor() in your FrontendAction.

-Eli

There’s a huge amount of boilerplate needed to do what you want, and it will still be missing things like #include paths which will render your program unusable for real code.

The dependency graph of objects that need to be created in order to get a Preprocessor is gigantic. I manually tracked it here as I was exploring the APIs myself: http://web.ics.purdue.edu/~silvas/deps.svg . CompilerInstance reduces the amount of boilerplate needed, but there are still crucial things missing for it to be “useful” (see: https://github.com/loarabia/Clang-tutorial/blob/master/CItutorial3.cpp and try including stdio.h in test.c).

A couple of weeks ago I went down the path that you are trying to go down, and I can tell you that you are not going to achieve what you want by doing it like this (i.e., as a standalone program using the Clang libs). There are just too many little things that need to be configured in specific ways for the compiler to work “as usual, except also run my (PPCallbacks|ASTConsumer|…)”. My recommendation is to write a Clang plugin, so that you can forget about all those things. There is an example plugin called PrintFunctionNames in examples/PrintFunctionNames that you can look at (although TBH I found it extremely unhelpful for some reason). If you want, I can post on github a minimal “hello world” example of how to run a PPCallbacks.

–Sean Silva

Here is a simple PPCallbacks example:
https://github.com/chisophugis/clang_plugin_example

Let me know if you have any problems building.

–Sean Silva