relating preprocessing information to syntax trees

Hello,

I'm working on a small pass that dumps clang ASTs in a Prolog term representation. The format is closely related to ROSE [1] syntax trees (there's a one-to-one correspondence). One of the goals is source-to-source translation, so preprocessing information has to be attached to the declaration that immediately follows it.

I'm thus looking for a proper way to relate preprocessing information such as define or include directives and pragmas to nodes in the syntax tree. A quick hack would be to use the annotated source locations, but I don't particularly like this solution. Any advice would be greatly appreciated. Sorry if this is a trivial question, I'm fairly new to clang.

Best,

Hi Dietmar,

I'm not sure exactly what you are asking for. Instead of a general question, can you ask about a specific feature? For example, macro expansions are tracked explicitly through source locations. One client of this is the diagnostics subsystem. You can see examples that show this working at the bottom of this page:
http://clang.llvm.org/diagnostics.html

-Chris

Hi,

>> I'm thus looking for a proper way to relate preprocessing information
>> such as define or include directives and pragmas to nodes in the syntax
>> tree. A quick hack would be to use the annotated source locations, but I
>> don't particularly like this solution. Any advice would be greatly
>> appreciated. Sorry if this is a trivial question, I'm fairly new to
>> clang.
>
> I'm not sure exactly what you are asking for. Instead of a general
> question, can you ask about a specific feature?

Sure. My pass is implemented as an ASTConsumer that processes the declarations passed to HandleTopLevelDecl(). I'm looking for a way to annotate each declaration with the preprocessor directives located immediately before it in the original source code, e.g., for
     #include "foo.h"
     void foo() {}
I would like to create a term that looks roughly like this:
     function_declaration(
         function_parameter_list(, <...>),
         function_definition(basic_block(), <...>),
         function_declaration_annotation(
             function_type(type_void, <...>),
             foo,
             declaration_modifier(<...>),
   --> preprocessing_info(
   --> [cpreprocessorIncludeDeclaration(
   --> '#include "foo.h"\n',
   --> before,
   --> file_info('foo.c',1,1)
                  )])))

I'm at a point where the AST is available, i.e., the lexer and the parser already did their job. I had a quick look at some examples such as the HTMLRewriter and they seem to instantiate their own lexer and process the token stream. I was wondering if, given an AST, there is a way to query the preprocessor directives already processed so far, e.g., can I find out that there was the include directive for foo.h when processing the declaration for foo()?

Thanks!

One relatively straightforward way to do this is to implement PPCallbacks when parsing the file. This interface gets notified when a #include or #define is seen. It can just push the directives it sees onto a list. When your ASTConsumer sees a decl, it would just associate all directives in the list with the declaration it just saw.
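
The bookkeeping behind this can be sketched without any Clang headers. The following is only a minimal illustration of the "pending list" idea, not Clang API: Directive, AnnotatedDecl, DirectiveCollector, and runExample are hypothetical names made up for this sketch. In a real tool, onDirective() would presumably be called from the PPCallbacks notifications and onTopLevelDecl() from HandleTopLevelDecl(). It also assumes the list should be cleared after each declaration, which the description above leaves implicit.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one recorded directive; not a Clang type.
struct Directive {
    std::string text;  // e.g. "#include \"foo.h\""
    std::string file;  // file the directive appeared in
    unsigned line;     // line of the directive
};

// Hypothetical stand-in for a declaration plus its annotations.
struct AnnotatedDecl {
    std::string name;
    std::vector<Directive> before;  // directives seen since the previous decl
};

class DirectiveCollector {
public:
    // Would be driven by the preprocessor callbacks: remember the directive.
    void onDirective(Directive d) { pending_.push_back(std::move(d)); }

    // Would be driven by HandleTopLevelDecl(): attach everything that is
    // pending to this declaration, then start over with an empty list.
    AnnotatedDecl onTopLevelDecl(std::string name) {
        AnnotatedDecl decl{std::move(name), std::move(pending_)};
        pending_.clear();  // a moved-from vector is valid; make it empty for sure
        return decl;
    }

private:
    std::vector<Directive> pending_;
};

// Mirrors the foo.c example from this thread, plus a second declaration
// (bar, invented here) that has no directive in front of it.
std::vector<AnnotatedDecl> runExample() {
    DirectiveCollector collector;
    collector.onDirective({"#include \"foo.h\"", "foo.c", 1});
    AnnotatedDecl foo = collector.onTopLevelDecl("foo");
    AnnotatedDecl bar = collector.onTopLevelDecl("bar");
    return {foo, bar};
}
```

Here foo ends up carrying the #include and bar carries nothing. One caveat, stated as an assumption rather than a guarantee: the preprocessor may run slightly ahead of the parser, so a directive that follows a declaration could already be on the list when that declaration is reported. Comparing the recorded file and line against the declaration's location would be one way to guard against that.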

-Chris

> One relatively straightforward way to do this is to implement
> PPCallbacks when parsing the file. This interface gets notified when a
> #include or #define is seen. It can just push the directives it sees onto a
> list. When your ASTConsumer sees a decl, it would just associate all
> directives in the list with the declaration it just saw.

I did as you suggested and it seems to work fine.

Thanks!