Preprocessor lexer sometimes goes too far?

Hello, cfe-dev!

Here's an issue:

I have a program with subclasses of ASTConsumer and PPCallbacks, which
are passed to ParseAST().
See the expansions.c attachment which I pass as input to it.

Sometimes the macro expansion callbacks (and, correspondingly, the
macro expansions themselves) fire well before the code preceding them
has been handed to my ASTConsumer. (See the gdb output below.)

How can this behaviour be explained? Is this some kind of optimization
side effect or something?

My program relies on the expansion order, so this is critical for me:
I need to relate each expanded macro to the declarations it expands
into.

(gdb) i b
Num Type Disp Enb Address What
2 breakpoint keep y 0x0a59b839 in clang::Preprocessor::EnterMacro(clang::Token&, clang::SourceLocation, clang::MacroArgs*) at PPLexerChange.cpp:146
3 breakpoint keep y 0x0a59b19f in clang::Preprocessor::HandleEndOfTokenLexer(clang::Token&) at PPLexerChange.cpp:269
5 breakpoint keep n 0x0a23385c in clang::Parser::ParseDeclGroup(clang::Parser::ParsingDeclSpec&, unsigned int, bool, clang::SourceLocation*)
                                       at ParseDecl.cpp:418

135 main(int argc, char **argv)
(gdb) c
Continuing.
processing: tests/expansions.c
preprocessing: tests/expansions.c
defined macro: LIST_ENTRY
defined macro: DLIST
defined macro: DFUN
expansion: LIST_ENTRY

Breakpoint 2, clang::Preprocessor::EnterMacro (this=0x7f946a00, Tok=@0xcfbc6afc, ILEnd={ID = 209}, Args=0x7d0fe200) at PPLexerChange.cpp:146
146 PushIncludeMacroStack();
(gdb)
Continuing.

Breakpoint 3, clang::Preprocessor::HandleEndOfTokenLexer (this=0x7f946a00, Result=@0xcfbc6afc) at PPLexerChange.cpp:269
269 assert(CurTokenLexer && !CurPPLexer &&
(gdb)
Continuing.
tests/expansions.c:6:8 HandleTagDeclDefinition Record
                field: p type: void * Pointer
                field: d1_next type: struct data1 * Pointer
tests/expansions.c:6:8 HandleTopLevelDecl Record
tests/expansions.c:11:8 HandleTagDeclDefinition Record
                field: data type: int Builtin
expansion: DLIST

Here, the DLIST expansion is reported ahead of HandleTopLevelDecl:
ParseTopLevelDecl() has not returned yet at this point. Why?

Breakpoint 2, clang::Preprocessor::EnterMacro (this=0x7f946a00, Tok=@0xcfbc6afc, ILEnd={ID = 256}, Args=0x7d0fea80) at PPLexerChange.cpp:146
146 PushIncludeMacroStack();
(gdb) c
Continuing.
tests/expansions.c:11:8 HandleTopLevelDecl Record
tests/expansions.c:15:1 <Spelling=<scratch space>:4:1> HandleTopLevelDecl Var
        global:struct data1 a2

Breakpoint 3, clang::Preprocessor::HandleEndOfTokenLexer (this=0x7f946a00, Result=@0xcfbc6afc) at PPLexerChange.cpp:269
269 assert(CurTokenLexer && !CurPPLexer &&
(gdb) c
Continuing.
expansion: DLIST

Breakpoint 2, clang::Preprocessor::EnterMacro (this=0x7f946a00, Tok=@0xcfbc6afc, ILEnd={ID = 269}, Args=0x7d0fe200) at PPLexerChange.cpp:146
146 PushIncludeMacroStack();
(gdb) c
Continuing.
tests/expansions.c:15:1 <Spelling=<scratch space>:4:1> HandleTopLevelDecl Var
        global:struct data2 a1
tests/expansions.c:16:1 <Spelling=<scratch space>:4:1> HandleTopLevelDecl Var
        global:struct data2 a1

Breakpoint 3, clang::Preprocessor::HandleEndOfTokenLexer (this=0x7f946a00, Result=@0xcfbc6afc) at PPLexerChange.cpp:269
269 assert(CurTokenLexer && !CurPPLexer &&
(gdb) c
Continuing.
expansion: DFUN

Breakpoint 2, clang::Preprocessor::EnterMacro (this=0x7f946a00, Tok=@0xcfbc6afc, ILEnd={ID = 280}, Args=0x7d0fea80) at PPLexerChange.cpp:146
146 PushIncludeMacroStack();
(gdb) c
Continuing.
tests/expansions.c:16:1 <Spelling=<scratch space>:4:1> HandleTopLevelDecl Var
        global:struct data1 a2

Here, the second DFUN expansion is reported ahead of the second struct
declaration.

Breakpoint 3, clang::Preprocessor::HandleEndOfTokenLexer (this=0x7f946a00, Result=@0xcfbc6afc) at PPLexerChange.cpp:269
269 assert(CurTokenLexer && !CurPPLexer &&
(gdb) c
Continuing.
expansion: DFUN

Breakpoint 2, clang::Preprocessor::EnterMacro (this=0x7f946a00, Tok=@0xcfbc6afc, ILEnd={ID = 289}, Args=0x7d0fe200) at PPLexerChange.cpp:146
146 PushIncludeMacroStack();

expansions.c (290 Bytes)

> Sometimes I see that the macro expansion callbacks (and, respectively,
> macro expansion actions) are happening way before the respective code
> gets processed. (See gdb output below)
>
> How can this behaviour be explained? Is this some kind of optimization
> side effect or something?

The lexer is going to run ahead of the parser while it's expanding macros or when the parser is performing lookahead.
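
To make the run-ahead concrete, here is a toy model of a parser that always keeps one token of lookahead. It is not Clang code: ToyLexer, ToyParser, runToy and the "MACRO:" token marker are all made up for illustration. The only assumption carried over is that consuming the final ';' of a declaration already lexes the next token, so an expansion hook can fire before the finished declaration is reported.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model only: every name below is hypothetical, none of it is Clang API.
using EventLog = std::vector<std::string>;

static bool isMacroMarker(const std::string& tok) {
    return tok.rfind("MACRO:", 0) == 0;  // true if tok starts with "MACRO:"
}

struct ToyLexer {
    std::vector<std::string> tokens;  // pre-split input
    EventLog& log;                    // shared event log
    std::size_t pos = 0;

    ToyLexer(std::vector<std::string> toks, EventLog& l)
        : tokens(std::move(toks)), log(l) {}

    // Lexing a "MACRO:<name>" token stands in for entering a macro
    // expansion, i.e. the moment a PPCallbacks-style hook would fire.
    std::string lex() {
        if (pos >= tokens.size()) return "<eof>";
        std::string tok = tokens[pos++];
        if (isMacroMarker(tok)) log.push_back("expansion " + tok.substr(6));
        return tok;
    }
};

struct ToyParser {
    ToyLexer& lexer;
    EventLog& log;
    std::string cur;  // current token: already lexed, not yet consumed

    ToyParser(ToyLexer& lx, EventLog& l) : lexer(lx), log(l), cur(lexer.lex()) {}

    void consume() { cur = lexer.lex(); }

    // Parses "<name> ;" and only then reports the declaration.
    void parseTopLevelDecl() {
        if (isMacroMarker(cur)) consume();  // step into the expansion's tokens
        std::string name = cur;
        consume();  // cur is now ';'
        consume();  // eats ';' and lexes the NEXT token: the lexer runs ahead
        log.push_back("decl " + name);  // stands in for HandleTopLevelDecl
    }
};

// Input: two declarations, the second one produced by a macro named DLIST.
EventLog runToy() {
    EventLog log;
    ToyLexer lexer({"a", ";", "MACRO:DLIST", "b", ";"}, log);
    ToyParser parser(lexer, log);
    parser.parseTopLevelDecl();
    parser.parseTopLevelDecl();
    return log;
}
```

Calling runToy() yields the events "expansion DLIST", "decl a", "decl b" in that order: the expansion is logged before the declaration that precedes it in the source, which is the same shape as the gdb transcript above.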

> My program relies on the expansion order so this is critical to me to
> gather relations between expanded macros and the declarations they
> expanded.

Why not go back to look at the source locations of the resulting declarations, then check whether they came from a macro instantiation?

  - Doug
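
One way to read the suggestion above is to key everything on locations rather than on callback order. The sketch below is a self-contained toy, not Clang code: Loc, ToyDecl and ExpansionIndex are hypothetical, and the line/column values in the usage note are invented. In a real tool the fromMacro flag would presumably come from SourceLocation::isMacroID() on the declaration's location, and expansionLoc from SourceManager::getExpansionLoc() (which, as far as I know, was called getInstantiationLoc() in older Clang releases).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these types are Clang's.
using Loc = std::pair<unsigned, unsigned>;  // (line, column) in the main file

struct ToyDecl {
    std::string name;
    bool fromMacro;    // assumed to mirror SourceLocation::isMacroID()
    Loc expansionLoc;  // assumed to mirror the decl's expansion location
};

class ExpansionIndex {
    std::map<Loc, std::string> macroAt;  // expansion location -> macro name

public:
    // Call from the macro-expansion callback, in whatever order it fires.
    void recordExpansion(const std::string& macro, Loc where) {
        macroAt[where] = macro;
    }

    // Call from the AST consumer. Returns the macro whose expansion produced
    // the declaration, or an empty string if there is none.
    std::string macroFor(const ToyDecl& d) const {
        if (!d.fromMacro) return "";
        auto it = macroAt.find(d.expansionLoc);
        return it == macroAt.end() ? "" : it->second;
    }
};
```

For example, after recordExpansion("DFUN", Loc{16u, 1u}) and recordExpansion("DLIST", Loc{15u, 1u}), a declaration with expansionLoc Loc{15u, 1u} maps to "DLIST" no matter which callback arrived first. A lookup like this only works if it runs after the relevant expansion has been recorded, which the run-ahead behaviour makes likely but which a real tool should still check; nested expansions would also need extra care.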