AST node creation


I am trying to understand ASTs in clang by stepping through the code. I am using a test program that contains a “for” loop.
Can someone suggest good places to put breakpoints that would help me understand that particular part of clang?


Hi Kalyan,

If you want to see where various AST nodes are created, you can try putting breakpoints in many of the ‘Create()’ static member functions that occur in many of the subclasses of Stmt, or search for the ‘new (ASTContext)’ form that is used to create ASTs using the allocator associated with the ASTContext object. For example:

$ grep -r "[)] ForStmt" *
lib/Frontend/PCHReaderStmt.cpp: S = new (Context) ForStmt(Empty);
lib/Sema/SemaStmt.cpp: return Owned(new (Context) ForStmt(First, Second, ConditionVar, Third, Body,

libSema is the semantic analyzer, so that’s where a ForStmt will get created during regular compilation. The AST is defined in libAST, with the important header files being Stmt.h, Decl.h, and their derivatives.
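To make the 'new (Context)' idiom concrete, here is a minimal, self-contained sketch of the pattern. It is not Clang's code: ToyContext, ToyStmt, ToyExpr, ToyForStmt and buildExampleLoop are hypothetical stand-ins, and the real ASTContext allocator and ForStmt constructor are more involved. It only shows how a placement operator new that takes a context object lets every node be allocated from memory the context owns, and how a static Create() function can wrap that expression.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an AST context: it owns all node memory.
class ToyContext {
public:
  void *allocate(std::size_t Bytes) {
    Blocks.push_back(std::make_unique<unsigned char[]>(Bytes));
    return Blocks.back().get();
  }
  std::size_t numAllocations() const { return Blocks.size(); }

private:
  std::vector<std::unique_ptr<unsigned char[]>> Blocks;
};

// Placement form that makes 'new (Ctx) Node(...)' allocate from the context.
void *operator new(std::size_t Bytes, ToyContext &Ctx) {
  return Ctx.allocate(Bytes);
}
// Matching placement delete; only used if a node constructor throws.
void operator delete(void *, ToyContext &) noexcept {}

// Toy node hierarchy. The nodes are trivially destructible, so the context
// can release their memory without running any destructors.
struct ToyStmt {
  enum Kind { ExprKind, ForKind };
  explicit ToyStmt(Kind K) : TheKind(K) {}
  Kind TheKind;
};

struct ToyExpr : ToyStmt {
  explicit ToyExpr(const char *Text) : ToyStmt(ExprKind), Text(Text) {}
  const char *Text; // source text, standing in for a real expression tree
};

struct ToyForStmt : ToyStmt {
  ToyForStmt(ToyStmt *Init, ToyStmt *Cond, ToyStmt *Inc, ToyStmt *Body)
      : ToyStmt(ForKind), Init(Init), Cond(Cond), Inc(Inc), Body(Body) {}

  // A static factory in the style of the 'Create()' functions mentioned above.
  static ToyForStmt *Create(ToyContext &Ctx, ToyStmt *Init, ToyStmt *Cond,
                            ToyStmt *Inc, ToyStmt *Body) {
    assert(Body && "a for statement needs a body");
    return new (Ctx) ToyForStmt(Init, Cond, Inc, Body);
  }

  ToyStmt *Init, *Cond, *Inc, *Body;
};

// Builds the children first, then the parent node that points at them.
ToyForStmt *buildExampleLoop(ToyContext &Ctx) {
  ToyStmt *Init = new (Ctx) ToyExpr("i = 0");
  ToyStmt *Cond = new (Ctx) ToyExpr("i < 16");
  ToyStmt *Inc = new (Ctx) ToyExpr("i++");
  ToyStmt *Body = new (Ctx) ToyExpr("a[i] = b[i] + 5");
  return ToyForStmt::Create(Ctx, Init, Cond, Inc, Body);
}
```

In this sketch a breakpoint on the ToyForStmt constructor or on ToyForStmt::Create would catch every loop node being built. That is the toy equivalent of breaking on the 'new (Context) ForStmt' line in SemaStmt.cpp.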


To see the AST, you can use the -ast-print command line option. You can get a "binary" representation by setting Dump in ASTPrinter. I couldn't figure out how to do this from the command line (is there a way?), so I just hacked ASTPrinter::HandleTranslationUnit to always set Policy.Dump to true.

clang -cc1 ~/t.c -ast-print


Hi Ted and Martin, I tried both ways. Thanks.

Can you or anyone else explain to me, in a short paragraph, what goes on internally when code like the loop below gets compiled (concentrating on AST node creation rather than the Lexer side)? I know which classes the code touches, but I still do not have a good understanding of how Clang goes about creating the nodes. An explanation in terms of the class names and pointer variables that the Clang/LLVM infrastructure uses would be the easiest for me to follow.

for (i = 0; i < 16; i++)
    a[i] = b[i] + 5;

This will help me a lot. Thanks.