AST Recursive Visitor - Statements (Stmt *)

Hello Clangers,

I’m new to Clang. I’m writing an ASTConsumer plug-in that visits statement nodes and records their data, with line numbers, in one of my tables. I already have this callback ready: VisitStmt(Stmt *S). My question is: how can I handle if, while, and for loops, and boolean and unary operators, inside this function?

Thanks and Regards.

Hi Ayush,

First, you need to know the classes associated with each of your target AST nodes. These are IfStmt, WhileStmt, ForStmt, BinaryOperator, and UnaryOperator. Each of these is a sub-class of Stmt. IfStmt, WhileStmt, and ForStmt are direct sub-classes, while BinaryOperator and UnaryOperator are sub-classes of Expr, which is a sub-class of ValueStmt, which is a sub-class of Stmt. There are also two other related classes, CXXForRangeStmt and DoStmt, which represent range-based for loops and do/while loops.

Second, pointers can be converted between these classes with the cast<> and dyn_cast<> function templates, and Stmt::getStmtClass() reports the concrete class of a Stmt. They are used as follows:

void VisitStmt(Stmt *S) {
  if (BinaryOperator *BO = dyn_cast<BinaryOperator>(S)) {
    // Process BinaryOperator here
  } else if (UnaryOperator *UO = dyn_cast<UnaryOperator>(S)) {
    // Process UnaryOperator here
  } // other checks here
}

void VisitStmt(Stmt *S) {
  switch (S->getStmtClass()) {
  case Stmt::BinaryOperatorClass: {
    BinaryOperator *BO = cast<BinaryOperator>(S);
    // Process BinaryOperator here
    break;
  }
  case Stmt::UnaryOperatorClass: {
    UnaryOperator *UO = cast<UnaryOperator>(S);
    // Process UnaryOperator here
    break;
  }
  // Other cases here
  default:
    break;
  }
}

The difference between cast and dyn_cast is that cast assumes the pointer already has the target type (it only asserts this in builds with assertions enabled), while dyn_cast checks the target type and returns a null pointer on a type mismatch. A chain of dyn_casts works well when the list of node types is short, while a switch on Stmt::getStmtClass() is the better fit when checking many node types.
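
To make the cast/dyn_cast difference concrete, here is a small standalone model of the idea. It is only a sketch: the Stmt, BinaryOperator, and UnaryOperator types below are toy stand-ins rather than Clang's real classes, the two templates are simplified imitations of llvm::cast and llvm::dyn_cast, and describe() is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for Clang's Stmt hierarchy; NOT the real classes.
struct Stmt {
  enum StmtClass { IfStmtClass, BinaryOperatorClass, UnaryOperatorClass };
  explicit Stmt(StmtClass K) : Kind(K) {}
  StmtClass getStmtClass() const { return Kind; }

private:
  StmtClass Kind;
};

struct BinaryOperator : Stmt {
  BinaryOperator() : Stmt(BinaryOperatorClass) {}
  static bool classof(const Stmt *S) {
    return S->getStmtClass() == BinaryOperatorClass;
  }
};

struct UnaryOperator : Stmt {
  UnaryOperator() : Stmt(UnaryOperatorClass) {}
  static bool classof(const Stmt *S) {
    return S->getStmtClass() == UnaryOperatorClass;
  }
};

// Simplified dyn_cast: checks the kind tag, returns null on a mismatch.
template <typename To> To *dyn_cast(Stmt *S) {
  return To::classof(S) ? static_cast<To *>(S) : nullptr;
}

// Simplified cast: the caller promises the type is right; the promise is
// only verified by an assertion, which disappears when NDEBUG is defined.
template <typename To> To *cast(Stmt *S) {
  assert(To::classof(S) && "cast<>() argument of incompatible type");
  return static_cast<To *>(S);
}

// Hypothetical helper: classify a node with a short dyn_cast chain.
std::string describe(Stmt *S) {
  if (dyn_cast<BinaryOperator>(S))
    return "binary";
  if (dyn_cast<UnaryOperator>(S))
    return "unary";
  return "other";
}
```

With this model, describe() on a BinaryOperator node yields "binary", while dyn_cast<UnaryOperator> on the same node yields a null pointer instead of a wrongly typed one.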

There’s also a third way. Since you are already using a visitor, the visitor will have a visit function for each AST node. Instead of writing just VisitStmt, you will write a VisitBinaryOperator(BinaryOperator *), VisitUnaryOperator(UnaryOperator *), and so on for each one you’re interested in. Hope this is enough to get you started.
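
As a rough illustration of how that per-node dispatch behaves, here is a standalone toy model. It is a sketch under stated assumptions, not Clang code: the node types are stand-ins, ToyVisitor is a hypothetical and heavily simplified imitation of the CRTP pattern RecursiveASTVisitor uses, and Recorder is a made-up user visitor. The model calls VisitStmt for every node and then the Visit method for the node's concrete class, skipping the intermediate levels such as Expr.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for Clang's AST classes; NOT the real ones.
struct Stmt {
  enum StmtClass { IfStmtClass, BinaryOperatorClass, UnaryOperatorClass };
  StmtClass Kind;
  std::vector<Stmt *> Children;
  explicit Stmt(StmtClass K) : Kind(K) {}
};
struct IfStmt : Stmt { IfStmt() : Stmt(IfStmtClass) {} };
struct BinaryOperator : Stmt { BinaryOperator() : Stmt(BinaryOperatorClass) {} };
struct UnaryOperator : Stmt { UnaryOperator() : Stmt(UnaryOperatorClass) {} };

// Hypothetical simplified CRTP visitor: for each node it calls VisitStmt,
// then the Visit method of the concrete class, then recurses into children.
template <typename Derived> struct ToyVisitor {
  Derived &self() { return static_cast<Derived &>(*this); }

  bool TraverseStmt(Stmt *S) {
    if (!S)
      return true;
    if (!self().VisitStmt(S))
      return false;
    bool Continue = true;
    switch (S->Kind) {
    case Stmt::IfStmtClass:
      Continue = self().VisitIfStmt(static_cast<IfStmt *>(S));
      break;
    case Stmt::BinaryOperatorClass:
      Continue = self().VisitBinaryOperator(static_cast<BinaryOperator *>(S));
      break;
    case Stmt::UnaryOperatorClass:
      Continue = self().VisitUnaryOperator(static_cast<UnaryOperator *>(S));
      break;
    }
    if (!Continue)
      return false;
    for (Stmt *C : S->Children)
      if (!TraverseStmt(C))
        return false;
    return true;
  }

  // Defaults do nothing; a derived class shadows only the ones it needs.
  bool VisitStmt(Stmt *) { return true; }
  bool VisitIfStmt(IfStmt *) { return true; }
  bool VisitBinaryOperator(BinaryOperator *) { return true; }
  bool VisitUnaryOperator(UnaryOperator *) { return true; }
};

// Hypothetical user visitor that records the node kinds it cares about.
struct Recorder : ToyVisitor<Recorder> {
  std::string Log;
  bool VisitIfStmt(IfStmt *) { Log += "if "; return true; }
  bool VisitBinaryOperator(BinaryOperator *) { Log += "binop "; return true; }
};
```

Because the base class only ever calls the Visit methods it declares by name, adding an overload with a different parameter type under an existing name would never be reached by this dispatch.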

Adding back the mailing list. Please reply all to keep the discussion on the mailing list.

Thanks Richard for the explanation. Really appreciate it.
One quick question: within VisitStmt(BinaryOperator), how can I get access to the DeclRefExpr?

You should be defining a VisitBinaryOperator(BinaryOperator*) function. VisitStmt(BinaryOperator) won’t be called because the base visitor class doesn’t know about it.

For example,
-IfStmt 0x88b5698 <line:13:3, line:16:12>
  -BinaryOperator 0x88b54a0 <line:13:7, col:13> 'int' '=='
    -ImplicitCastExpr 0x88b5420 <col:7> 'int' <LValueToRValue>
      `-DeclRefExpr 0x88b5358 <col:7> 'int' lvalue ParmVar 0x88604b0 'argc' 'int'

BinaryOperator has two methods, getLHS() and getRHS(), which return the left-hand side and right-hand side expressions. Given your BinaryOperator, getLHS() will return the ImplicitCastExpr. The Expr class has several methods that skip over wrapper nodes in the AST. Expr::IgnoreImpCasts() is probably what you want here*. Then you need to check whether the resulting Expr is a DeclRefExpr and use that.

BinaryOperator *BO = …;
Expr *E = BO->getLHS();
E = E->IgnoreImpCasts();
if (DeclRefExpr *DRE = dyn_cast<DeclRefExpr>(E)) {
  // Do your stuff here.
}

or just:

BinaryOperator *BO = …;
if (DeclRefExpr *DRE = dyn_cast<DeclRefExpr>(BO->getLHS()->IgnoreImpCasts())) {
  // Do your stuff here.
}

* There are several Expr::Ignore* functions that recursively strip away different AST nodes from an Expr. In your example, IgnoreImpCasts will strip away the LValue-to-RValue cast, but if there were something like an integral cast between different int types, that would be stripped away too. If you need more fine-grained control, you’ll need to do the AST traversal yourself.
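
The stripping behaviour can be pictured with a small standalone model. This is only a sketch: the Expr struct is a toy stand-in for Clang's class, and ignoreImpCasts() is a simplified analogue that handles nothing but implicit casts, so it should not be read as the real implementation of Expr::IgnoreImpCasts().

```cpp
#include <cassert>

// Toy stand-in for Clang's Expr; NOT the real class.
struct Expr {
  enum ExprClass { ImplicitCastExprClass, DeclRefExprClass, MemberExprClass };
  ExprClass Kind;
  Expr *SubExpr; // operand of a cast or base of a member access; may be null
};

// Simplified analogue of Expr::IgnoreImpCasts(): skip every implicit cast
// stacked on top of E and return the first node that is not one.
Expr *ignoreImpCasts(Expr *E) {
  while (E && E->Kind == Expr::ImplicitCastExprClass)
    E = E->SubExpr;
  return E;
}
```

Note that the loop stops at the first node that is not an implicit cast. If a MemberExpr sits between the casts and the DeclRefExpr, the result is the MemberExpr, and reaching the DeclRefExpr takes a further step down from there.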

Thanks Richard for the explanation!

-IfStmt 0x78b6d90 <line:82:13, line:89:13>
  -BinaryOperator 0x78b5f08 <line:82:17, col:34> 'int' '=='
    -ImplicitCastExpr 0x78b5eb0 <col:17, col:23> 'int'
      -ImplicitCastExpr 0x78b5e58 <col:17, col:23> 'example_tree':'enum example_tree_type_' <LValueToRValue>
        -MemberExpr 0x78b5d78 <col:17, col:23> 'example_tree':'enum example_tree_type_' lvalue ->bal 0x75a3ab0
          -ImplicitCastExpr 0x78b5d20 <col:17> 'example_tree_node *' <LValueToRValue>
            -DeclRefExpr 0x78b5cb8 <col:17> 'example_tree_node *' lvalue Var 0x78b1d48 'left' 'example_tree_node *'

Is there a way to get access to the MemberExpr and ImplicitCastExpr from VisitDeclRefExpr?

Thanks for the help!


Your visitor should be able to get hold of an ASTContext (the ASTConsumer receives one, for example in HandleTranslationUnit). ASTContext has a getParents function, which may be of some use. Unfortunately, I haven’t used this part of the ASTContext before, so I can’t give any more concrete advice. As you’ve seen, it’s easier to traverse down the AST than up it.

Sure, not a problem.

How can I get the declarations as they are used, such as *ptr and &bar, inside the DeclRefExpr block?

void foo() {
  int bar = 1;
  int **ptr;
  *ptr = &bar; // this line
}

If I query it this way:
if (const ValueDecl *VD = dyn_cast<ValueDecl>(DRE->getDecl())) {
  OS << VD->getType(); // returns the declared type of that declaration, not the type as it was used
}


Thanks and Regards.

The declarations are “ptr” and “bar”. “*ptr” and “&bar” are expressions since * and & are C++ operators. If you want the type of “*ptr” or “&bar”, then you need to get the associated UnaryOperator (a sub-class of Expr) and call getType() on it.
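
The gap between the declared types and the expression types can be checked in plain C++ without any Clang API. The sketch below reuses bar and ptr from the question as globals; the four constant names are hypothetical, and decltype is only standing in for what getType() would report on the declaration versus on the UnaryOperator.

```cpp
#include <cassert>
#include <type_traits>

int bar = 1;
int **ptr = nullptr;

// Types of the declarations themselves.
constexpr bool PtrDeclIsIntPtrPtr = std::is_same_v<decltype(ptr), int **>;
constexpr bool BarDeclIsInt = std::is_same_v<decltype(bar), int>;

// Types of the expressions built with the unary operators. decltype does
// not evaluate *ptr, so the null pointer is never dereferenced. *ptr is an
// lvalue, so decltype yields a reference that is removed here.
constexpr bool DerefIsIntPtr =
    std::is_same_v<std::remove_reference_t<decltype(*ptr)>, int *>;
constexpr bool AddrOfIsIntPtr = std::is_same_v<decltype(&bar), int *>;
```

All four constants are true: the declarations have types int ** and int, while both *ptr and &bar have type int *, which is why the assignment in foo() type-checks.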