Format of AST entries?

I am attempting to do some basic parsing of the AST using C#. So far I’ve had success extracting a few things simply by writing some test C++ code, generating an AST for it, making some changes, regenerating the AST, and observing what changes in the AST entries. However, this empirical approach is pretty brutal and requires a lot of questionable assumptions. Is there any documentation that explicitly explains the possible permutations and field meanings of each AST entry type? If so, I’d greatly appreciate a link to it. I’ve done pretty well figuring out the FunctionDecl item and a few others, but a document describing them would really be helpful.

For example, in the following entry I know that ‘ABCDE’ is the name of the function and ‘float (int, char)’ is its data type, but there are two copies of ‘float (int, char)’, and I’d like to know why and whether they both represent the same thing:

`-DeclRefExpr 0x862c118 col:11 ‘float (int, char)’ lvalue Function 0x85deae0 ‘ABCDE’ ‘float (int, char)’

I have similar questions about most of the entry types.
Thanks,
Ray

Hi Ray,

First off, understand that the AST dumper provides a human-readable view of the AST that may be missing information and can change over time. It’s good for a quick look into Clang’s AST, but there are better options for building tools that work on the AST. (https://clang.llvm.org/docs/Tooling.html)

The introduction to the Clang AST (https://clang.llvm.org/docs/IntroductionToTheClangAST.html) has links to the Doxygen-generated documentation for the AST nodes. Stmt, Decl, and Type are good places to start.

For your specific question, you have a DeclRefExpr and a FunctionDecl.

DeclRefExpr (https://clang.llvm.org/doxygen/classclang_1_1DeclRefExpr.html)
DeclRefExpr is a subclass of Expr, which is a subclass of Stmt.

FunctionDecl (http://clang.llvm.org/doxygen/classclang_1_1FunctionDecl.html)

FunctionDecl->DeclaratorDecl->ValueDecl->NamedDecl->Decl

The AST dumper code is in lib/AST/ASTDumper.cpp. To dump a DeclRefExpr node, it first dumps the Stmt portion, followed by the Expr portion, then the DeclRefExpr portion. Stmt prints “DeclRefExpr 0x862c118 col:11”, which is the specific Stmt kind, the pointer address, and the code location. Expr prints “‘float (int, char)’ lvalue”, which is the type and the value category. DeclRefExpr is pretty much just a holder for a Decl, so it dumps the Decl it refers to. For all Decls, it prints the Decl kind and the pointer address. Since FunctionDecl is a NamedDecl, it prints the name next. And finally, since it is a ValueDecl, it also prints the type.
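To make the field order above concrete, here is a minimal sketch in Python that splits one DeclRefExpr dump line into the portions just described. The function name parse_decl_ref_expr and the field names are made up for illustration, and the pattern is an assumption based only on the line in this thread: the dump format is not a stable interface, so treat this as a way to read the line, not as a recommended way to build tools. It accepts the location with or without angle brackets, normalizes the curly quotes that mail clients substitute, and does not handle types printed with a desugared alias.

```python
import re
from typing import Dict, Optional

# Field layout assumed from the single DeclRefExpr line quoted in this thread.
# The AST dump is meant for humans and may change between Clang versions.
_DECL_REF_EXPR = re.compile(
    r"DeclRefExpr (?P<expr_addr>0x[0-9a-fA-F]+) "          # Stmt: kind and node address
    r"(?P<location><[^>]*>|\S+) "                          # Stmt: source location
    r"'(?P<expr_type>[^']*)' "                             # Expr: type of the expression
    r"(?:(?P<value_kind>lvalue|xvalue|prvalue) )?"         # Expr: value category, if printed
    r"(?P<decl_kind>\w+) (?P<decl_addr>0x[0-9a-fA-F]+) "   # Decl: kind and address
    r"'(?P<decl_name>[^']*)' "                             # NamedDecl: name
    r"'(?P<decl_type>[^']*)'"                              # ValueDecl: declared type
)


def parse_decl_ref_expr(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the fields of a DeclRefExpr dump line, or None if it does not match."""
    # Mail clients often turn ASCII apostrophes into curly quotes; undo that first.
    normalized = line.replace("\u2018", "'").replace("\u2019", "'")
    match = _DECL_REF_EXPR.search(normalized)
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = (
        "`-DeclRefExpr 0x862c118 col:11 'float (int, char)' lvalue "
        "Function 0x85deae0 'ABCDE' 'float (int, char)'"
    )
    fields = parse_decl_ref_expr(sample)
    if fields is not None:
        for name, value in fields.items():
            print(f"{name}: {value}")
```

The two type fields are kept separate on purpose: expr_type comes from the Expr portion and decl_type from the referenced Decl, so a caller can compare them instead of assuming they match.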

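So, the first ‘float (int, char)’ is the type of the DeclRefExpr while the second ‘float (int, char)’ is the type of the FunctionDecl. There can be some conversions between the Decl type and the Expr type, so they may not always be the same. For example, an expression never has reference type, so a DeclRefExpr that refers to a variable declared as ‘int &’ has type ‘int’.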

Hope that helps.