Hacking advice

I need to gather or visit the classes, methods, and functions defined within a given clang compiler instance. Is there a simple interface to accomplish this?

Thanks,
Jeff Kunkel

Hi Jeff,

I'm currently looking at RecursiveASTVisitor.h for this purpose.

Garrison

This depends on what you wish to accomplish. The simplest interface is via libclang, but it doesn't expose all of the details of the AST. Alternatively, you can use the AST visitor or consumer interfaces. The best approach depends on the level of detail that you need.

David

I’ve been studying ASTConsumer and RecursiveASTVisitor. The easiest way to find structs, classes, enums, etc. is clearly “HandleTagDeclDefinition” from ASTConsumer (see the function documentation). However, I still need to be able to gather global functions and functions local to a namespace.

I am looking at “HandleTopLevelDecl” within the ASTConsumer, but it seems to be geared toward data declarations like “int x,y;”. I am not 100 percent sure that it handles function definitions as well.

Second, ASTContext is a very general class, and it is a bear to parse through to find what I need. It seems that I could forgo the methods described above, and I could just use “HandleTranslationUnit” if I wanted to parse through the ASTContext.

RecursiveASTVisitor looks geared toward template declarations and instantiations. It does not seem viable for ordinary function definitions, and it seems ill-suited to handling class, struct, etc. instantiations.

Any other advice or comments?

Thanks,
Jeff Kunkel

The overall goal would be
1: parse the names, return types, and parameters of functions with a given context like a namespace, class, struct or global scope.
2: parse classes, structs, enums, etc for the name, internal objects, members, member methods, virtual overrides, inheritance, etc.
3: The final output would be the boost::python C++ code wrapper with as descriptive an interface as the C++ class itself.

Thanks,
Jeff Kunkel
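
As a rough illustration of step 3 above, the sketch below turns already-collected function descriptions into boost::python registration code. FunctionInfo, signature, qualifiedName, emitDef, and emitModule are hypothetical names invented for this example; only BOOST_PYTHON_MODULE and boost::python::def, which appear inside the generated text, come from Boost.Python. It assumes the AST walk has already filled in the records, and it does not handle overloaded functions, which need an explicit cast when passed to def().

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record of what an AST walk might collect for one free function.
struct FunctionInfo {
    std::string scope;                    // enclosing namespace, empty for global scope
    std::string name;                     // unqualified function name
    std::string returnType;               // spelled as in the source
    std::vector<std::string> paramTypes;  // parameter types, in order
};

// Human-readable signature, e.g. "double area(double, double)".
std::string signature(const FunctionInfo &f) {
    std::string s = f.returnType + " " + f.name + "(";
    for (std::size_t i = 0; i < f.paramTypes.size(); ++i) {
        if (i != 0) s += ", ";
        s += f.paramTypes[i];
    }
    return s + ")";
}

// Qualified C++ name, e.g. "geom::area", or just "reset" at global scope.
std::string qualifiedName(const FunctionInfo &f) {
    assert(!f.name.empty() && "a function must have a name");
    return f.scope.empty() ? f.name : f.scope + "::" + f.name;
}

// One def() line; the signature is reused as the Python docstring.
std::string emitDef(const FunctionInfo &f) {
    std::ostringstream out;
    out << "    boost::python::def(\"" << f.name << "\", &" << qualifiedName(f)
        << ", \"" << signature(f) << "\");";
    return out.str();
}

// A complete BOOST_PYTHON_MODULE block wrapping every collected function.
std::string emitModule(const std::string &moduleName,
                       const std::vector<FunctionInfo> &functions) {
    std::ostringstream out;
    out << "BOOST_PYTHON_MODULE(" << moduleName << ")\n{\n";
    for (const FunctionInfo &f : functions) out << emitDef(f) << "\n";
    out << "}\n";
    return out.str();
}
```

Keeping the collected data in a plain record like this separates the AST walk from the code generation, so either half can be changed (for example, to emit class_<> wrappers for step 2) without touching the other.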

I believe HandleTopLevelDecl, although a little too low-level, works, since everything is a Decl
(see http://clang.llvm.org/doxygen/classclang_1_1Decl.html), and from Decls one can get statements
and therefore expressions (see http://clang.llvm.org/doxygen/classclang_1_1FunctionDecl.html for example).
Having said this, I believe using a visitor is more practical, which makes RecursiveASTVisitor
the more viable option.

Garrison

Again, I believe RecursiveASTVisitor is for templates more than anything else. Look at the methods defined to be overridden:

bool shouldVisitTemplateInstantiations();

bool TraverseDecl(Decl *D);

bool TraverseNestedNameSpecifier(NestedNameSpecifier *NNS);

bool TraverseTemplateName(TemplateName Template);

bool TraverseTemplateArgument(const TemplateArgument &Arg);

bool TraverseTemplateArgumentLoc(const TemplateArgumentLoc &ArgLoc);

bool TraverseTemplateArguments(const TemplateArgument *Args, unsigned NumArgs);

bool TraverseConstructorInitializer(CXXBaseOrMemberInitializer *Init);

Five out of eight deal with templates. The others, like constructor initialization, do not suit my needs, and TraverseDecl can be handled by the ASTConsumer.

The other functions are buried in macros to the point of unreadability. Those macros hang out in “clang/AST/*.def” and were probably created by some little language to cover all the cases.

Jeff

With the caveat that I’m only starting to look at this, and knowing that this approach was
recommended by a trusted source: if you look at the source you will see methods that are
generated by macros, as you yourself pointed out. For example, VisitParmVarDecl
will be called every time the visitor, walking the nodes in the tree and then
up the class hierarchy, encounters a ParmVarDecl (provided you have not already stopped the
traversal or the class walk). More importantly, I believe that in your case you will want to
control when the AST is walked (traversed). Will you, for example, need to be notified about
every node or, more likely, about every Decl subclass? Probably not. As I’ve implied, take this
with a grain of salt, since I have not yet implemented this usage.

Garrison
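
The "walk up the class hierarchy" behaviour described above can be modelled without clang at all. The following is a toy sketch, not clang's code: ToyVisitor, Logger, and StopAtVar are hypothetical names, and the node chain is shortened to Decl, NamedDecl, VarDecl, ParmVarDecl (the real hierarchy has more levels). It assumes the documented RecursiveASTVisitor convention that WalkUpFromFoo first walks up from the parent class and then calls VisitFoo, and that returning false from a Visit method stops the walk.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for part of clang's Decl class hierarchy.
struct Decl { virtual ~Decl() = default; };
struct NamedDecl : Decl { std::string name; };
struct VarDecl : NamedDecl {};
struct ParmVarDecl : VarDecl {};

// Toy CRTP visitor: every Visit* hook defaults to "do nothing, keep going".
template <typename Derived>
class ToyVisitor {
public:
    Derived &self() { return static_cast<Derived &>(*this); }

    bool VisitDecl(Decl *) { return true; }
    bool VisitNamedDecl(NamedDecl *) { return true; }
    bool VisitVarDecl(VarDecl *) { return true; }
    bool VisitParmVarDecl(ParmVarDecl *) { return true; }

    // Each WalkUpFrom* runs the base-class hooks first, then its own hook.
    // The && short-circuits, so the first hook returning false ends the walk.
    bool WalkUpFromDecl(Decl *d) {
        assert(d != nullptr);
        return self().VisitDecl(d);
    }
    bool WalkUpFromNamedDecl(NamedDecl *d) {
        return WalkUpFromDecl(d) && self().VisitNamedDecl(d);
    }
    bool WalkUpFromVarDecl(VarDecl *d) {
        return WalkUpFromNamedDecl(d) && self().VisitVarDecl(d);
    }
    bool WalkUpFromParmVarDecl(ParmVarDecl *d) {
        return WalkUpFromVarDecl(d) && self().VisitParmVarDecl(d);
    }
};

// Overrides three hooks; VisitNamedDecl keeps its silent default.
struct Logger : ToyVisitor<Logger> {
    std::vector<std::string> log;
    bool VisitDecl(Decl *) { log.push_back("VisitDecl"); return true; }
    bool VisitVarDecl(VarDecl *) { log.push_back("VisitVarDecl"); return true; }
    bool VisitParmVarDecl(ParmVarDecl *d) {
        log.push_back("VisitParmVarDecl:" + d->name);
        return true;
    }
};

// Returns false at the VarDecl level, so VisitParmVarDecl is never reached.
struct StopAtVar : ToyVisitor<StopAtVar> {
    std::vector<std::string> log;
    bool VisitDecl(Decl *) { log.push_back("VisitDecl"); return true; }
    bool VisitVarDecl(VarDecl *) { log.push_back("VisitVarDecl"); return false; }
    bool VisitParmVarDecl(ParmVarDecl *) {
        log.push_back("VisitParmVarDecl");
        return true;
    }
};
```

Calling WalkUpFromParmVarDecl on a Logger records the hooks from the most general class to the most specific one, which is why a single ParmVarDecl node can trigger several Visit calls.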

Not true, unless it's changed significantly since I used it. You only
need to override the methods corresponding to the bits of the AST that
you're interested in. You should be able to just ignore templates if
you don't care about them. IMO it's really what you want for walking
the AST.

Reid
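
To make the "only override what you need" point concrete, here is a self-contained toy model of the pattern. It is not clang's implementation: Node, Kind, ToyRecursiveVisitor, FunctionCollector, and makeSampleTree are hypothetical names, and the tree is a hand-built stand-in for an AST. The assumption carried over from RecursiveASTVisitor is only the shape of the interface: a CRTP base that traverses everything and offers default Visit hooks, with the derived class overriding a single hook.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy declaration kinds and tree node (stand-ins for clang's Decl classes).
enum class Kind { Namespace, Record, Function, Var };

struct Node {
    Kind kind;
    std::string name;
    std::vector<std::unique_ptr<Node>> children;

    Node(Kind k, std::string n) : kind(k), name(std::move(n)) {}

    // Append a child and return a non-owning pointer to it.
    Node *add(Kind k, const std::string &n) {
        children.push_back(std::unique_ptr<Node>(new Node(k, n)));
        return children.back().get();
    }
};

// The base class does all the walking; every hook defaults to a no-op.
template <typename Derived>
class ToyRecursiveVisitor {
public:
    bool VisitNamespaceDecl(Node *) { return true; }
    bool VisitRecordDecl(Node *) { return true; }
    bool VisitFunctionDecl(Node *) { return true; }
    bool VisitVarDecl(Node *) { return true; }

    // Pre-order traversal: visit the node, then recurse into its children.
    // A hook returning false aborts the whole traversal.
    bool TraverseDecl(Node *n) {
        assert(n != nullptr);
        Derived &self = static_cast<Derived &>(*this);
        bool keepGoing = true;
        switch (n->kind) {
        case Kind::Namespace: keepGoing = self.VisitNamespaceDecl(n); break;
        case Kind::Record:    keepGoing = self.VisitRecordDecl(n); break;
        case Kind::Function:  keepGoing = self.VisitFunctionDecl(n); break;
        case Kind::Var:       keepGoing = self.VisitVarDecl(n); break;
        }
        if (!keepGoing) return false;
        for (auto &child : n->children)
            if (!TraverseDecl(child.get())) return false;
        return true;
    }
};

// Only functions matter here, so only one hook is overridden.
struct FunctionCollector : ToyRecursiveVisitor<FunctionCollector> {
    std::vector<std::string> functions;
    bool VisitFunctionDecl(Node *n) {
        functions.push_back(n->name);
        return true;
    }
};

// Sample tree: namespace geom { area(); struct Point { x; norm(); }; } reset();
std::unique_ptr<Node> makeSampleTree() {
    std::unique_ptr<Node> root(new Node(Kind::Namespace, "<translation unit>"));
    Node *geom = root->add(Kind::Namespace, "geom");
    geom->add(Kind::Function, "area");
    Node *point = geom->add(Kind::Record, "Point");
    point->add(Kind::Var, "x");
    point->add(Kind::Function, "norm");
    root->add(Kind::Function, "reset");
    return root;
}
```

Running a FunctionCollector over the sample tree collects the function names in traversal order while the namespace, record, and variable nodes pass through the default hooks untouched; nothing template-related has to be mentioned in the derived class.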

Ok, I will take another look at it. On closer inspection, I see more of the interface I skipped before.

So to get the function declarations, I would hook into TraverseDecl. Would I use TraverseDecl for object definitions too? And how do I know what scope I am in?

I am wondering about libclang. Do you have a working example with libclang somewhere I can look at? It does not need to be a complete tutorial, I just need to see the mechanics behind using clang with libclang.

Thanks,
Jeff Kunkel

hi,

The overall goal would be
1: parse the names, return types, and parameters of functions with a given
context like a namespace, class, struct or global scope.
2: parse classes, structs, enums, etc for the name, internal objects,
members, member methods, virtual overrides, inheritance, etc.
3: The final output would be the boost::python C++ code wrapper with as
descriptive an interface as the C++ class itself.

have you looked at the python bindings to libclang and the
cindex-dump.py script?
http://llvm.org/viewvc/llvm-project/cfe/trunk/bindings/python/
http://llvm.org/viewvc/llvm-project/cfe/trunk/bindings/python/examples/cindex/cindex-dump.py?revision=96107&view=markup

I have been wanting to improve those and do something along the same
lines as yours (but for cython code generation)

recently, I came back to this topic (and submitted a few patches)
my work in progress on improving the libclang python bindings can be
found here:
http://bitbucket.org/binet/py-clang

cheers,
sebastien.

Thank you Sebastien,

I have a feeling that the work you have already done will be a great help!

Jeff Kunkel

I just added a few things.
the clang devs did the "rest".

cheers,
sebastien.