How to get AST content from a CXCursor?

Hi,

I’m traversing an AST using a CXCursorVisitor. How can I get the node content from the cursor? For example, for a CXCursor_BinaryOperator cursor, I’d like to get the binary operator itself. I guess the information is stored in the CXCursor struct, but I haven’t found any examples of how to use it. Any help is appreciated.

I'm not sure if there's a better way, but you can get the tokens of the expression. For example, assuming the current cursor is the binary operator:

1. Get the translation unit using clang_Cursor_getTranslationUnit
2. Get the source extent of the cursor using clang_getCursorExtent
3. Tokenize the source extent using clang_tokenize; this will give you an array of tokens
4. For each token:
   1. Get the token kind using clang_getTokenKind
   2. Get the token spelling using clang_getTokenSpelling
5. Find the token with the kind CXToken_Punctuation
6. The spelling for this token will contain the binary operator as a string

I’ve found some functions declared in tools/libclang/CXCursor.h:

std::pair<OverloadedDeclRefStorage, SourceLocation>
getCursorOverloadedDeclRef(CXCursor C);

const Decl *getCursorDecl(CXCursor Cursor);
const Expr *getCursorExpr(CXCursor Cursor);
const Stmt *getCursorStmt(CXCursor Cursor);
const Attr *getCursorAttr(CXCursor Cursor);
const Decl *getCursorParentDecl(CXCursor Cursor);

ASTContext &getCursorContext(CXCursor Cursor);
ASTUnit *getCursorASTUnit(CXCursor Cursor);
CXTranslationUnit getCursorTU(CXCursor Cursor);

void getOverriddenCursors(CXCursor cursor,
                          SmallVectorImpl<CXCursor> &overridden);

Some of them are implemented as follows:

const Decl *cxcursor::getCursorDecl(CXCursor Cursor) {
  return static_cast<const Decl *>(Cursor.data[0]);
}

const Stmt *cxcursor::getCursorStmt(CXCursor Cursor) {
  if (Cursor.kind == CXCursor_ObjCSuperClassRef ||
      Cursor.kind == CXCursor_ObjCProtocolRef ||
      Cursor.kind == CXCursor_ObjCClassRef)
    return nullptr;

  return static_cast<const Stmt *>(Cursor.data[1]);
}

const Attr *cxcursor::getCursorAttr(CXCursor Cursor) {
  return static_cast<const Attr *>(Cursor.data[1]);
}

const Decl *cxcursor::getCursorParentDecl(CXCursor Cursor) {
  return static_cast<const Decl *>(Cursor.data[0]);
}

But they are not exported by libclang. Can I use them in my code, and if so, how? Can anyone give me hints on when to use data[i], where i = 0, 1, 2? If it works, this would be the best way for me to get the contents of a node and do source-to-source translation.

You're looking at the implementation of libclang, which is written in C++. These functions are internal and will not be exposed. You should treat CXCursor mostly as an opaque data structure. It's fine to read the "kind" field, but you should probably avoid the other two fields ("xdata" and "data").

That’s what I guessed. But my goal is source-to-source translation. I’m not sure whether I should use libclang with the method you suggested, or use libTooling to access the AST nodes directly. To me, the second approach seems to make it easier to get the content of a node.

Any suggestions? Which would you prefer if you were translating the source to another language?

I have a tool that converts C and Objective-C headers to D modules [1] using libclang. It's written in D, so libclang is the best choice for me. I suggest you have a look at this page [2], which compares the different ways of using Clang when building tools.

[1] http://github.com/jacob-carlborg/dstep
[2] https://clang.llvm.org/docs/Tooling.html

I would also like to add that if you feel there's something missing in libclang, you can always contribute to add the missing functionality.

My first choice after reading the document was libclang, but once I realized its limitation, namely that it cannot easily get the content of a node, my preference shifted to libTooling.