Hi!
I’m building an automated tracing tool with gdb and clang.
I’m using clang, specifically cindex, to figure out what is on any given line.
I’m trying to go through the children of the first cursor on each line, which in most cases are functions, variables, etc.
I’d like to parse these, but I don’t know what each kind of cursor means, and some cursors don’t seem to have a spelling, which makes it a bit harder to wrap my head around.
Here’s an example of what the output looks like (the CursorKind lines are the children, with kind and spelling):
pinaplua/pinaple/paged.c 124 paged_getLast
while (nptr != NULL) {
+ TokenKind.KEYWORD while
CursorKind.UNEXPOSED_EXPR
CursorKind.COMPOUND_STMT
CursorKind.BINARY_OPERATOR
CursorKind.BINARY_OPERATOR
CursorKind.DECL_REF_EXPR nptr
CursorKind.UNEXPOSED_EXPR
CursorKind.DECL_REF_EXPR nptr
CursorKind.DECL_REF_EXPR ptr
CursorKind.UNEXPOSED_EXPR nptr
CursorKind.DECL_REF_EXPR nptr
pinaplua/pinaple/paged.c 125 paged_getLast
ptr = nptr;
+ TokenKind.IDENTIFIER ptr
pinaplua/pinaple/paged.c 126 paged_getLast
nptr = nptr->next;
+ TokenKind.IDENTIFIER nptr
My goal is to figure out parameters of functions and variables so I can ask gdb for their address and build a map of all memory operations.
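A minimal sketch of a dump that keeps the tree structure visible by indenting each child under its parent, which makes the flat listing above easier to read. It only relies on the attributes that clang.cindex cursors expose (kind.name, spelling, get_children()); the function name dump_cursor is made up for illustration.

```python
def dump_cursor(cursor, depth=0, lines=None):
    """Collect one 'KIND spelling' line per cursor, indented by tree depth."""
    if lines is None:
        lines = []
    # Some cursors (e.g. operators, compound statements) have an empty spelling.
    spelling = cursor.spelling or ""
    lines.append(("  " * depth + f"{cursor.kind.name} {spelling}").rstrip())
    for child in cursor.get_children():
        dump_cursor(child, depth + 1, lines)
    return lines
```

With the real bindings this would presumably be called on a cursor obtained from something like clang.cindex.Index.create().parse("paged.c").cursor, and the result printed with print("\n".join(...)).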
I think my approach here should be different: I need to find all the topmost cursors in the tree on the line.
So for an expression like this (I know, horrible code, but it illustrates the point):
int a = 5; int b = a + 3;
the top cursors on the line would be both = operators.
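One way this "topmost cursors on a line" idea could be sketched: walk down from the root and stop at the first cursor on each branch whose extent starts on the target line. This is only a sketch under assumptions: the name topmost_on_line is made up, and it relies on extent.start.line, extent.end.line and get_children() as exposed by clang.cindex cursors.

```python
def topmost_on_line(cursor, line):
    """Return the outermost cursors whose extent starts on `line`."""
    found = []
    for child in cursor.get_children():
        start = child.extent.start.line
        end = child.extent.end.line
        if start == line:
            # Outermost hit on this branch: keep it and do not descend further.
            found.append(child)
        elif start <= line <= end:
            # The line lies inside this cursor, so look at its children.
            found.extend(topmost_on_line(child, line))
    return found
```

On a real translation unit the children also include declarations pulled in from headers, so the result probably needs to be filtered by file (e.g. via location.file) as well. For declarations like the two above, libclang will likely report declaration cursors rather than a cursor for the = itself.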
I haven’t deciphered the kinds of cursors yet, but it seems that the unexposed expression is what points to the actual parameters of the function.
Also, any hint on where I could find more info on cursors would be appreciated. I have not found anything besides the source code and a brief description in the Python API.
@lemonjumps I’m not sure I understand what exactly you want to do.
A function call is identified by CursorKind.CALL_EXPR, and a function definition by CursorKind.FUNCTION_DECL, and these have the respective function name as their spelling. That’s how you find functions.
Parameters passed to a CALL_EXPR can have several different kinds, e.g. INTEGER_LITERAL if you pass an integer as a literal, but they can also be more complicated. E.g. passing an integer variable would result in an UNEXPOSED_EXPR with a child DECL_REF_EXPR, both of which have the name of the variable as their spelling. Not sure how many more cases there are tbh; cindex gives you a view into what the compiler sees, and I’m not that familiar with the compiler’s inner workings myself. As a result, the best way to find out is probably to just write C++ code covering all the cases you care about, and then see what the Python bindings produce in those cases.
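As a rough sketch of the above: collect every CALL_EXPR and look through UNEXPOSED_EXPR wrappers around its arguments to see what is actually being passed. The helper names unwrap and call_arguments are made up; the code assumes the cursor attributes kind.name, spelling, get_children() and get_arguments() from the Python bindings, and it has only been thought through for the simple cases described here.

```python
def unwrap(cursor):
    """Skip through UNEXPOSED_EXPR wrappers that have exactly one child."""
    while cursor.kind.name == "UNEXPOSED_EXPR":
        children = list(cursor.get_children())
        if len(children) != 1:
            break  # not a plain wrapper, leave it as it is
        cursor = children[0]
    return cursor


def call_arguments(cursor, calls=None):
    """Collect (callee name, [(argument kind, argument spelling), ...])."""
    if calls is None:
        calls = []
    if cursor.kind.name == "CALL_EXPR":
        args = [unwrap(arg) for arg in cursor.get_arguments()]
        calls.append((cursor.spelling, [(a.kind.name, a.spelling) for a in args]))
    # Keep walking: calls can be nested inside other calls' arguments.
    for child in cursor.get_children():
        call_arguments(child, calls)
    return calls
```

Arguments that come back as DECL_REF_EXPR are plain variables whose names could then be handed to gdb; anything else (literals, nested calls, operators) would need its own handling.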
If you care about the parameter names in the FUNCTION_DECL, those should have kind PARM_DECL.
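A small sketch of collecting those parameter names, assuming the PARM_DECL cursors show up as direct children of the FUNCTION_DECL. The name function_parameters is made up, and only kind.name, spelling and get_children() are used.

```python
def function_parameters(cursor, out=None):
    """Map each function name to the list of its parameter names."""
    if out is None:
        out = {}
    if cursor.kind.name == "FUNCTION_DECL":
        out[cursor.spelling] = [
            child.spelling
            for child in cursor.get_children()
            if child.kind.name == "PARM_DECL"
        ]
    for child in cursor.get_children():
        function_parameters(child, out)
    return out
```

If a function has both a prototype and a definition, the later one simply overwrites the earlier entry in this sketch.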
On a side note, the Python bindings are not complete, they’re missing some features and may be slightly buggy in other places. So don’t be surprised if you find them to not make sense sometimes… happy to take a look if you find any bugs or similar though.
Thanks for the help! That’s basically the explanation I was looking for.
I’ve figured out that the python interface is old (looks at 8yo todos in the code lmao)
To be honest, while I’d love to help out on reviving the python interface, for now, I gave up, and decided to make my own parser/AST.
I’ve chosen a different approach from what seems to be the standard for compilers, and made it super simple.
So the AST nodes are only things like variable creation/destruction, variable usage, function call, and function/type declaration. Even built-in operators are just function nodes without anything special.
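To illustrate, here is a purely hypothetical sketch of what such a node set might look like; none of these class or field names come from the actual project, they are only meant to show the "operators are just calls" idea.

```python
from dataclasses import dataclass, field


@dataclass
class VarCreate:
    name: str


@dataclass
class VarDestroy:
    name: str


@dataclass
class VarUse:
    name: str


@dataclass
class Call:
    function: str                      # "paged_getLast", but also "=" or "+"
    args: list = field(default_factory=list)


@dataclass
class Declaration:
    name: str
    kind: str                          # "function" or "type"


def variables_used(node):
    """List the variable names touched by a node, in source order."""
    if isinstance(node, VarUse):
        return [node.name]
    if isinstance(node, Call):
        names = []
        for arg in node.args:
            names.extend(variables_used(arg))
        return names
    return []
```

In this scheme ptr = nptr; would become Call("=", [VarUse("ptr"), VarUse("nptr")]), and variables_used on it gives the names to look up in gdb.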
I could yap on about what I’m doing, but i feel like that’s probably beyond the scope of this forum (since it’s not about llvm anymore).
PS: if you know of any place where ppl talk about all kinds of compilers, I’d be happy to join! 
Good luck with that! I’ve been in a similar situation before, but found parsing stuff myself to be too much effort so I’ve tried hard to work around the Python bindings’ limitations lol. And after that started contributing to them, so hopefully they’ll be in a better place a year from now…
regarding PS: can’t really help with that unfortunately… you could join the LLVM Discord specifically, but dunno if that gives you what you want.