Range for CXFile

Hello all,

in the clang-c API, one can find the range for the top-level file of a
translation unit via

auto cursor = clang_getTranslationUnitCursor(unit);
auto range = clang_getCursorExtent(cursor);

Cool. But, now, I have a CXFile representing a file that was included at some
point within the TU. How do I get its range?

Or maybe there is a different way: I want to access all comments in any CXFile
encountered in a translation unit. Currently, we tokenize the top-level
document and then look for the CXToken_Comment tokens and get their spelling
and range.

I cannot find a way to do the same for included files. Help would be
appreciated.

Thanks!

There's a comment introspection API that might be of help: http://clang.llvm.org/doxygen/group__CINDEX__COMMENT.html

It won't work. Think about something like this:

// TODO: rewrite this file

/**
* @file fubar
*/

/**
* this does bar
*/
void bar();

if I understand the docs about the Documentation API correctly, I will only
get access to the very last comment, as only stuff that "represents a
documentable entity" can be passed to clang_Cursor_getParsedComment. Or am I
missing something?

Bye

That might be correct, I didn't look so closely.

Your original question was that your approach doesn't work for included files. If you visit all nodes in the tree, shouldn't you eventually hit the nodes from the included files? In the tool I have that uses libclang, I need to explicitly filter out nodes from included files.

Alternatively you could just parse the included files separately.

> It won't work. Think about something like this:
>
> // TODO: rewrite this file
>
> /**
> * @file fubar
> */
>
> /**
> * this does bar
> */
> void bar();
>
> if I understand the docs about the Documentation API correctly, I will only
> get access to the very last comment, as only stuff that "represents a
> documentable entity" can be passed to clang_Cursor_getParsedComment. Or am I
> missing something?

> That might be correct, I didn't look so closely.
>
> Your original question was that your approach doesn't work for included
> files. If you visit all nodes in the tree, shouldn't you eventually hit
> the nodes from the included files? In the tool I have that uses libclang,
> I need to explicitly filter out nodes from included files.

Yes, for nodes/cursors that is true. But comments don't have a cursor, do
they? So I need to tokenize the include files separately. Which is apparently
not possible currently - if I understand you correctly?

> Alternatively you could just parse the included files separately.

This is too time consuming. We explicitly don't want to run clang N times, if
once would suffice.

Bye

> Yes, for nodes/cursors that is true. But comments don't have a cursor, do
> they? So I need to tokenize the include files separately. Which is apparently
> not possible currently - if I understand you correctly?

Ok, I see the problem now. Yeah, it doesn't look like comments will have a cursor. I really don't know; I'm not so familiar with the libclang tokenize API.

I found this [1], don't know if it's related.

> This is too time consuming. We explicitly don't want to run clang N times, if
> once would suffice.

I see. Sorry, I don't think I can help you.

[1] https://github.com/llvm-mirror/clang/blob/da5bcf310ea7bff05a15b8650034774016bea7a3/tools/libclang/CIndex.cpp#L5350-L5352

It just says that I cannot pass a range that spans multiple files. I don't
want/need that. Rather, I want to find the range for an included file and pass
that to the tokenizer. Then, the CXFile / FileID should be equal for the begin
and end range, and the tokenizer should work - or so I hope. I just have
trouble constructing the range for a given CXFile... :)

Anyhow, maybe someone else has an idea how to do that. Constructing the start
cursor for a file is trivial. Maybe I should try to create a huge end cursor
and see whether the tokenizer can work with that...

Cheers