querying information about preprocessing directives in libclang

libclang defines cursor kinds for various preprocessing directives. However, I can't seem to find any means to access information about those directives themselves.

What I'd like to query is:

* from include directives:
    - the kind of directive (e.g., "include", "include_next")
    - the file name being included, in its symbolic form (the path as written)
    - the actual filename (full path) being resolved.

* from macro definitions:
   - the type of macro (object-like, function-like)
   - the parameter list
   - the macro definition body

* from macro instantiation:
   - a reference to the definition

Is any of this already available through libclang? If not, is it planned?

If neither, is there another API that's more suitable for finding this information?

Thanks,
         Stefan

Ping?

     Stefan

Hi Stefan,

Sorry for the delay in responding. Comments inline.

> libclang defines cursor kinds for various preprocessing directives.
> However, I can't seem to find any means to access information about
> those directives themselves.
>
> What I'd like to query is:
>
> * from include directives:
>    - the kind of directive (e.g., "include", "include_next")

This information isn't currently exposed. We have a couple of options. We could expose another CXCursorKind, or have an API to query more information from a CXCursor with kind CXCursor_InclusionDirective. I'd prefer the former, but I think it's worth asking what kind of information you'd want to discern here between these two directives.

>    - the file name being included, in its symbolic form (the path as written)
>    - the actual filename (full path) being resolved.

clang_getIncludedFile() will return the latter. There is no API to return the former right now, but it could be added. Internally it would likely require relexing, as I don't believe that Clang retains that information in the AST or Preprocessor.
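Until such an API exists, the spelling could be recovered from the directive's own text. The following is only a rough sketch, assuming the text of the directive line has already been read out of the source file; parse_include_directive, struct include_info and the two enums are hypothetical names for illustration, not part of libclang. It reports the directive kind, the "" versus <> form, and the path as written:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper types; none of this is libclang API. */
enum include_kind { INCLUDE_PLAIN, INCLUDE_NEXT };
enum include_form { FORM_QUOTED, FORM_ANGLED };

struct include_info {
    enum include_kind kind;  /* #include or #include_next */
    enum include_form form;  /* "file" or <file> */
    char path[256];          /* the path exactly as written */
};

/* Skip spaces and tabs. */
static const char *skip_blank(const char *p)
{
    while (*p == ' ' || *p == '\t')
        ++p;
    return p;
}

/* Parse the text of one directive, e.g. "#include_next <stdio.h>".
 * Returns 1 on success and 0 if the text is not an include with a
 * literal path (e.g. "#include HEADER", which needs macro expansion). */
int parse_include_directive(const char *text, struct include_info *out)
{
    assert(text != NULL && out != NULL);

    const char *p = skip_blank(text);
    if (*p != '#')
        return 0;
    p = skip_blank(p + 1);

    /* Test the longer keyword first so include_next is not read as include. */
    if (strncmp(p, "include_next", 12) == 0) {
        out->kind = INCLUDE_NEXT;
        p += 12;
    } else if (strncmp(p, "include", 7) == 0) {
        out->kind = INCLUDE_PLAIN;
        p += 7;
    } else {
        return 0;
    }
    p = skip_blank(p);

    char close;
    if (*p == '"') {
        out->form = FORM_QUOTED;
        close = '"';
    } else if (*p == '<') {
        out->form = FORM_ANGLED;
        close = '>';
    } else {
        return 0;
    }
    ++p;

    /* Copy everything up to the closing delimiter. */
    const char *end = strchr(p, close);
    if (end == NULL)
        return 0;
    size_t len = (size_t)(end - p);
    if (len >= sizeof out->path)
        return 0;
    memcpy(out->path, p, len);
    out->path[len] = '\0';
    return 1;
}
```

A tool could pair this with clang_getIncludedFile() to show both the written and the resolved name. The sketch does not handle comments or line continuations inside the directive.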

> * from macro definitions:
>   - the type of macro (object-like, function-like)

This is currently not exposed, but would likely be easy to add.

>   - the parameter list

I don't believe there is an API for this (yet). What specifically are you looking for? The parameter list is raw text without semantic meaning until it is instantiated.

>   - the macro definition body

This is also raw text with no semantic meaning.

clang_getCursorExtent() should return the full extent (range) for a CXCursor. If you query this on a macro definition cursor (CXCursor_MacroDefinition), does it not provide you the full range? If not, that's likely a bug.
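Once the text covered by that extent is in hand, the object-like versus function-like distinction and the raw parameter list can be read off directly, because the C preprocessor treats a macro as function-like exactly when a '(' immediately follows its name. A rough sketch, assuming the definition text starting at the macro name has already been copied out of the source buffer; parse_macro_definition and struct macro_info are hypothetical names, not libclang API:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of libclang. */
struct macro_info {
    int function_like;   /* 1 if '(' directly follows the macro name */
    char params[128];    /* raw text between the parentheses, "" if none */
    const char *body;    /* points into the input, at the replacement text */
};

/* Parse definition text that starts at the macro name,
 * e.g. "MAX(a, b) ((a) > (b) ? (a) : (b))".
 * Returns 1 on success, 0 if the text is malformed or too long. */
int parse_macro_definition(const char *text, struct macro_info *out)
{
    assert(text != NULL && out != NULL);

    const char *p = text;
    /* The name is an identifier: a letter or '_', then letters, digits, '_'. */
    if (!(isalpha((unsigned char)*p) || *p == '_'))
        return 0;
    while (isalnum((unsigned char)*p) || *p == '_')
        ++p;

    out->function_like = 0;
    out->params[0] = '\0';

    /* A '(' with no whitespace before it opens a parameter list. */
    if (*p == '(') {
        const char *close = strchr(p, ')');
        if (close == NULL)
            return 0;
        size_t len = (size_t)(close - p - 1);
        if (len >= sizeof out->params)
            return 0;
        memcpy(out->params, p + 1, len);
        out->params[len] = '\0';
        out->function_like = 1;
        p = close + 1;
    }

    /* Whatever follows the blanks is the replacement text. */
    while (*p == ' ' || *p == '\t')
        ++p;
    out->body = p;
    return 1;
}
```

Note that "EMPTY (1)" is object-like with body "(1)", since a space separates the name from the parenthesis. The sketch ignores comments and line continuations inside the parameter list.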

> * from macro instantiation:
>   - a reference to the definition

clang_getCursorReferenced() will map from a macro instantiation cursor to the macro definition.

> Is any of this already available through libclang? If not, is it planned?
>
> If neither, is there another API that's more suitable for finding this
> information?

I think most of this is already exposed, and the rest of it would be straightforward to add. The API has grown incrementally as needs arose. libclang is definitely the right API, and the one we want to generalize for these kinds of queries.

For the APIs that don't exist, it's worth filing LLVM bug reports as feature requests so we can track their resolution. If you feel comfortable diving into the implementation of libclang (e.g., CIndex.cpp), it is probably straightforward to implement most of these.

Hi Ted,

>> What I'd like to query is:
>>
>> * from include directives:
>>     - the kind of directive (e.g., "include", "include_next")
>
> This information isn't currently exposed. We have a couple of options. We could expose another CXCursorKind, or have an API to query more information from a CXCursor with kind CXCursor_InclusionDirective. I'd prefer the former, but I think it's worth asking what kind of information you'd want to discern here between these two directives.

As I'm working on a documentation (and introspection) tool, I'd like to know anything that users may be interested in knowing. :-)
In particular, the semantics of the inclusion are important, i.e. anything that affects how the symbolic name maps to actual filenames.
(This includes the distinction between include and include_next, just as it includes the distinction between "" and <> inclusion.)

I'd actually think query functions are a better API choice, but I can certainly adjust if you choose to publish the information through a new cursor kind.
(However, I have to admit that I already find the single cursor visitation approach quite limiting, and would like to suggest special-purpose visitors such as "visit function arguments", "visit template parameters", "visit nested declarations", etc., and in this context just adding more cursor kinds seems like the wrong direction to take.)

>>     - the file name being included, in its symbolic form (the path as written)
>>     - the actual filename (full path) being resolved.
>
> clang_getIncludedFile() will return the latter. There is no API to return the former right now, but it could be added. Internally it would likely require relexing, as I don't believe that Clang retains that information in the AST or Preprocessor.

OK. This brings up a related question: what is the suggested API for preprocessing programmatically? I.e., is there a function akin to clang_parseTranslationUnit, but only for preprocessing, not parsing? If such a function existed, perhaps it would be a better place to plug in callbacks that report information about preprocessing directives?

>> * from macro definitions:
>>    - the type of macro (object-like, function-like)
>
> This is currently not exposed, but would likely be easy to add.
>
>>    - the parameter list
>
> I don't believe there is an API for this (yet). What specifically are you looking for? The parameter list is raw text without semantic meaning until it is instantiated.

With Synopsis, I want to treat macro definitions much like C/C++ functions: I want to document them as normal members of an API.
The macro definition isn't quite so interesting in this context, but the parameter list definitely is (since inline documentation could be referring to it).

>>    - the macro definition body
>
> This is also raw text with no semantic meaning.
>
> clang_getCursorExtent() should return the full extent (range) for a CXCursor. If you query this on a macro definition cursor (CXCursor_MacroDefinition), does it not provide you the full range? If not, that's likely a bug.

OK, fair enough. The definition body is probably the least interesting of all of these.
(There is a related bug that I filed, along with a patch: http://llvm.org/bugs/show_bug.cgi?id=9069
Since you asked... :-) )

>> * from macro instantiation:
>>    - a reference to the definition
>
> clang_getCursorReferenced() will map from a macro instantiation cursor to the macro definition.

Great.

>> Is any of this already available through libclang? If not, is it planned?
>>
>> If neither, is there another API that's more suitable for finding this
>> information?
>
> I think most of this is already exposed, and the rest of it would be straightforward to add. The API has grown incrementally as needs arose. libclang is definitely the right API, and the one we want to generalize for these kinds of queries.
>
> For the APIs that don't exist, it's worth filing LLVM bug reports as feature requests so we can track their resolution. If you feel comfortable diving into the implementation of libclang (e.g., CIndex.cpp), it is probably straightforward to implement most of these.

OK. I will submit feature requests, and may think about contributions.

Thanks,
         Stefan