Getting the class name for a category in libclang

Hi,

I'm trying to build an index using libclang, but there doesn't appear to be any way of getting the class name for a cursor representing an Objective-C category, other than to generate and then parse the USR. Am I missing something, or is this missing functionality?

If it's missing, does anyone have any thoughts about the correct way to expose this? My initial expectation was that the semantic parent for a category should be the class, but there are a number of cases where this quickly becomes nonintuitive. Should there be a dedicated function for getting the class name from a category declaration?

David

-- Sent from my Difference Engine

Hi David,

Is clang_getCursorSpelling() not sufficient for this case?

FYI, the USRs should be viewed as opaque strings. You should never parse them. They are designed to be changed at any time for purposes of efficiency, etc. They are really designed to just be unique keys.

If it's missing, does anyone have any thoughts about the correct way to expose this? My initial expectation was that the semantic parent for a category should be the class, but there are a number of cases where this quickly becomes nonintuitive. Should there be a dedicated function for getting the class name from a category declaration?

I don't believe we need a special API for this. I believe that all of the information is there. If you find that it is cumbersome, however, we can evaluate whether it makes sense to add additional APIs.

Is clang_getCursorSpelling() not sufficient for this case?

Nope, this returns the name of the category. For example, in the case of:

@implementation Foo (Bar)

This returns Bar. I see no API for accessing Foo.

FYI, the USRs should be viewed as opaque strings. You should never parse them. They are designed to be changed at any time for purposes of efficiency, etc. They are really designed to just be unique keys.

Fair enough.

If it's missing, does anyone have any thoughts about the correct way to expose this? My initial expectation was that the semantic parent for a category should be the class, but there are a number of cases where this quickly becomes nonintuitive. Should there be a dedicated function for getting the class name from a category declaration?

I don't believe we need a special API for this. I believe that all of the information is there. If you find that it is cumbersome, however, we can evaluate whether it makes sense to add additional APIs.

Presumably something similar is required for navigating class hierarchies, in both C++ and Objective-C; for example, I don't see an API for getting the superclass of either. I would expect something similar to the semantic parent API.

David

-- Sent from my PDP-11

Ah, I see.

When visiting the children of that cursor, I believe you should see a class reference to the @interface for 'Foo'. You should be able to query the class name that way. Does this work? It's a little indirect, and possibly worthy of a wrapper API.

This sometimes works. If there is an @implementation decl for the class visible, then there is an ObjCClassRef child. If there is not, however, then the only children are the methods. In the underlying APIs, it would simply be a call to ->getClassInterface() to get the interface.

There also doesn't seem to be a way of walking the class hierarchy; I was expecting to be able to find the superclass for a class interface decl, but the only way I can see for doing this is to find the token and then parse the source code myself, which rather defeats the point of using libclang...

I'm happy to spend some time implementing these, but we should probably agree on what the interfaces should look like first. I would propose something like this:

// Returns the cursor for the class @interface when passed a category @interface or @implementation cursor, invalid cursor otherwise.

CXCursor clang_getObjCCategoryClassInterface(CXCursor);

// Returns the number of superclasses of the given class (always returns 0 or 1 for ObjC classes, may return more for C++). Returns 0 if the cursor is not a valid ObjC or C++ class interface or implementation decl

unsigned clang_getNumberOfSuperclasses(CXCursor);

// Returns a cursor for the interface to the nth superclass of the cursor indicated by the first argument.

CXCursor clang_getSuperclass(CXCursor, unsigned);

David

-- Sent from my Difference Engine

I'm not seeing this. This is what I see using c-index-test:

$ cat t.m
@interface Foo @end
@protocol Bar @end
@interface Foo (Bar) @end

$ c-index-test -test-load-source all t.m | grep -v invalid
// CHECK: t.m:1:12: ObjCInterfaceDecl=Foo:1:12 Extent=[1:1 - 1:20]
// CHECK: t.m:2:1: ObjCProtocolDecl=Bar:2:1 (Definition) Extent=[2:1 - 2:19]
// CHECK: t.m:3:12: ObjCCategoryDecl=Bar:3:12 Extent=[3:1 - 3:26]
// CHECK: t.m:3:12: ObjCClassRef=Foo:1:12 Extent=[3:12 - 3:15]

The last line is the reference from the category to the @interface.

That is not the same as what I said. I have:

@interface Foo @end
@implementation Foo (Bar)
- (void)bar {}
@end

Here, I am only able to get from the category implementation to the class interface if I also have a class implementation in the same compilation unit. If I add this, then it works:

@implementation Foo @end

David

There also doesn't seem to be a way of walking the class hierarchy; I was expecting to be able to find the superclass for a class interface decl, but the only way I can see for doing this is to find the token and then parse the source code myself, which rather defeats the point of using libclang...

It can be done using the current APIs. Tools have been built on this API that care very much about the Objective-C type hierarchy.

I'm happy to spend some time implementing these, but we should probably agree on what the interfaces should look like first. I would propose something like this:

// Returns the cursor for the class @interface when passed a category @interface or @implementation cursor, invalid cursor otherwise.

CXCursor clang_getObjCCategoryClassInterface(CXCursor);

This seems reasonable, although I'm not certain it is strictly necessary. That said, I don't have strong feelings either way. It certainly is clear on what it does.

// Returns the number of superclasses of the given class (always returns 0 or 1 for ObjC classes, may return more for C++). Returns 0 if the cursor is not a valid ObjC or C++ class interface or implementation decl

unsigned clang_getNumberOfSuperclasses(CXCursor);

// Returns a cursor for the interface to the nth superclass of the cursor indicated by the first argument.

CXCursor clang_getSuperclass(CXCursor, unsigned);

Since these APIs are specific to Objective-C cursors, I'd prefer to include 'ObjC' in the class names. How about:

  unsigned clang_getNumObjCSuperClasses(CXCursor C);
  CXCursor clang_getObjCSuperClass(CXCursor C, unsigned idx);

That matches the current coding style.

I think this is a succinct API; the only thing I'm mixed on is that this really isn't exposing more raw functionality in the library. We've been trying to keep the API surface parsimonious.

There also doesn't seem to be a way of walking the class hierarchy; I was expecting to be able to find the superclass for a class interface decl, but the only way I can see for doing this is to find the token and then parse the source code myself, which rather defeats the point of using libclang...

It can be done using the current APIs. Tools have been built on this API that care very much about the Objective-C type hierarchy.

Hints welcome - I'd certainly be very happy to use the existing APIs if you felt like documenting how to walk the hierarchy without adding new APIs...

I'm happy to spend some time implementing these, but we should probably agree on what the interfaces should look like first. I would propose something like this:

// Returns the cursor for the class @interface when passed a category @interface or @implementation cursor, invalid cursor otherwise.

CXCursor clang_getObjCCategoryClassInterface(CXCursor);

This seems reasonable, although I'm not certain it is strictly necessary. That said, I don't have strong feelings either way. It certainly is clear on what it does.

Again, if there's a way that actually works of doing this without a new API, I'd be happy to use it.

// Returns the number of superclasses of the given class (always returns 0 or 1 for ObjC classes, may return more for C++). Returns 0 if the cursor is not a valid ObjC or C++ class interface or implementation decl

unsigned clang_getNumberOfSuperclasses(CXCursor);

// Returns a cursor for the interface to the nth superclass of the cursor indicated by the first argument.

CXCursor clang_getSuperclass(CXCursor, unsigned);

Since these APIs are specific to Objective-C cursors, I'd prefer to include 'ObjC' in the class names. How about:

unsigned clang_getNumObjCSuperClasses(CXCursor C);
CXCursor clang_getObjCSuperClass(CXCursor C, unsigned idx);

That matches the current coding style.

I think this is a succinct API; the only thing I'm mixed on is that this really isn't exposing more raw functionality in the library. We've been trying to keep the API surface parsimonious.

For ObjC, there is never more than one superclass, so simply clang_getObjCSuperClass(CXCursor) would be sufficient. I was intending to support C++ classes via the same API, so users only needed one code path to build C++ and ObjC class hierarchies (the same cursor will never be both a valid C++ and ObjC class, so the hierarchies will be distinct).

David

-- Sent from my brain

I think this is a succinct API; the only thing I'm mixed on is that this really isn't exposing more raw functionality in the library. We've been trying to keep the API surface parsimonious.

For ObjC, there is never more than one superclass,

Ah, right!

so simply clang_getObjCSuperClass(CXCursor) would be sufficient. I was intending to support C++ classes via the same API, so users only needed one code path to build C++ and ObjC class hierarchies (the same cursor will never be both a valid C++ and ObjC class, so the hierarchies will be distinct).

With C++, we have the CXCursor_CXXBaseSpecifier cursor. It encapsulates both the reference to the super class as well as the access control and kind of inheritance.

We should probably not overly collapse the APIs for querying the two object type systems of Objective-C and C++. While both have the notion of superclasses, they are semantically very different.

Hmm. I'm seeing:

$ cat test.m
@interface Foo @end
@implementation Foo (Bar)
- (void)bar {}
@end

$ ~/llvm-cmake/bin/c-index-test -test-load-source all test.m | grep -v invalid
// CHECK: test.m:1:12: ObjCInterfaceDecl=Foo:1:12 Extent=[1:1 - 1:20]
// CHECK: test.m:2:1: ObjCCategoryImplDecl=Bar:2:1 (Definition) Extent=[2:1 - 4:2]
// CHECK: test.m:2:1: ObjCClassRef=Foo:1:12 Extent=[2:1 - 2:2]
// CHECK: test.m:3:1: ObjCInstanceMethodDecl=bar:3:1 (Definition) Extent=[3:1 - 3:15]
// CHECK: test.m:3:13: UnexposedStmt= Extent=[3:13 - 3:15]

The third line is the class reference, which we can get to from the ObjCCategoryImplDecl. Unless I'm completely confused, basically you are trying to get from '@implementation Foo (Bar)' to the cursor for '@interface Foo'?

Don't know if you have solved this or not but have you looked in the Objective-C rewriter (lib/Rewrite/RewriteObjC.cpp)?

Ted gave me some hints - I'll write up some proper documentation when I have time (hopefully next week).

The rewriter is not part of libclang; it is one of the lower-level (subject-to-change-without-notice) APIs, so it's not such a great idea for things that don't want to have to track clang API changes.

David

-- Sent from my IBM 1620

I thought you could look there and see how the Clang APIs are used.