API for auto type deduction in libclang

Hey guys,

I'm also part of the group trying to integrate Clang into KDevelop. We're
using the C interface to Clang.

I'm currently struggling to figure out how to get the automatically deduced
type of a C++11 auto type. What we basically do atm is the following:

Assume we have some code such as 'auto i = 42;'

The situation is as follows:
- We have a CXCursor pointing to the declaration.
- clang_getCursorType(cursor) at this point just returns a type of kind
  CXType_Unexposed, which indicates some internal type.
- clang_getTypeSpelling() on that type then just gives us 'auto', which is
  somewhat correct, but not what we really want.
- We'd like to know the deduced type.

So the question is:
Is there some way to get the deduced type for 'i' (obviously 'int' here)?

Through debugging, I found out that C++11 auto types are internally handled as
clang::AutoType. So, right now I'm missing an API in the C interface, such as
'CXType clang_getAutoDeducedType(CXType)' that internally calls something like
clang::AutoType::getDeducedType() (from AST/Type.h).

I actually tried to patch Clang (excerpt attached) to get a proof-of-concept,
but apparently AutoType::getDeducedType() always returns a null QualType for
me. Hence the resulting type we get is garbage. Is AutoType::getDeducedType()
even the correct method to use here? Is this approach completely wrong?

Any hints are welcome

Thanks

autotypededuction.patch (661 Bytes)

This seems to work without modifications to libclang:

CXType type = clang_getCursorType(cursor);
CXType deducedType = clang_getCanonicalType(type);
assert(deducedType.kind == CXType_Int);

That doesn't work for me.

For testing we have a small wrapper binary that basically creates a
CXTranslationUnit and then traverses through the AST via
clang_visitChildren().

During the call to 'visit' it does the following (amongst other things):
- auto location = clang_getCursorLocation(cursor)
- auto type = clang_getCursorType(cursor)
- auto typeString = clang_getTypeSpelling(type)
- now: auto canonicalTypeString =
    clang_getTypeSpelling(clang_getCanonicalType(type))

It outputs all the values of those variables to stdout, and when I pass "auto i
= 5;" to it I get:

"""
decl: "auto (canonical type: auto) i " of kind VarDecl (9) in stdin.cpp@1:6
  "int (canonical type: int) " of kind IntegerLiteral (106) in stdin.cpp@1:10
"""

So the 'canonical type' of VarDecl still resolves to 'auto', instead of 'int'.
Sorry, if we're doing something completely wrong, but I don't seem to get this
working as you suggest.

Greets

Ok, I see, you're using clang_getTypeSpelling. I created my tool before clang_getTypeSpelling was available. For aggregates, typedefs and a couple of other types I'm using the following code[1]:

auto cursor = clang_getTypeDeclaration(type);
auto str = clang_getCursorSpelling(cursor);

For basic types like "int", "char" and so on, the above will return an empty string. For those types I use a big switch statement on the type kind and just return a string representation [2].

[1] https://github.com/jacob-carlborg/dstep/blob/master/dstep/translator/Type.d#L43

[2] https://github.com/jacob-carlborg/dstep/blob/master/dstep/translator/Type.d#L248

Unfortunately, your source code doesn't help me, nor can I get it to work
using any combination of getTypeDeclaration, getCursorSpelling, and the ones I
referred to earlier...

I wonder if there's some bug in Clang's type printers I'm experiencing here.

Let's do some testing:
Test file: test.cpp, containing 'auto i = 5;'

$ clang -cc1 -std=c++11 -ast-dump test.cpp
(...)
`-VarDecl 0x1c2e5b0 <test.cpp:1:1, col:10> i 'int':'int'
  `-IntegerLiteral 0x1c2e608 <col:10> 'int' 5

=> What I'd expect! ('auto' => 'int')

Next try: My clang-standalone-parser (basically emulating 'clang -cc1 -dump')
but using libclang only [1]:

$ ./clang-standalone-parser -std=c++11 test.cpp
test.cpp:1:6 (5, 0-10) kind: VarDecl type: auto display name: i (...)
  test.cpp:1:10 (9, 9-10) kind: IntegerLiteral type: int

=> Not what I'd expect, 'auto' is not deduced

And the odd part here is: Breaking on the symbol
'clang::AutoType::getDeducedType()' for both clang and clang-standalone-parser
shows that *both* versions actually call that function when trying to find a
string representation for the type 'auto'. But with clang-standalone-parser,
'clang::AutoType::getDeducedType()' always returns a null QualType.

Can someone make any sense out of this? Bug in Clang/LLVM? Note that the
backtrace towards the call of getDeducedType is slightly different for the two
versions (see attached file). Maybe this is the reason?

Any help greatly appreciated!

Greets

[1] http://quickgit.kde.org/?p=kdev-clang.git&a=blob&h=dbad9e8942cc5e20fd005677fc32a5a0b54977ae&hb=1fa9eca7a9a58404c3ef687fcfa63800d459cff5&f=tests%2Fclang-standalone-parser.c

Compiling is simple:
$ clang clang-standalone-parser.c -I/path/to/llvm/include -L/path/to/llvm/lib
-lclang -Wl,-rpath,'/path/to/llvm/lib' -o clang-standalone-parser

backtraces.txt (5.99 KB)

I would expect that clang’s dumper is using VarDecl->getType() and your standalone tool is using VarDecl->getTypeSourceInfo(). The former produces the type of the variable (which is ‘int’); the latter produces the type-as-written (which is ‘auto’).

FWIW, we’ve been considering changing this for variables, but even if we did, the problem would persist for functions with ‘auto’ return types.

If you have a cursor pointing to the "x" in "auto x = 10;", then call clang_getCursorType on the cursor:

CXCursor cursor;
// ..., cursor now points to "x"
auto type = clang_getCursorType(cursor);
assert(type.kind == CXType_Int);

Doesn't that work? Then use a switch like this:

string typeToString (CXType type)
{
    switch (type.kind)
    {
        case CXType_Int: return "int";
        case CXType_Double: return "double";
        //...
        default: return ""; // kind not handled here
    }
}

It's not pretty but it should work as a workaround.

Sorry, I can't follow. Our tool uses the clang-c API - can you translate the
methods you describe above to that? Or, put differently, how can we use
"VarDecl->getType" via the clang-c API?

Thanks

No, it does not work. Have you tried the example that Kevin linked to?

wget -O clang-standalone-parser.c "http://quickgit.kde.org/?p=kdev-clang.git&a=blob&h=dbad9e8942cc5e20fd005677fc32a5a0b54977ae&f=tests%2Fclang-standalone-parser.c&o=plain"
clang clang-standalone-parser.c -I/usr/include/clang-c -l clang -o clang-standalone-parser

echo "auto i = 1;" > test.cpp
./clang-standalone-parser --std=c++11 test.cpp

Gives me:

test.cpp:1:6 (5, 0-10) kind: VarDecl type: auto display name: i usr: c:@i definition
  test.cpp:1:10 (9, 9-10) kind: IntegerLiteral type: int

If I add something like this to the standalone example:

printString("type.kind", clang_getTypeKindSpelling(type.kind));

I see: "type.kind: Unexposed" - your assertion fails.

Thanks

Hey,

But how do you explain that *both* versions actually call
'clang::AutoType::getDeducedType' during pretty-printing this type, but only
one version actually gets the actual type and the other doesn't? I fail to see
the link why one behaves differently here.

It would be nice if someone with a deeper understanding of libclang internals
could have a look at our small tool at [1] and check if we're doing something
wrong. Or if that just *can't* work with the current API at hand.

Sorry if I'm missing something obvious.

[1] http://quickgit.kde.org/?p=kdev-clang.git&a=blob&h=dbad9e8942cc5e20fd005677fc32a5a0b54977ae&hb=1fa9eca7a9a58404c3ef687fcfa63800d459cff5&f=tests%2Fclang-standalone-parser.c

I get this using libclang 3.3 on OS X:

test.cpp:1:6 (5, 0-10) kind: VarDecl type: int display name: i usr: c:@i definition
   test.cpp:1:10 (9, 9-10) kind: IntegerLiteral type: int

Of course I forgot to use "clang_getCanonicalType". Here's the same example, updated to use "clang_getCanonicalType" and the switch statement on "type.kind" that I tried to explain:

http://pastebin.com/xMPn7kS6

It works correctly for me using libclang 3.3 on OS X: it prints "int" as the type, not "auto".

I would consider my example only a workaround since it seems to work, at least on libclang 3.3.

I just tested this against multiple versions on my Linux (64bit) system:

For llvm-3.3, llvm-3.4 (Ubuntu packages) and llvm-trunk (self-compiled) I
get an endless recursion in getTypeSpelling, because it always returns
CXType_Unexposed. => segmentation fault (core dumped)

Surprisingly, for llvm-3.2 (Ubuntu package) your code works, and I can
reproduce the output you get! Interesting.

So now I'm really confused: You say your example works for you on llvm-3.3,
but I cannot confirm that. What's going on here? :confused:
Should I file a bug report about that behavioral change?

Greets

You're right. I just noticed that I compiled using libclang 3.3 but it was some other version that was used during runtime. I suspect it's a release from Apple. The output of otool is:

Compiled with libclang 3.3:

$ otool -L clang-standalone-parser
clang-standalone-parser:
  @rpath/libclang.dylib (compatibility version 1.0.0, current version 0.0.0)

Compiled with Apple libclang:

$ otool -L clang-standalone-parser
clang-standalone-parser:
  @rpath/libclang.dylib (compatibility version 1.0.0, current version 500.2.79)

This is the version of Clang I'm using:

$ clang --version
Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)

So it looks like it was Apple version 5, based on LLVM 3.3 svn, that I used to compile with.

It was not LLVM 3.3, see above, sorry for the trouble.

Since you say that it worked with 3.2 and it's working for me using 3.3 svn (Apple release) one could conclude that either:

A. Something broke between 3.2 and 3.3 but after Apple made their release

Or

B. Apple has fixed the issue in their release

> Should I file a bug report about that behavioral change?

I would guess so. Then we will hopefully get some answer if the change was intentional or a regression.

Hey,

I've filed a bug: http://llvm.org/bugs/show_bug.cgi?id=18669
Let's see what the developers think about it.

In any case, thanks *a lot* for your help and patience, Jacob.
Your commitment is greatly appreciated!

Greets

No problem, just glad we figured out why we had different results.

I read the bug report. I don't see how the commit that added "clang_getTypeSpelling" can have affected this. As far as I can see it only adds "clang_getTypeSpelling", it doesn't change "clang_getCanonicalType" or any of the underlying C++ code. Most of the changes are related to tests.

If you use Git you could do a bisect between the 3.2 and the 3.3 release and see if you can find the commit that causes "clang_getCanonicalType" to fail.