C++ class methods info in clangd-indexer database

Hello everyone.
A while ago I asked this question in the bug tracker: “Any info on C++ class methods seems to be absent in a clangd index, while C functions are indexed properly” (Issue #1234, clangd/clangd on GitHub), but I didn’t get any reply. Maybe I’ll have better luck here?

I described the issue in as much detail as I could on the bug tracker/discussion board, but for those who don’t want to go there, here is the short version: I tried to index my C++ codebase with clangd-indexer from LLVM 14. It worked, but the database contained decent information only for C functions. C++ methods, on the contrary, only had a name, while arguments, return types, template arguments, etc. were all omitted. Did I do something wrong, or is it a known restriction? Can I get more information on why it hasn’t been implemented? I’m interested in having that info in the database. Where can I learn more about the issue, so that maybe I could make some PRs with improvements?

Hi, and apologies for the lack of response on the bug tracker.

The index tries to store enough data that clangd can respond to requests quickly and usefully, but not so much that its storage/RAM requirements become excessive. Clangd’s index doesn’t make any attempt to be a general-purpose database for other applications, though the code might be interesting if you were trying to build such a thing.

Arguments etc. are mostly used for serving code completion requests. For class methods, code completion is only possible after Class:: or obj., and in either case clang can resolve the type and find the available methods without help from the index, so there’s no need to store that information.
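To illustrate the distinction, here is a minimal sketch. The names (area, Widget, demo) are made up for the example and have nothing to do with clangd's own code. The comments describe the behaviour as I understand it: a free function can be completed from anywhere, so its signature has to come from the index, whereas a method is only offered once the class type is already known from the AST.

```cpp
#include <string>

// Free function: completion may offer it in any file, possibly before its
// header is included, so the signature has to come from a project-wide index.
int area(int width, int height) { return width * height; }

struct Widget {
    int width = 0;
    int height = 0;

    // Member functions: only offered after `Widget::` or `w.`. At that point
    // the type is already resolved, so the parameters and return types can be
    // read from the class definition in the AST rather than from the index.
    void resize(int w, int h) { width = w; height = h; }
    int size() const { return area(width, height); }
    static std::string kind() { return "widget"; }
};

int demo() {
    Widget w;
    w.resize(2, 3);  // completing after `w.` lists resize() and size()
    // Completing after `Widget::` lists kind() as well.
    return w.size() + static_cast<int>(Widget::kind().size());
}
```

This is also why an external tool that reads only the index file would see method names but not their signatures.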

Hi Sam!
Thanks for the quick reply! Yeah, I get it, every tool is built to do some particular job.

Ok, can I ask for some advice then?
I would like to have a generic code indexer that could create a database of all entities defined in a codebase, along with backlinks to their points of definition in the source code: functions & methods with all their arguments and call sites, structs & classes, all instantiated templates, maybe variables, etc. clangd-indexer already seems to be doing a very notable part of this job, but unfortunately for me, not all of it. I’m completely unfamiliar with clangd-indexer internals and am wondering if it is a good idea to take clangd-indexer as the foundation of such a solution. Is it extendable enough? Are its internals sufficiently well documented? Do you think it makes sense to extend it instead of writing my own specialized indexer?

Thanks!

I can’t speak for Sam but I can offer my 2c on your questions:

I would like to have a generic code indexer that could create a database of all entities defined in a codebase, along with backlinks to their points of definition in the source code: functions & methods with all their arguments and call sites, structs & classes, all instantiated templates, maybe variables, etc. clangd-indexer already seems to be doing a very notable part of this job, but unfortunately for me, not all of it. I’m completely unfamiliar with clangd-indexer internals and am wondering if it is a good idea to take clangd-indexer as the foundation of such a solution.

I think that depends on how you intend to use the database.

Clangd’s index is fairly specialized for answering the types of queries that come up during clangd’s usage, such as “find all symbols with this name in the project” or “find all references to this symbol in the project”.

One can imagine interesting semantic queries one might want to pose to such a database, such as “find all uses of this type in this project” or “given this template, find all its instantiations in the project”, which clangd’s index is not designed to answer, and likely could not be made to answer without significant changes (like introducing some sort of serialized representation of types and other semantic entities besides symbols).
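To make the distinction concrete, here is a toy sketch of an index that answers only the two symbol-based queries mentioned above. Everything in it (ToyIndex, ToySymbol, Location, buildExample, the integer IDs, the sample file names) is hypothetical and is not how clangd's index is actually implemented; it only shows the shape of the data such queries need.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical toy types; these are NOT clangd's real data structures.
struct Location {
    std::string file;
    int line;
};

struct ToySymbol {
    int id;               // stand-in for a stable symbol ID
    std::string name;
    Location definition;  // backlink to the point of definition
};

class ToyIndex {
public:
    void addSymbol(const ToySymbol& s) { byName_[s.name].push_back(s); }
    void addRef(int id, const Location& loc) { refs_[id].push_back(loc); }

    // "Find all symbols with this name in the project."
    std::vector<ToySymbol> findByName(const std::string& name) const {
        auto it = byName_.find(name);
        return it == byName_.end() ? std::vector<ToySymbol>{} : it->second;
    }

    // "Find all references to this symbol in the project."
    std::vector<Location> findRefs(int id) const {
        auto it = refs_.find(id);
        return it == refs_.end() ? std::vector<Location>{} : it->second;
    }

private:
    std::map<std::string, std::vector<ToySymbol>> byName_;
    std::map<int, std::vector<Location>> refs_;
};

// Small sample data set with two unrelated symbols that share a name.
ToyIndex buildExample() {
    ToyIndex index;
    index.addSymbol({1, "resize", {"widget.h", 10}});
    index.addSymbol({2, "resize", {"image.h", 42}});
    index.addRef(1, {"main.cpp", 7});
    index.addRef(1, {"ui.cpp", 19});
    return index;
}
```

Note that nothing here records types or template arguments, so a query like “find all uses of this type” has nothing to key on. Supporting it would mean adding a serialized representation of those entities, which is the kind of significant change I mean.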

So, I think it really depends on which missing pieces you’d like the index to support. If you elaborate on this, I can try to offer further thoughts.

Is it extendable enough?

Depends on how you’re looking to extend it. If you’re hoping to use the upstream indexer code unmodified, and extend it in your own codebase by inheritance or composition without modifying the upstream sources, it’s probably not suitable for that.

If instead you plan to have a copy or fork of clangd’s code and make modifications to it, that could work (e.g. depending on the new functionality you want, you may be able to implement it as some incremental modifications to the existing code without any significant rewrite).

Are its internals sufficiently well documented?

There is some documentation on this page, and comments in the code headers such as this. You can also ask further questions about it here or in the #clangd Discord channel.

If you end up working with the indexer and identifying gaps in the documentation that would aid future efforts along these lines, documentation patches are always welcome :)


Ok, I got it, thank you!
Currently I have only a pretty vague idea of the tool I’m thinking about and its requirements, so I can’t even elaborate much right now… I may get back once I clarify that…
But thanks a lot, that helped!