[Clang] ExtractAPI while building

Description of the project: Swift-DocC is the canonical documentation compiler for the Swift OSS project. However, Swift-DocC is not Swift-specific: it uses SymbolKit's language-agnostic, JSON-based symbol graph format to understand which symbols are available in the code. This way, Swift-DocC can support any language as long as there is a symbol graph generator for it.

Clang supports symbol graph generation for C and Objective-C as described in [RFC] clang support for API information generation in JSON.

Currently, users can use clang to generate symbol graph files through the clang -extract-api command line interface, or generate symbol graphs for a specific symbol through the libclang interface. This project would entail adding a third mode that generates the symbol graph output as a side effect of a regular compilation job. This could enable using the symbol graph format as a lightweight alternative to clang Index or clangd for code intelligence services.

Expected result: Enable generating symbol graph files during a regular compilation (or module build); provide a tool to merge symbol graph files in the same way a static linker links individual object files; extend clang Index to support all the information contained in symbol graph files.
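To make the "merge like a static linker" idea concrete, here is a minimal sketch of what such a merge could look like. It is not the tool the project proposes. It assumes each input is an already-parsed symbol graph with the top-level `metadata`, `module`, `symbols` and `relationships` keys, that symbols are identified by `identifier.precise`, and that all inputs belong to the same module. The function name `merge_symbol_graphs` is hypothetical.

```python
from __future__ import annotations

from typing import Any


def merge_symbol_graphs(graphs: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge per-translation-unit symbol graphs into one (illustrative only)."""
    if not graphs:
        raise ValueError("need at least one symbol graph")

    # Assumption: every input describes the same module, so the first
    # graph's metadata and module records are reused unchanged.
    merged: dict[str, Any] = {
        "metadata": graphs[0].get("metadata", {}),
        "module": graphs[0].get("module", {}),
        "symbols": [],
        "relationships": [],
    }

    seen_symbols: set[str] = set()
    seen_relationships: set[tuple[str, str, str]] = set()

    for graph in graphs:
        # A symbol declared in a shared header shows up in several inputs;
        # keep the first occurrence, keyed by its precise identifier.
        for symbol in graph.get("symbols", []):
            precise = symbol["identifier"]["precise"]
            if precise not in seen_symbols:
                seen_symbols.add(precise)
                merged["symbols"].append(symbol)

        # Relationships are deduplicated by (kind, source, target).
        for rel in graph.get("relationships", []):
            key = (rel["kind"], rel["source"], rel["target"])
            if key not in seen_relationships:
                seen_relationships.add(key)
                merged["relationships"].append(rel)

    return merged
```

A real tool would also have to reconcile conflicting records for the same symbol (for example a declaration versus a definition), which this sketch sidesteps by keeping whichever copy it sees first.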

Desirable skills: Intermediate C++ programming skills; familiarity with clang and Objective-C is an asset but not required.

Project size: Medium

Difficulty: Medium/Hard

Confirmed Mentors: Daniel Grumberg, Zixu Wang, Juergen Ributzka


I am Ankur Saini, a computer science engineering student from India, and I am interested in working on this as my GSoC 2023 project.
I am currently in the process of building clang from source and reading the RFC you linked in the thread.
I would like to know if there are some minor bugs or patches that I can work on before submitting an actual proposal to Google, to get a better understanding of both the source code and the contribution process here at LLVM.
Also, I see that apart from Discourse, LLVM also has a Discord server and a mailing list, so I want to ask which of those would be the best place to ask further questions, especially regarding this project?


Hi Ankur,

I am glad you are interested in our project for your GSoC. Answers to your questions below:

I think the best way to get started would be to do a small non-functional change (we call these NFC changes), maybe something like fixing some typos, to get accustomed to the contribution process. Once you have done that, we can find a more complex bit of work for you to do in order to learn the codebase. If you are still interested, reach out to me via DM and we will find something suitable.

The mailing lists are archive-only, and all public discussion happens on Discourse. I would say the best place to ask things specific to this project would be this thread, or you can contact me, Zixu, or Jürgen directly. For more general questions about LLVM, the rest of Discourse is probably the best place.



I was curious if this project was only reserved for GSoC, as it was not picked up this cycle. I did GSoC this summer, but not with LLVM, and liked having a defined project and mentors. I’m hoping to get some more open source experience and was curious about working on this project. Another project I was interested in ended up being completed by the mentors over the summer, so I thought I’d ask before diving deeper into the details.


This project was selected for GSoC 2023: Google Summer of Code.

Ah, my mistake. I misread the title. Please ignore the prior post.

Hey Daniel, I am interested in this project and would like to contribute to it, but honestly speaking I have no background in it and have never used LLVM before. Still, I know C++ to a decent level, so it would be great if you could guide me and help me get started. I wanted to DM you directly, but I was not able to do so for reasons I don’t know.