Improve Clang-Doc Usability

Description of the project: Clang-Doc is a C/C++ documentation generation tool created as an alternative to Doxygen and built on top of LibTooling. The effort started in 2018 and a critical mass of changes landed in 2019, but development has been largely dormant since then, mostly due to a lack of resources.

The tool can currently generate documentation in Markdown and HTML, but it has structural issues, is difficult to use, produces documentation with usability problems, and is missing several key features:

  • Not all C/C++ constructs are currently handled by the Markdown and HTML emitters, which limits the tool’s usability.
  • The generated HTML output does not scale with the size of the codebase, making it unusable for larger C/C++ projects.
  • The implementation does not always use the most efficient or appropriate data structures, which leads to correctness and performance issues.
  • There is a lot of duplicated boilerplate code, which could be reduced with templates and helpers.
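
To illustrate the last point, here is a minimal sketch of how a template helper can replace per-type emitter loops. The types `FunctionEntry` and `EnumEntry` and the helper `renderSection` are hypothetical names invented for this example; they are not Clang-Doc's actual data structures or API.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for documented entities; not Clang-Doc's real types.
struct FunctionEntry { std::string Name; };
struct EnumEntry { std::string Name; };

// One generic helper instead of a hand-written loop per entity kind.
// Works for any type T that exposes a std::string member called Name.
template <typename T>
std::string renderSection(const std::string &Title,
                          const std::vector<T> &Items) {
  if (Items.empty())
    return ""; // Skip empty sections entirely.
  std::string Out = "## " + Title + "\n";
  for (const T &Item : Items)
    Out += "- " + Item.Name + "\n";
  return Out;
}
```

The same helper can then be called for functions, enums, or any other entity kind that carries a `Name`, so a formatting change only needs to be made in one place.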

Expected result: The goal of this project is to address the existing shortcomings and improve the usability of Clang-Doc to the point where it can be used to generate documentation for large-scale projects such as LLVM. The ideal outcome is that the LLVM project will use Clang-Doc to generate its reference documentation.

Successful proposals should focus not only on addressing the existing limitations, but also draw inspiration for other potential improvements from other similar tools such as hdoc, standardese, subdoc or cppdocgen.

Skills: Experience with web technologies (HTML, CSS, JS) and an intermediate knowledge of C++. Previous experience with Clang/LibTooling is a bonus but not required.

Project size: Either medium or large.

Difficulty: Medium

Confirmed Mentor: @petrhosek, @ilovepi


Hi, I’m interested in taking on this project. I’d say I am quite familiar with web tech and I’m at an intermediate level when it comes to C++. I’ve always had a passing interest in working on compilers, so I think this would be a great project to dip my toes into LLVM while still retaining some familiarity.

I’ve been playing around with clang-doc these past few days, and I’ve written up some preliminary research on my findings regarding this project. The draft is still a work in progress, but I’d appreciate it if one of the mentors reviewed it.

My preliminary research can be found here:


I’m glad to see the interest, your research looks great so far!

Regarding the supported C/C++ constructs, rather than trying to be as comprehensive as possible, I’d drive the development by practical use cases. For example, if our goal is to replace Doxygen for LLVM reference documentation, we only need to support C++17 for now and can probably ignore C++20 constructs. That would give us more time to focus on other improvements.

I think we should focus on the core functionality before we start considering extensions, especially when those can be provided by other services. For example, search can be provided by existing search engines, so there’s no need for us to spend time implementing that functionality. I’d rather focus on usability, both of the tool itself and of the generated output.

Clang supports parsing of Doxygen commands and can check their content with -Wdocumentation. Doxygen introduced support for Markdown in version 1.8.0. I think we should extend Clang’s comment lexer and parser to also support parsing Markdown. This would be a great improvement on its own and excellent introduction to the internals of Clang.
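
As a small illustration of the above, here is a sketch of a Doxygen-style comment that mixes Doxygen commands with Markdown. The function `clampValue` and the file name in the build comment are made up for this example. The `\param Low` entry deliberately does not match the parameter `Lo`, which is the kind of mismatch `-Wdocumentation` is meant to catch.

```cpp
// Build with: clang++ -std=c++17 -Wdocumentation -fsyntax-only example.cpp
// (example.cpp is a placeholder file name.)
// The \param name "Low" below intentionally differs from the parameter "Lo",
// so a build with -Wdocumentation is expected to diagnose it.

/// Clamps a value to an inclusive range.
///
/// The Markdown here, such as `inline code` or **bold**, is interpreted by
/// Doxygen 1.8.0 and later; teaching Clang's comment parser to understand it
/// is the extension proposed in this thread.
///
/// \param Value the number to clamp.
/// \param Low the lower bound.
/// \param Hi the upper bound.
/// \returns the clamped value.
int clampValue(int Value, int Lo, int Hi) {
  if (Value < Lo)
    return Lo;
  if (Value > Hi)
    return Hi;
  return Value;
}
```

Renaming `Low` to `Lo` in the comment should make the diagnostic go away.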


Hi, sorry for the late response.
I’m currently working on a draft proposal, but I’m having trouble getting subdoc to generate LLVM documentation, so I’m not really able to compare its output with that of the existing doc generator.

I’m also thinking of setting up a small test website with clang-doc’s documentation output for LLVM, so that we would be able to get feedback from the community.
That said, I’d like to improve the HTML output before we solicit feedback.

What do you think?

@PeterChou1 I’m glad to hear you’re working on the proposal. For the purposes of the proposal I don’t think it’s necessary to completely evaluate all the alternatives. Having some idea about how they stack up and where clang-doc has deficiencies is likely enough for you to draft a high quality proposal.

I really appreciated the details you’ve already included in your analysis and, so far, I think it’s quite thorough.

The documentation website is certainly a convenient way to share results with the community, but I would suggest you focus on the proposal for now.

If you’d like feedback or have questions please feel free to contact either @petrhosek or me.

Hi, just a side note that came up while I was researching.

I came across a quite similar effort within the LLVM project. Symbol Graph is a language-agnostic JSON intermediate format that encodes source-level API information and is used by Apple’s Swift documentation generator. Clang can currently emit Symbol Graph output using the -extract-api option. Originally ExtractAPI only supported C and Objective-C, but C++ support was added last year as a GSoC project.

Since there is considerable overlap between clang-doc’s YAML intermediate output and the ExtractAPI output for C++, I was wondering if we could leverage ExtractAPI and have clang-doc consume its output to produce its own documentation.

The downside is that this would require significantly refactoring how clang-doc currently works, and extending clang-doc with Markdown and Doxygen command support would require changing the Symbol Graph spec.

The upside is that LLVM would no longer have two duplicated efforts to provide an intermediate output for source-level C++ API information.

I’m excited to announce that this project was selected and @PeterChou1 will be working on improving Clang-Doc usability this summer. We’re looking forward to seeing the outcome, and I hope we can deliver improvements to the LLVM reference documentation that will benefit the entire community!


There is also a modern style for Doxygen-generated documentation, Doxygen Awesome. I hope it can serve as an inspiration too.