[RFC] clang-doc templating language

This proposal is to add a templating language to clang-doc, which will form part of my Google Summer of Code project.

Background

Clang-doc is a documentation generator built on top of LibTooling as an alternative to Doxygen. The tool currently emits documentation for several targets, including Markdown and HTML. Structurally, however, the current method of generating HTML, Markdown, and other formats is unergonomic and cumbersome. As an example, here is the code that generates HTML for functions in clang-doc:

Out.emplace_back(std::make_unique<TagNode>(HTMLTag::TAG_H3, I.Name));
// USR is used as id for functions instead of name to disambiguate function
// overloads.
Out.back()->Attributes.emplace_back("id",
                                   llvm::toHex(llvm::toStringRef(I.USR)));


Out.emplace_back(std::make_unique<TagNode>(HTMLTag::TAG_P));
auto &FunctionHeader = Out.back();


std::string Access = getAccessSpelling(I.Access).str();
if (Access != "")
 FunctionHeader->Children.emplace_back(
     std::make_unique<TextNode>(Access + " "));
if (I.ReturnType.Type.Name != "") {
 FunctionHeader->Children.emplace_back(
     genReference(I.ReturnType.Type, ParentInfoDir));
 FunctionHeader->Children.emplace_back(std::make_unique<TextNode>(" "));
}
FunctionHeader->Children.emplace_back(
   std::make_unique<TextNode>(I.Name + "("));


for (const auto &P : I.Params) {
 if (&P != I.Params.begin())
   FunctionHeader->Children.emplace_back(std::make_unique<TextNode>(", "));
 FunctionHeader->Children.emplace_back(genReference(P.Type, ParentInfoDir));
 FunctionHeader->Children.emplace_back(
     std::make_unique<TextNode>(" " + P.Name));
}
FunctionHeader->Children.emplace_back(std::make_unique<TextNode>(")"));


if (I.DefLoc) {
 if (!CDCtx.RepositoryUrl)
   Out.emplace_back(writeFileDefinition(*I.DefLoc));
 else
   Out.emplace_back(writeFileDefinition(
       *I.DefLoc, StringRef{*CDCtx.RepositoryUrl}));
}
std::string Description;
if (!I.Description.empty())
 Out.emplace_back(genHTML(I.Description));

Which generates:

<h2 id="Functions">Functions</h2>
<div>
 <h3 id="8778F1EDB8C49A8CD3BB2031F6D0BD65D54A41AD">Circle</h3>
 <p>public void Circle(double radius)</p>
 <p>Defined at line 3 of file ./src/Circle.cpp</p>
 <div>
   <div>
     <div>
       <div>brief</div>
       <p> Constructs a new Circle object.</p>
     </div>
   </div>
 </div>
 <h3 id="3424D0EF00ECAE4F0118480F02BEC82AFD08355C">area</h3>
 <p>public double area()</p>
 <p>Defined at line 5 of file ./src/Circle.cpp</p>
 <div>
   <div>
     <div>
       <div>brief</div>
       <p> Calculates the area of the circle.</p>
     </div>
     <div>
       <div>return</div>
       <p> double The area of the circle.</p>
     </div>
   </div>
 </div>

This code is problematic because it is hard to tell from reading it what output it produces. Since clang-doc's outputs include HTML and Markdown, it would be much more intuitive to write something that resembles the output itself, which would allow faster iteration on the tool's output.

Instead of generating the output programmatically, the suggestion is to create a dynamically typed templating language, which would give clang-doc a unified way to generate every output format.

Templating Language

This hypothetical templating language would work much like existing templating languages such as Jinja for Python. The sections below sketch how it could replace the above code for generating the HTML output.

Template Rendering

A basic template would be rendered by passing in the template as a std::string together with an llvm::json object that supplies the data. There would also be an option to read the template from a file.

Statements

Statements are wrapped in {% %}. They can express loops, conditionals, or component inclusion.

Loops

Loops work like the example below. Within the scope of the loop, the iteration variable is exposed.

<div>
{% for function in Functions %}
 <h3 id="{{function.USR}}">{{function.Name}}</h3>
 <p>{{function.QualifiedName}}</p>
 <p>Defined at line {{function.Loc.Line}} of file {{function.Loc.File}}</p>
{% endfor %}
</div>

This would output:

<div>
 <h3 id="8778F1EDB8C49A8CD3BB2031F6D0BD65D54A41AD">Circle</h3>
 <p>public void Circle(double radius)</p>
 <p>Defined at line 3 of file ./src/Circle.cpp</p>
 <h3 id="3424D0EF00ECAE4F0118480F02BEC82AFD08355C">area</h3>
 <p>public double area()</p>
 <p>Defined at line 5 of file ./src/Circle.cpp</p>
</div>
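
To make the loop semantics a bit more concrete, here is a minimal C++ sketch of how a single {% for %} block could be expanded. This is only an illustration under simplifying assumptions, not clang-doc code: Record, replaceAll, and expandLoop are hypothetical names, each item is a flat map of strings rather than a JSON object, and nested loops or whitespace variations in the tags are not handled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record type: one flat field-name -> value map per loop item.
using Record = std::map<std::string, std::string>;

// Replaces every occurrence of From in S with To.
std::string replaceAll(std::string S, const std::string &From,
                       const std::string &To) {
  assert(!From.empty() && "an empty search string would never terminate");
  for (std::string::size_type Pos = S.find(From); Pos != std::string::npos;
       Pos = S.find(From, Pos + To.size()))
    S.replace(Pos, From.size(), To);
  return S;
}

// Hypothetical helper: expands the first
//   {% for Var in List %} ... {% endfor %}
// block in Tmpl, rendering the body once per record in Items.
std::string expandLoop(const std::string &Tmpl, const std::string &Var,
                       const std::string &List,
                       const std::vector<Record> &Items) {
  const std::string Open = "{% for " + Var + " in " + List + " %}";
  const std::string Close = "{% endfor %}";

  std::string::size_type Begin = Tmpl.find(Open);
  if (Begin == std::string::npos)
    return Tmpl; // no loop in this template
  std::string::size_type End = Tmpl.find(Close, Begin + Open.size());
  if (End == std::string::npos)
    return Tmpl; // unterminated loop: leave the template untouched

  const std::string Body =
      Tmpl.substr(Begin + Open.size(), End - Begin - Open.size());

  std::string Out = Tmpl.substr(0, Begin);
  for (const Record &Item : Items) {
    std::string Rendered = Body;
    // Substitute {{Var.Field}} for every field of the current item.
    for (const auto &Field : Item)
      Rendered = replaceAll(Rendered, "{{" + Var + "." + Field.first + "}}",
                            Field.second);
    Out += Rendered;
  }
  Out += Tmpl.substr(End + Close.size());
  return Out;
}
```

Calling expandLoop with a template, the names "function" and "Functions", and one record per function emits the loop body once per record, with the text before and after the loop copied through unchanged.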

Variables

Variables are evaluated by wrapping them in {{ }}. Their values would be looked up in the JSON object supplied to the template.
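
As a rough illustration of variable evaluation, the following C++ sketch scans a template for {{ }} tags and substitutes values. It rests on assumptions that are not part of the proposal: trim and renderVariables are hypothetical names, the data is a flat map keyed by the full dotted name (for example "function.Name") instead of an llvm::json object, and unknown variables render as empty text.

```cpp
#include <cassert>
#include <map>
#include <string>

// Strips leading and trailing spaces so "{{ name }}" and "{{name}}" match.
std::string trim(const std::string &S) {
  std::string::size_type First = S.find_first_not_of(' ');
  if (First == std::string::npos)
    return "";
  std::string::size_type Last = S.find_last_not_of(' ');
  return S.substr(First, Last - First + 1);
}

// Hypothetical helper: replaces each {{key}} in Tmpl with Vars[key].
// Keys that are missing from Vars are replaced with nothing.
std::string renderVariables(const std::string &Tmpl,
                            const std::map<std::string, std::string> &Vars) {
  std::string Out;
  std::string::size_type Pos = 0;
  while (true) {
    assert(Pos <= Tmpl.size() && "cursor must stay inside the template");
    std::string::size_type Open = Tmpl.find("{{", Pos);
    std::string::size_type Close =
        Open == std::string::npos ? std::string::npos
                                  : Tmpl.find("}}", Open + 2);
    if (Close == std::string::npos) {
      // No further complete tag: copy the remainder verbatim.
      Out.append(Tmpl, Pos, std::string::npos);
      break;
    }
    Out.append(Tmpl, Pos, Open - Pos); // literal text before the tag
    auto It = Vars.find(trim(Tmpl.substr(Open + 2, Close - Open - 2)));
    if (It != Vars.end())
      Out += It->second;
    Pos = Close + 2; // continue after the closing braces
  }
  return Out;
}
```

A real implementation would resolve a dotted name such as function.Loc.Line by walking nested JSON objects. The flat map is used here only to keep the sketch short.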

Conditionals

Conditional support includes if, else if, and else. The templating language would also support the boolean operators or, and, and not.

Here is one example of it working

<div>
{% if not function.IsPrivate %}
 <h3 id="{{function.USR}}">{{function.Name}}</h3>
 <p>{{function.QualifiedName}}</p>
 <p>Defined at line {{function.Loc.Line}} of file {{function.Loc.File}}</p>
{% endif %}
</div>
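
One possible way to evaluate such conditions is a small recursive-descent evaluator over the or, and, and not keywords. The C++ sketch below is only an assumption about how this could work: Condition and evalCondition are hypothetical names, the flags come from a flat map of booleans rather than a JSON object, parentheses are not supported, and unknown names are treated as false.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical evaluator for conditions such as "not a and b or c".
// Precedence from lowest to highest: or, and, not.
class Condition {
public:
  Condition(const std::string &Expr, const std::map<std::string, bool> &Flags)
      : Flags(Flags) {
    // Tokens are separated by whitespace.
    std::istringstream In(Expr);
    for (std::string Tok; In >> Tok;)
      Tokens.push_back(Tok);
  }

  bool evaluate() {
    Pos = 0;
    bool Result = parseOr();
    assert(Pos <= Tokens.size() && "cursor must stay inside the token list");
    return Result;
  }

private:
  bool peek(const std::string &Word) const {
    return Pos < Tokens.size() && Tokens[Pos] == Word;
  }

  bool parseOr() {
    bool Result = parseAnd();
    while (peek("or")) {
      ++Pos;
      bool Rhs = parseAnd(); // always parsed so the tokens are consumed
      Result = Result || Rhs;
    }
    return Result;
  }

  bool parseAnd() {
    bool Result = parseNot();
    while (peek("and")) {
      ++Pos;
      bool Rhs = parseNot();
      Result = Result && Rhs;
    }
    return Result;
  }

  bool parseNot() {
    if (peek("not")) {
      ++Pos;
      return !parseNot();
    }
    if (Pos >= Tokens.size())
      return false; // malformed expression: treat as false
    auto It = Flags.find(Tokens[Pos++]);
    return It != Flags.end() && It->second;
  }

  std::vector<std::string> Tokens;
  std::map<std::string, bool> Flags; // copied so the evaluator owns its data
  std::size_t Pos = 0;
};

// Convenience wrapper around the evaluator.
bool evalCondition(const std::string &Expr,
                   const std::map<std::string, bool> &Flags) {
  return Condition(Expr, Flags).evaluate();
}
```

With function.IsPrivate set to false, evalCondition("not function.IsPrivate", Flags) is true, so the body of the if block above would be emitted.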

Components

To avoid repeating structures, we define a special keyword, component, which allows us to reuse snippets of templating code. We can optionally specify variables to pass down using the with keyword.

Here is an example

<h2 id="Functions">Functions</h2>
<div>
{% for function in Functions %}
<h3 id="{{function.USR}}">{{function.Name}}</h3>
<p>{{function.QualifiedName}}</p>
<p>Defined at line {{function.Loc.Line}} of file {{function.Loc.File}}</p>
{% component 'child.template' with function.Description as Descriptions %}
{% endfor %}
</div>

child.template

{% for description in Descriptions %}
<div>
   <div>
     <div>
       <div>brief</div>
       <p> Calculates the area of the circle.</p>
     </div>
     <div>
       <div>return</div>
       <p> double The area of the circle.</p>
     </div>
   </div>
</div>
{% endfor %}
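
A minimal sketch of how component inclusion could be resolved is shown below, in C++. It assumes that component templates are already loaded into a map from name to body, and that the name is written in straight single quotes. expandComponents is a hypothetical name, and the "with ... as ..." bindings are skipped here rather than applied.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: replaces each {% component 'name' ... %} tag in Tmpl
// with the body registered under that name. Unknown names expand to nothing.
std::string
expandComponents(const std::string &Tmpl,
                 const std::map<std::string, std::string> &Registry) {
  const std::string Open = "{% component '";
  const std::string::size_type NPos = std::string::npos;

  std::string Out;
  std::string::size_type Pos = 0;
  while (true) {
    assert(Pos <= Tmpl.size() && "cursor must stay inside the template");
    std::string::size_type Begin = Tmpl.find(Open, Pos);
    // The component name ends at the closing quote.
    std::string::size_type NameEnd =
        Begin == NPos ? NPos : Tmpl.find('\'', Begin + Open.size());
    // The tag itself ends at the next "%}".
    std::string::size_type End =
        NameEnd == NPos ? NPos : Tmpl.find("%}", NameEnd);
    if (End == NPos) {
      Out.append(Tmpl, Pos, NPos); // no further complete tag
      break;
    }
    Out.append(Tmpl, Pos, Begin - Pos);
    auto It = Registry.find(
        Tmpl.substr(Begin + Open.size(), NameEnd - Begin - Open.size()));
    if (It != Registry.end())
      Out += It->second;
    Pos = End + 2;
  }
  return Out;
}
```

Running this pass before loops and variables are expanded would let the included body refer to loop variables of the parent template, though the ordering is a design choice that the proposal leaves open.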

Rather than implementing an entire templating engine, which is a significant undertaking, I’m wondering if we could use more of an embedded DSL approach, akin to the JSON support in LLVM?

After a quick search, I found SRombauts/HtmlBuilder on GitHub (a simple C++ HTML generator), which is essentially what I had in mind, but there might be other implementations of this idea we could use as inspiration.
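
To illustrate the embedded DSL idea, here is a small self-contained C++ sketch of a chainable HTML builder. It is not the API of HtmlBuilder or of anything in LLVM: the Element class and its attr, text, child, and str methods are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical embedded-DSL node: builds an HTML element through chained
// calls and renders it to a string.
class Element {
public:
  explicit Element(std::string Tag) : Tag(std::move(Tag)) {
    assert(!this->Tag.empty() && "an element needs a tag name");
  }

  // Adds an attribute; the value is escaped when rendered.
  Element &attr(std::string Name, std::string Value) {
    Attrs.emplace_back(std::move(Name), std::move(Value));
    return *this;
  }

  // Appends escaped text content.
  Element &text(const std::string &T) {
    Parts.push_back(escape(T));
    return *this;
  }

  // Appends a child element, rendered immediately so order is preserved.
  Element &child(const Element &E) {
    Parts.push_back(E.str());
    return *this;
  }

  std::string str() const {
    std::string Out = "<" + Tag;
    for (const auto &A : Attrs)
      Out += " " + A.first + "=\"" + escape(A.second) + "\"";
    Out += ">";
    for (const auto &P : Parts)
      Out += P;
    return Out + "</" + Tag + ">";
  }

private:
  // Escapes the characters that are special in HTML text and attributes.
  static std::string escape(const std::string &S) {
    std::string Out;
    for (char C : S) {
      switch (C) {
      case '&': Out += "&amp;"; break;
      case '<': Out += "&lt;"; break;
      case '>': Out += "&gt;"; break;
      case '"': Out += "&quot;"; break;
      default: Out += C;
      }
    }
    return Out;
  }

  std::string Tag;
  std::vector<std::pair<std::string, std::string>> Attrs;
  std::vector<std::string> Parts;
};
```

With such a builder, the function header from the RFC could be written as Element("div").child(Element("h3").attr("id", "3424").text("area")).child(Element("p").text("public double area()")).str(), which keeps the generation in C++ while reading closer to the HTML it produces.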

There are other places in LLVM where we already generate HTML:

There’s probably some more that I missed.

Whatever solution we end up implementing should ideally support all those use cases and should live in the LLVM Support library.

Did you explore whether it would be possible to use Mustache instead of inventing our own thing?
We would either need to find a small Mustache library compatible with our policies and license (we usually avoid taking on a dependency if we can), or implement it ourselves. I don’t know how much effort that would be.

Mustache seems like an ideal language, and it doesn’t look too difficult to implement. I found this header-only library that implements it in around 1200 lines.

I think we would do well to reimplement support for it in LLVM.