[RFC] Add Template Mustache Language to the Support library

This RFC is a follow up from:

Proposal

There are numerous places in LLVM codebases where we generate HTML

Each place in the code has its own ad hoc way of generating HTML via C++. This can make it hard to change the HTML output, since the application logic is mixed up with the presentation logic.

The idea is to bring an existing template language spec (Mustache) into the LLVM Support library for future and existing tools to use, which makes the output easier to reason about.

The library will implement the Mustache spec, published in the mustache/spec repository on GitHub.
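
For readers who have not used Mustache, here is a minimal sketch of the core idea, written as standalone C++. It is not the API from the PR: `renderVariables` and `trimTag` are hypothetical names, and the sketch handles only plain `{{name}}` tags, with no sections, partials, or escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: strips spaces and tabs around a tag name,
// so "{{ name }}" and "{{name}}" look up the same key.
std::string trimTag(const std::string &S) {
  std::size_t Begin = S.find_first_not_of(" \t");
  if (Begin == std::string::npos)
    return "";
  std::size_t End = S.find_last_not_of(" \t");
  return S.substr(Begin, End - Begin + 1);
}

// Hypothetical renderer (not the proposed llvm/Support API): replaces each
// {{key}} tag with Data[key]. A missing key renders as an empty string.
std::string renderVariables(const std::string &Template,
                            const std::map<std::string, std::string> &Data) {
  std::string Out;
  std::size_t Pos = 0;
  while (Pos < Template.size()) {
    std::size_t Open = Template.find("{{", Pos);
    std::size_t Close = Open == std::string::npos
                            ? std::string::npos
                            : Template.find("}}", Open + 2);
    if (Close == std::string::npos) {
      // No complete tag left: copy the remaining text verbatim.
      Out.append(Template, Pos, std::string::npos);
      break;
    }
    assert(Close >= Open + 2 && "closing delimiter follows the opening one");
    Out.append(Template, Pos, Open - Pos); // literal text before the tag
    auto It = Data.find(trimTag(Template.substr(Open + 2, Close - Open - 2)));
    if (It != Data.end())
      Out += It->second;
    Pos = Close + 2; // continue after "}}"
  }
  return Out;
}
```

With the template `<h1>{{ name }}</h1>` and `name` set to `clang-doc`, this yields `<h1>clang-doc</h1>`. Changing the markup then means editing the template text, not the C++ that gathers the data.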

Implementation

Currently the library and its tests are implemented in the PR below, which covers most of the Mustache spec, with some exceptions

This library evolved out of my GSoC project to improve clang-doc (a new documentation generator for C++). I was able to cut the HTML backend by around 500 lines and greatly simplify the HTML generation.
Also, because the Mustache presentation files are kept separate, you can change the HTML output without recompiling the tool.

To see the library in use, see this PR:

Can you elaborate how you picked Mustache out of everything else in this domain?

Not sure this is a stable enough project to be widely used by LLVM…

Last commit was 2 years ago and the rationale behind this library is:

I like writing Ruby. I like writing HTML. I like writing JavaScript.
I don’t like writing ERB, Haml, Liquid, Django Templates, putting Ruby in my HTML, or putting JavaScript in my HTML.

Does not inspire confidence. Can you make a list of all options, pros and cons? This needs to be a wider decision, I think.

I’m not saying the current method is good, just that if we’re going to change and refactor everything, we need to pick a more future-proof solution.

Well, the library itself is sort of irrelevant as the author is providing an implementation.

There is a specification of the syntax (mustache/spec on GitHub), which beats inventing our own.
And because it has been widely ported and used, there are documentation and tutorials we can point people towards (a quick Google search gives https://www.baeldung.com/mustache as an example).
See also: Mustache (template system) on Wikipedia.

I’m not sure what this means. The LLVM community is not the best group to maintain such code base long term, in case the author drops support. We had this exact problem with Phabricator and we’d like to avoid repeating the same mistake.

I don’t think comparing any library to “inventing our own” is relevant. We should not invent our own in any case. It’s just a matter of which library out there is most stable and has the highest likelihood of being maintained long term.

“Widely used” compared to what? Is this the “most widely used”? Does it have the largest user base? To me it looks abandoned, seeing that the last commit to main was 2 years ago…

Remember, this is a project to document LLVM stuff, not a web project in its own right with the freedom to pick any library that looks nice. Long-term maintenance, release stability, and a large developer and user base are more important factors than supported features, the languages it works on, etc.

LLVM developers are not (usually) web people. We struggled to maintain Phabricator, mailing lists, and web pages, and our LNT benchmark page is so slow that it’s unusable. Our choice of Discourse/Discord is mainly because we don’t want to have to maintain anything.

In that sense, anything that works with GitHub Pages has a higher ROI than some home-brewed infrastructure, however nice the latter is.

Are you aware of any library that we could use? I am certainly not (as you say, we would want something maintained, self-contained, with the right license, fairly small, in C++, etc.).
In the absence of such a library, the next best thing is to use an existing language, which is what this proposes. And again, that language needs to be fairly simple to reduce maintenance cost.

I’m one of @PeterChou1’s mentors for GSoC, along with @petrhosek. I think one of the big benefits of using Mustache, and one of the reasons I assume @cor3ntin suggested it in the previous RFC, is that it’s very simple.

The implementation of the new library is about 800 lines of C++, and I think we can bring that down a bit more. The rest of the patch is mostly a set of conformance tests based on the spec.
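
As a rough illustration of what such conformance tests look like: each case in the spec pairs a template and input data with the expected output. The sketch below assumes a simplified case format with flat string data (the real spec cases carry structured data), and `SpecCase`, `Renderer`, and `runSpecCases` are hypothetical names, not taken from the patch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Simplifying assumption: input data is a flat string-to-string map.
using Data = std::map<std::string, std::string>;
using Renderer =
    std::function<std::string(const std::string &, const Data &)>;

// One conformance case: render Template with Input and compare to Expected.
struct SpecCase {
  std::string Name;
  std::string Template;
  Data Input;
  std::string Expected;
};

// Runs every case through Render and returns the names of the failing cases.
// An empty result means the renderer conforms on this set of cases.
std::vector<std::string> runSpecCases(const std::vector<SpecCase> &Cases,
                                      const Renderer &Render) {
  assert(Render && "a renderer must be supplied");
  std::vector<std::string> Failures;
  for (const SpecCase &C : Cases)
    if (Render(C.Template, C.Input) != C.Expected)
      Failures.push_back(C.Name);
  return Failures;
}
```

Because the cases are plain data, the same list can be run against any implementation, which is what makes an externally maintained test suite useful for an in-tree library.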

@PeterChou1 also has a prototype CLI tool in a separate patch. Right now, it’s primarily used for testing, by running it against the external test suite.

The other thing is that it looks like we can drop a significant amount of C++ code from clang-doc once this is in place. Our rough plan is to use the templates for both HTML and Markdown, if we can, which should allow us to drop a lot of boilerplate code that’s been difficult to reuse between the various clang-doc backends. That’s more a direct result of using any templating language, rather than mustache specifically, though. But I think it may be much harder to maintain a library that needs to implement a more complex template specification. @PeterChou1 can probably offer more insight on that, since he’s the one that did the actual work.

For the other places that we generate HTML, is there a need for something more complicated than mustache?

Who is supposed to maintain this code long term?

I think there’s been a misunderstanding about how the library is intended to be used w/in LLVM. I’ll try to clarify some of the places I think are off base.

The library implements the spec. It allows us to avoid any dependencies. It just fills in the template, per the public specification (which very seldom changes).

In clang-doc we generate mostly static pages, and the library fills those in for us. We’re not taking a dependency on an external JavaScript library, or requiring anything to run server side. We use the new Mustache library when we generate the documentation, and that’s it. If the spec changes, it will be up to us to decide if we up-rev or not.

I don’t know if I follow this, it would be our library, and solely live in llvm/Support. If another part of the project wants to take a dep on a web library, that’s a different discussion and proposal altogether.

First, it’s hard to quantify how “popular” any kind of technology really is, but anecdotally, any time I’ve read something about templating languages Mustache shows up. A quick google search for “mustache template” shows lots of relevant hits, for whatever that may be worth.

This focus on the spec repository and its last update is irrelevant. Mustache is intentionally simple by design, and the spec seldom changes. The implementation of the library would be entirely in LLVM as C++ code. It would have no external dependencies whatsoever (aside from perhaps wanting to run the conformance tests from the spec).

This is true, but templating is just very simple parsing and textual replacement. We should be pretty OK w/ simple parsers and string concatenation. :)
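
To give a sense of how simple the parsing is, here is a sketch of a tokenizer that splits a template into literal text and tags, classifying each tag by its leading sigil. It is an illustration only, not the code from the patch: `Token`, `TokenKind`, `classify`, and `tokenize` are hypothetical names, and it ignores triple mustaches, `{{& name}}`, and custom delimiters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class TokenKind {
  Text, Variable, SectionOpen, InvertedOpen, SectionClose, Comment, Partial
};

struct Token {
  TokenKind Kind;
  std::string Body; // literal text, or the tag name without its sigil
};

// Maps the first character of a tag to its kind; anything else is a variable.
TokenKind classify(char Sigil) {
  switch (Sigil) {
  case '#': return TokenKind::SectionOpen;
  case '^': return TokenKind::InvertedOpen;
  case '/': return TokenKind::SectionClose;
  case '!': return TokenKind::Comment;
  case '>': return TokenKind::Partial;
  default:  return TokenKind::Variable;
  }
}

std::vector<Token> tokenize(const std::string &Template) {
  std::vector<Token> Tokens;
  std::size_t Pos = 0;
  while (Pos < Template.size()) {
    std::size_t Open = Template.find("{{", Pos);
    std::size_t Close = Open == std::string::npos
                            ? std::string::npos
                            : Template.find("}}", Open + 2);
    if (Close == std::string::npos) {
      // No complete tag left: the remainder is literal text.
      Tokens.push_back({TokenKind::Text, Template.substr(Pos)});
      break;
    }
    assert(Close >= Open + 2 && "closing delimiter follows the opening one");
    if (Open > Pos)
      Tokens.push_back({TokenKind::Text, Template.substr(Pos, Open - Pos)});
    std::string Body = Template.substr(Open + 2, Close - Open - 2);
    TokenKind Kind = Body.empty() ? TokenKind::Variable : classify(Body[0]);
    if (Kind != TokenKind::Variable)
      Body.erase(0, 1); // drop the sigil
    // Trim spaces and tabs around the tag name.
    std::size_t First = Body.find_first_not_of(" \t");
    std::size_t Last = Body.find_last_not_of(" \t");
    Body = First == std::string::npos
               ? std::string()
               : Body.substr(First, Last - First + 1);
    Tokens.push_back({Kind, Body});
    Pos = Close + 2; // continue after "}}"
  }
  return Tokens;
}
```

Rendering is then a walk over this token list with a lookup per tag, which is the "textual replacement" half of the job.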

We already have projects that are generating static HTML in-tree. Typically these are developer facing tools, like coverage reports, or the output of static analysis. The library in question is for simplifying those use-cases, and not focused on impacting our existing web infrastructure (like llvm.org). Compatibility w/ GitHub pages isn’t viable for a documentation generation tool, like clang-doc, and isn’t useful for the other in-tree use cases either, from what I can tell.

Does this help clarify the situation, and assuage your concerns @rengolin?

Are we forking the project into LLVM? What will happen with the original project?

I’d say we shouldn’t have in the first place, but that’s a separate discussion.

My point is that HTML technologies come and go and they have nothing to do with LLVM. This whole thing needs to go outside of LLVM and be generated elsewhere, which can change with the times much more often (and with much lower cost) than what we’re having now.

Coverage reports should be done by a separate tool (Coverity?). Clang/LLVM reports shouldn’t include HTML in the first place, at least not generated natively; they should be converted into HTML by an external process.

Whether that external process is Mustache-based or not is irrelevant, as long as we dissociate the HTML generation from the data generation.

It may be just me, but having an HTML generation library anywhere in the LLVM monorepo makes no sense at all.

It’s worth noting, I’m not saying this work shouldn’t be done. On the contrary, this is important work that should be done extensively across the repository. Just not by adding HTML tools to LLVM, but by using external tools to convert data to reports outside LLVM.

(Edited to add: This is my personal opinion. If I’m the only one, then feel free to ignore me).

No, it isn’t a fork. The PR above is an independent implementation based on the specification. There are already other libraries that implement the spec for a variety of languages (C++, Java, JavaScript, etc.). Having our own copy is nice in the sense that it avoids external dependencies, licensing concerns, etc. One more independent implementation in LLVM doesn’t seem important to me one way or the other w.r.t. the original project.

I see, so we’re relying on the fact that this specification is stable and will remain so for the foreseeable future. If they decide to change direction, improve the spec, etc., we’d be responsible for supporting the new stuff or selectively ignoring it. It’s just like C compilers in the 80s: the fragmentation was daunting, and we moved on to upstream compilers for that reason.

That’s not a good argument. We can say the same about all external libraries we rely on, and I don’t think we should start re-implementing everything in-house. This is the key issue here.

We moved away from in-house tools because we are not good at keeping up with them, and here, we’re repeating the same mistakes. @tstellar

Hi Renato,

Thank you for your feedback.

Regarding your concern about the stability of the Mustache specification, it’s intentionally minimalistic and designed to be logic-less. Its simplicity has contributed to its stability over time, with the specification remaining largely unchanged.
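
To illustrate what “logic-less” means in practice: the only control flow a template can express is a section, which repeats its body once per list item, and an inverted section, which renders only when the list is empty or missing. The sketch below shows that behaviour under simplifying assumptions (flat string data, no whitespace inside tags, no nested sections of the same name, no parent-context lookup, no escaping). The names `Vars`, `Lists`, and `render` are hypothetical and do not come from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Vars = std::map<std::string, std::string>;
using Lists = std::map<std::string, std::vector<Vars>>;

// Renders {{var}}, {{#list}}...{{/list}} and {{^list}}...{{/list}}.
std::string render(const std::string &T, const Vars &V,
                   const Lists &L = Lists()) {
  std::string Out;
  std::size_t Pos = 0;
  while (Pos < T.size()) {
    std::size_t Open = T.find("{{", Pos);
    std::size_t Close = Open == std::string::npos
                            ? std::string::npos
                            : T.find("}}", Open + 2);
    if (Close == std::string::npos) {
      Out.append(T, Pos, std::string::npos); // no complete tag left
      break;
    }
    assert(Close >= Open + 2 && "closing delimiter follows the opening one");
    Out.append(T, Pos, Open - Pos);
    std::string Tag = T.substr(Open + 2, Close - Open - 2);
    Pos = Close + 2;
    if (Tag.empty() || (Tag[0] != '#' && Tag[0] != '^')) {
      // Plain variable: missing keys render as nothing.
      auto It = V.find(Tag);
      if (It != V.end())
        Out += It->second;
      continue;
    }
    std::string Name = Tag.substr(1);
    std::string EndTag = "{{/" + Name + "}}";
    std::size_t End = T.find(EndTag, Pos);
    if (End == std::string::npos)
      break; // unclosed section: a real parser would report an error
    std::string Body = T.substr(Pos, End - Pos);
    auto It = L.find(Name);
    bool IsEmpty = It == L.end() || It->second.empty();
    if (Tag[0] == '#' && !IsEmpty) {
      for (const Vars &Item : It->second)
        Out += render(Body, Item); // body once per item, item as context
    } else if (Tag[0] == '^' && IsEmpty) {
      Out += render(Body, V, L); // inverted: body once, same context
    }
    Pos = End + EndTag.size();
  }
  return Out;
}
```

There are no conditions, expressions, or loops beyond this, so all decisions about what data exists stay in the C++ side.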

As for the risk of fragmentation, our implementation of the Mustache library is intended for internal use within LLVM tools like clang-doc and other utilities that generate HTML or Markdown. We’re not developing a general-purpose templating engine for widespread adoption outside of LLVM, so I believe the risk is minimal.

Additionally, the entire library is only about 600 lines of code, which keeps the maintenance burden small and future support straightforward. The Mustache library isn’t limited to generating HTML; it can be used anywhere in the codebase where we need to generate formatted output, allowing for a clear separation between application logic and presentation logic.

This is a good idea, IMO. We do weird ad-hoc generation of HTML in way too many places as you’ve seen. So some sort of llvm-standard way of generating this is a great idea.

Language Choice: I think Mustache is about as good and lightweight as we’re going to get here. The spec is REALLY stable (it hasn’t meaningfully changed in a VERY long time), and so any concerns about them meaningfully changing the spec are ill-founded. I suspect this will be like most of ADT/Support: Someone comes along and adds an improvement/fix every once in a while, and for the rest of the time it just stays stable.

Implement/external dependency: The Mustache spec is REALLY easy to implement. As others have said, ~800 lines seems about right; the Boostache implementation ended up being ~2000 lines of code, but that included all the necessary Boost-isms. It’s also shockingly easy to maintain: you don’t even have to know anything about HTML other than “tags look like <whatever>” and get closed with “</whatever>”, so it is a good candidate for implementation.

Pulling in an external library is a large hassle that is likely to result in greater amounts of churn, rather than the brief churn from getting this to a ‘fully maintained’ state. I’m not sure we want to sign ourselves up for dealing with a dependency that might change outside of our codebase, or the absurdity of version changes (à la Sphinx) that we could get stuck with. AND, finding one with a license that is acceptable to the LLVM community is likely more effort than just doing/maintaining an implementation.

SO: I’m very much in favor of the current proposal. I could POSSIBLY wish for some sort of analysis/etc. on re-licensing boost-mustache, but from my 10-minute analysis while writing this, it seems like pulling out the rest of the ‘boost-isms’ would be more work than the implementation above was.

I think it’s worth pointing out that Mustache does HTML-escaping by default: “All variables are HTML escaped by default.”
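
For reference, a minimal sketch of what that default escaping amounts to. The function name `escapeHtml` is hypothetical (it is not taken from the patch); the four characters handled are the ones the spec’s interpolation tests expect to be escaped. In a template, `{{name}}` would go through a function like this, while `{{{name}}}` or `{{& name}}` would bypass it.

```cpp
#include <cassert>
#include <string>

// Escapes the characters that are unsafe in HTML text and attribute values.
std::string escapeHtml(const std::string &In) {
  std::string Out;
  Out.reserve(In.size());
  for (char C : In) {
    switch (C) {
    case '&': Out += "&amp;"; break;
    case '<': Out += "&lt;"; break;
    case '>': Out += "&gt;"; break;
    case '"': Out += "&quot;"; break;
    default:  Out += C; break;
    }
  }
  assert(Out.size() >= In.size() && "escaping never shrinks the text");
  return Out;
}
```

The practical benefit is that a symbol name such as `operator<` or `vector<int>` taken from source code cannot break the generated page unless the template author explicitly opts out of escaping.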

That’s a worthwhile improvement. If it is as small as advertised, it seems worth bundling.

I think this patch is ready to land. @ilovepi, @cor3ntin, @Endill, @erichkeane,
let me know what you think.

This needs Aaron to judge consensus/approval of the RFC before we can accept the patch.