RFC: Improving Clang’s Diagnostics

cjdb · May 17, 2022, 9:32pm

Abstract

Clang diagnostics are alright. That is: they get the job done more often than not, but leave a lot
of room for improvement, particularly for non-experts. This document details how one could improve
Clang so that its diagnostics cater to a much broader audience of programmers.

Summary

We propose to change the way in which diagnostics are presented to users, so that they have a better user experience:

Add a new Clang diagnostic engine that emits machine-parsable Static Analysis Results Interchange Format (SARIF) in lieu of unstructured text.
Pave a way for new, clearer, diagnostic messages to be added.
Tease tooling that can consume machine-parsable diagnostics for a better user experience.

Motivation

A Clang diagnostic is usually just a phrase that matter-of-factly states a problem in the code as output to a console. Clang’s current philosophy on diagnostics is to value terseness and impersonal language. The claim is that keeping diagnostics under 80 columns in length avoids terminal wrapping and forces one to think about the important point. The first premise is observably false: the author of this document routinely has short diagnostics wrap when fonts are extra large. As we’ll explore in this section, the second goal isn’t necessarily achieved; moreover, in the form Clang delivers them now, terse and impersonal diagnostics may work well for C and C++ experts, but are often lacking for less adept users of these languages.

In their WG21 paper P2429 ‘Concepts Error Messages for Humans’, Sy Brand cites various tweets poking fun at compiler diagnostics as a part of the paper’s motivation. Two more examples include tweets by @Red_shirt_no2 (and the reply about crying) and @miniciv. Brand then goes on to cite the annual ISOCPP developer surveys, noting that compiler diagnostics are a top complaint from average C++ programmers. Brand’s paper details a lot of advantages about both the usefulness of compiler diagnostics and how friendly diagnostics improve developer experience. It also contrasts C++ diagnostics against diagnostics from various modern programming languages, which is something we’ll explore as well. In short, their paper is an excellent corequisite for this document, and we’ll be citing it a fair bit.

Because the ISOCPP survey results aren’t particularly nuanced, we conducted a dedicated public survey asking users both how satisfied they are with the status quo and what they’d like to see. According to Twitter analytics, over 15’760 people have seen the survey, although only 110 have responded. We learnt that the vast majority of respondents are reasonably satisfied with Clang’s diagnostics when compared to other C++ compilers (mean=3.9, median=4). There does not appear to be consensus when comparing Clang’s C++ diagnostics to other programming languages, although slightly more people have expressed dissatisfaction than satisfaction (mean=3, median=3). Roughly two thirds said that it takes them minutes to understand Clang diagnostics, with the remainder being almost equally divided between seconds (16.7%) and hours (18.5%). About half stated that they “never” find themselves spending an hour or more trying to understand a diagnostic that, presented differently, may have been solvable in a matter of moments. The others are clustered into 1-2/week (32.4%), 2-4/week (13%), and sparingly few said 4–8 times or 9+ times a week.

We are still working through the feedback for the free-form question:

“When thinking about how other software presents errors, what do you appreciate most?”

This was answered by around half of the participants. From the discussed data, it appears that most participants are happy with Clang diagnostics when Clang is compared against its peers (GCC and MSVC), but many have flagged that their productivity could be improved in some way if we invest into new ways to communicate with developers.

We also asked respondents to rank the three areas of diagnostics they feel should be prioritised when triaging improvements. The list of choices was overload sets, implicit conversions, undeclared symbols, templates, and concepts. Although we can very clearly say that templates have an overwhelming majority for first place, it isn’t currently clear what we should be prioritising second and third. To assess this, we’ll count the results using single-transferable vote with the weighted inclusive Gregory counting method, to ensure that we fairly triage all of the issues.

Diagnostics are documentation

We posit that compiler diagnostics are a form of documentation. Wikipedia defines documentation as “any communicable material that is used to describe, explain, or instruct regarding some attributes of an object, system, or procedure, such as its parts, assembly, installation, maintenance, and use”. Compiler diagnostics are the way in which the tool communicates with the human, describing why the source cannot be translated to a target. Clang diagnostics also sometimes explain and instruct how to fix invalid source code. It’s non-traditional documentation, but it is still a form of documentation nevertheless.

In their conference talk Documentation in the Era of Concepts and Ranges, Sy Brand and Christopher Di Bella identify and discuss six fundamental attributes of good documentation. They claim that “good documentation” is fit-for-context, clear, digestible, complete, accessible, and up-to-date. We will not be focusing on up-to-dateness in this section, as that’s an issue of maintenance, but we will discuss it in the legacy section later on. Most of these attributes centre around documentation authors knowing their audience, using tools to maximise quality, measuring the quality of one’s documentation on several axes, and ensuring that documentation tools aren’t misused. We summarise their points to spare readers needing to watch an hour-long video mostly focussed on library documentation.

Fit-for-context

Documentation that’s fit-for-purpose is inextricably linked to knowing one’s audience. In the context of Clang, that means that we need to consider whether we are writing diagnostics that reflect the perspective of the user or the perspective of the compiler. These are two very different audiences: most users are not experts in the core language, and that means the way in which we communicate to them needs to be very different than the way in which we communicate with compiler developers (who are the people writing these diagnostics). We also hypothesise that hard-to-understand diagnostics discourage newcomers. Because there are far more non-expert users than compiler developers, we should be prioritising their needs ahead of our own.

Furthermore, humans are not the only audience of compiler diagnostics: scripts can and do listen to compiler diagnostics and act upon them. When analysing the status quo, we need to consider that the audience of a popular general-purpose programming language such as C++ is absurdly vast, and that there are many categories of “audience” in the mix, and act appropriately.

Based on the commentary from our survey, it seems that Clang very much is not framing its diagnostics from the perspective of a user. Although there were many comments that implied this, two responses to the question ‘When thinking about how other software presents errors, what do you appreciate most?’ jumped out:

“Tells what I did wrong and how to fix it rather than speaking in standardese”

“Error diagnostics feel like they’re designed with users in mind, whereas with cpp it feels like the main goal is just to print the state of the compiler when it gave up”.

Clarity

Clear documentation is documentation that communicates a message to the user that answers their questions, and doesn’t cause confusion or leave them with more questions than they had at the beginning. An outcome of clear documentation is that users should be able to at least recall and describe the communication from the author. In the case of Clang, a clear diagnostic would be one that results in the user being able to explain why their source is ill-formed, and hopefully identify a reasonable resolution.

Clang diagnostics for simple things can be good enough. For example, when one forgets the closing brace on a function definition, they’re told that a closing brace is expected to match a specific opening brace. In this particular example, the Clang diagnostic is slightly better than the GCC one, because it points out the location of the lonely brace in addition to where the closing one should be. Not all Clang diagnostics for simple code are top-tier though: GCC is far clearer when it comes to keeping track of standard library names with missing headers.

An intermediate example might be operators with mismatched operands. The output here is likely functional for most programmers, but folks who are learning C++ in the era of consistent comparison operations may struggle to understand why 0 == x works, while 0 + x does not. It doesn’t help that the diagnostic draws attention to the fact that there’s “no known conversion from ‘int’ to ‘s’ for 1st argument”. Keeping with the ‘flat’ and ‘terse’ tone for now, a more informative message might look something like:

error: don't know how to add an 'int' and an 's'
  15 |    0 + x;
     |    ~ ^ ~
     |    |   |
     |    int s
note: 'operator+(int, s)' was not found, tried to convert parameters for known operator+ candidates
note: 's operator+(s, int)' is not viable because 'int' is not convertible to 's' for argument 1

This is an immediate improvement because it clearly states the problem from a human’s perspective as opposed to a compiler’s perspective: the problem according to the programmer is likely to be that 0 + x isn’t working, not that there’s no viable operator+. It then immediately follows on with context about why the operands are invalid instead of just listing all the possible alternatives out of context. Unfortunately, because it uses the same structure as the current design, it’s restricted in its ability to help the person receiving the diagnosis. Richard Smith also notes that printing out the full signature as above can create readability problems for larger signatures.

Responses from our survey strongly suggest that Clang diagnostics, while often satisfying with respect to other C++ compilers, can be rather opaque and lead to a lot of user frustration depending on the error. The survey quotes from the previous section are relevant here too. Other relevant commentary includes:

“Clang is very, very bad at telling you why overload resolution failed, which clashes with modern library design that leans on overload resolution to try to provide a good user experience…”

“Suggestions for fixes. Explanation for why it’s wrong”

“Concisely telling me where the error is, why it’s wrong and how to fix it”.

Digestibility

While “clear” documentation focuses on being able to understand the content of a message, digestibility concerns itself with the structure and presentation of that message. Examples of facilitating digestibility can include breaking text into paragraphs, using different font style (e.g. bold, italic, etc.), using different font faces (e.g. sans-serif font for descriptions, code font for snippets, etc.), using punctuation, putting code that can’t fit into prose text into its own block, and using headings to section different topics.

Again, responses indicate that Clang can do a lot of work to improve digestibility. Colour and improved formatting seemed to have a fair number of responses, and collapsible trees were mentioned several times, as was visualisation. The suggested “interactive webpage” option to ‘How would you prefer the information is presented to you?’ holds around 40% of the vote (people could select multiple options). Multiple people flagged the presentation of template backtraces and candidates for overload resolution as a source of frustration, which is probably unsurprising to the reader.

Completeness

Completeness is about ensuring that all the relevant information has been made available. Several people complained that Clang doesn’t provide enough context in certain circumstances, so we suspect that Clang diagnostics aren’t complete.

Accessibility

Accessibility concerns itself with ensuring that information is easily available for everyone. Some examples include making sure that documentation can be read by screen readers, is scalable in size, is translated into other natural languages, and uses inclusive language.

Clang aims to be accessible by providing diagnostics with a way to be translated to other natural languages and by limiting lines to eighty columns where possible. Because Clang currently emits diagnostics with a terminal in mind, there’s not much more that it can do.

Aaron Ballman points out that LLVM documentation is reasonably accessible for the in-progress release, and that older versions of LLVM (e.g. LLVM 14!) become difficult to access. This isn’t a huge concern for this effort, but it would be nice if LLVM had ways to find version-specific information more easily.

Up-to-date

Documentation that’s up-to-date isn’t stale, and ensures that there’s no link rot. Documentation that is outdated is worse than useless. This is a maintenance issue. No complaints about diagnostics being out-of-date were observed and there aren’t any that we can recall either.

Improving the status quo

Instead of doing the traditional thing of writing diagnostics to the console as unstructured text, we could instead emit structured output, which will make the diagnostics machine-parsable, substantially more powerful, and open the door for tooling to make diagnostics extensible. Clang is already incredibly robust thanks to its plug-and-play AST, and we can improve this by emitting machine-parsable diagnostics (making them plug-and-play too). At the time of writing, we consider there to be three main ways to present diagnostics to the user: to the console (structure-independent), to an IDE, and as HTML (see below for a mock). These do not need to be the only ways to communicate diagnostics, and motivated users can take advantage of the structured output to write their own fantastic communication medium. Many respondents also requested that we provide some form of structured output.

We break console diagnostics into two modes: the first is the incumbent unstructured diagnostic model we currently use, and the second is structured output that can be either written to console or to file. This allows people who have Hyrum’s Law’d themselves into a situation to not have to change things overnight (or ever), and also allows for structured output to be available as raw text, which will be useful for machine parsing and a change of pace for readers. IDEs can consume this structured output to produce more intuitive diagnostics for their users based on the IDE structure. We drive the web-based approach below using the mock.

Structured output can be represented in multiple ways: we could use XML, JSON, YAML, or something domain-specific. We’ve chosen JSON for two main reasons. The first is that JSON is a way to structure data and has a limited set of interpretable types; the vcpkg team from Microsoft expands on this a fair bit. The second is that using JSON allows Clang to emit the Static Analysis Results Interchange Format (SARIF), which is an open standard based on JSON, to more easily communicate with other tools that can present the information in different ways for humans. The Clang static analyser already uses SARIF, so we may be able to leverage existing code into the mainstream compiler. Although SARIF is a good starting point, we may need to extend it to facilitate the described design. For example, it isn’t currently clear whether SARIF can represent a digestible context or links to documentation. Any extensions that we make in Clang should be proposed to SARIF so that there exists a canonical way for tooling to use these facilities (it’s conceivable that GCC or MSVC may attempt to follow suit once Clang has broken the mould).

A web-based approach

Below is a mock of an entirely redesigned approach to presenting errors to users outside of an IDE. It is entirely hypothetical and represents neither a finished product nor the only solution.

There are several advantages to this approach. The first, and hopefully the most noticeable point is that everything has been grouped. We can only see the diagnostics for a source file when it has been selected. Diving a little deeper into this, we see that individual diagnostics and their components are also collapsible, introducing progressive disclosure. This means that relevant information becomes available only as the user requests it. This also offers the opportunity for us to coalesce repeated diagnostics. Condensing repeated diagnostics reduces the amount of noise for identical mistakes in source at different locations.

As with its predecessor, the problem is stated up-front in a user-digestible fashion. It then describes the problem in a way that feels as if a peer is collaborating with them, rather than a machine talking at them. Brand, who cites ‘How a computer should talk to people’ and ‘Compilers as Assistants’, emphasises that how a computer program’s message is worded affects how it is received by the human programmer. This is the motivation for a conversational tone in the reason section, as opposed to the formal tone that Clang currently employs. It also adopts the use of “we” and present tense like Flow, as noted in Brand’s section on other languages. There have traditionally been concerns about internationalisation in this situation: we should strive to make the user feel as if the implementer is talking with the programmer, rather than the implementation to maximise user experience, even if this means that we need to put more effort into ensuring that translations are just as good (i.e. we should not compromise internationalisation for the sake of English speakers’ UX, nor should we forsake anyone’s UX for the sake of easy internationalisation).

A list of resources is presented to the user: in this case, the user has opted to see both the cppreference documentation and N4860 (note they’re both hosted on llvm.org), for demonstration purposes. This is derived from both the Elm and Rust compilers, which provide messages that allow users to research more, if necessary. C++ is a language with a lot of sharp edges, and error messages aren’t always enough to diagnose a problem in obscure corner cases. For example, int const volatile& x = static_cast<int&&>(0); is ill-formed, but it’s fairly obscure as to why, given that int const& x = static_cast<int&&>(0); is allowed, and int const volatile& should be a “superset” of int const&, right? This sadly isn’t the case, and the compiler’s diagnostic doesn’t make that apparent, so links to dcl.init.ref#5.1.1 and dcl.init.ref#5.2 may be useful (this is an advanced thing to be doing, so links to standardese would likely be involved). We’ll discuss the feasibility of such an idea in the design section. Only a handful of respondents to our survey said that providing links to cppreference would impede their work, and an overwhelming majority said that it would be at least somewhat helpful. Far fewer people agreed that providing the standardese would always be helpful, but it still appears to have popular support (albeit ranked lower in priority with respect to cppreference).
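The reference-binding corner case above can be checked without writing ill-formed declarations. The sketch below uses std::is_convertible_v from the standard library; the variable template binds_to_int_xvalue is a hypothetical name introduced only for this illustration.

```cpp
#include <type_traits>

// True when a reference of type Ref can be initialised from an xvalue of
// type int, which is what static_cast<int&&>(0) produces.
template <class Ref>
inline constexpr bool binds_to_int_xvalue = std::is_convertible_v<int&&, Ref>;

// int const& x = static_cast<int&&>(0);           // allowed
static_assert(binds_to_int_xvalue<int const&>);

// int const volatile& x = static_cast<int&&>(0);  // ill-formed
static_assert(!binds_to_int_xvalue<int const volatile&>);
```

Adding volatile to the referenced type is enough to flip the answer, which is the sort of surprise that a link into dcl.init.ref would help a user investigate.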

Finally, context central to the diagnostics is presented in a more readable fashion. Providing context is always critical, but the way in which it is presented has received criticism from participants of our survey. A problem with the status quo is that Clang produces unstructured text. What we collected from our survey suggests that people would appreciate some form of structured data, although how that data is to be presented has far, far less consensus.

C++ is a complex language with many ways for programmers to encounter diagnostics. The design below outlines a new way to facilitate diagnostic presentation, but it does not make recommendations for replacing specific diagnostics.

Design

Adopting SARIF for structured output

We need to consider how diagnostics are output in order to provide users with the information they need to perform their duties in the most understandable format. How we choose to internally represent this has a lot of implications for user experience too. Unstructured text is great for humans using Clang on the command line (or somewhere that mirrors a command line), because it’s convenient for simple diagnostics such as a missing semicolon or quote. By appropriating the SARIF components of Clang’s static analyser into the Clang compiler, we can buy into a standard form of communication at a relatively low cost.
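As a rough illustration of what a SARIF result looks like, the sketch below serialises one diagnostic into a minimal SARIF 2.1.0 log. The simple_diagnostic type and both functions are hypothetical and are not taken from Clang or its static analyser; the property names (version, runs, tool.driver.name, results, level, message.text, locations, physicalLocation, artifactLocation.uri, region.startLine, region.startColumn) come from the SARIF format itself.

```cpp
#include <string>

// Hypothetical, minimal description of one diagnostic; not a Clang type.
struct simple_diagnostic {
    std::string level;     // SARIF levels include "error", "warning" and "note"
    std::string message;
    std::string file_uri;
    int line;
    int column;
};

// Escapes the two characters that would otherwise end or corrupt a JSON
// string literal. A real emitter would also handle control characters.
inline std::string escape_json(const std::string& text) {
    std::string out;
    for (char c : text) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Builds a single-run, single-result SARIF log as compact JSON text.
inline std::string to_sarif(const simple_diagnostic& d) {
    return std::string("{\"version\":\"2.1.0\",\"runs\":[{")
         + "\"tool\":{\"driver\":{\"name\":\"clang\"}},"
         + "\"results\":[{"
         + "\"level\":\"" + escape_json(d.level) + "\","
         + "\"message\":{\"text\":\"" + escape_json(d.message) + "\"},"
         + "\"locations\":[{\"physicalLocation\":{"
         + "\"artifactLocation\":{\"uri\":\"" + escape_json(d.file_uri) + "\"},"
         + "\"region\":{\"startLine\":" + std::to_string(d.line)
         + ",\"startColumn\":" + std::to_string(d.column) + "}}}]"
         + "}]}]}";
}
```

Calling to_sarif(simple_diagnostic{"error", "invalid operands to binary expression ('int' and 's')", "source.cpp", 15, 5}) yields one JSON object that any SARIF-aware viewer could, in principle, render however it likes.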

Due to the existing model being baked into C++ programming as we know it, the proposed design won’t replace the existing diagnostic engine, but will instead be an opt-in replacement (e.g. -femit-diagnostics-as=sarif -femit-diagnostics-to=/tmp/diagnostics). Once it is properly mature and possibly widely adopted, we might consider deprecating today’s status quo as the default and then swapping the defaults after a few more releases, but it will take a lot of time to permanently remove the existing engine. The best way for us to fast-track returning to a single diagnostic engine is to create a new tool that consumes SARIF as input and emits the current diagnostics as output, but we are very far away from completing that at present.

As suggested above, we propose two flags: one to indicate how diagnostics are emitted (e.g. SARIF, unstructured, etc.), and one to indicate where diagnostics are written (e.g. as a path, to an IP address, stdout, stderr etc.). The default -femit-diagnostics-as value is unstructured. The default -femit-diagnostics-to value is stderr (if a user really wants to write to a file called stderr in the current working directory, they can use ./stderr).
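The defaults and the ./stderr escape hatch could be modelled along the following lines. This is only a sketch of the behaviour described above: the two flags are proposals rather than existing Clang options, the type and function names are hypothetical, and the IP address destination is not handled.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical holder for the two proposed settings, with the proposed defaults.
struct diagnostic_output_options {
    std::string format = "unstructured";  // -femit-diagnostics-as
    std::string destination = "stderr";   // -femit-diagnostics-to
};

// Scans the arguments for the two proposed flags; later occurrences win.
inline diagnostic_output_options parse_diagnostic_flags(const std::vector<std::string>& args) {
    constexpr std::string_view as_flag = "-femit-diagnostics-as=";
    constexpr std::string_view to_flag = "-femit-diagnostics-to=";
    diagnostic_output_options options;
    for (const std::string& arg : args) {
        std::string_view view = arg;
        if (view.substr(0, as_flag.size()) == as_flag) {
            options.format = std::string(view.substr(as_flag.size()));
        } else if (view.substr(0, to_flag.size()) == to_flag) {
            options.destination = std::string(view.substr(to_flag.size()));
        }
    }
    return options;
}

// The bare names "stderr" and "stdout" mean the standard streams; anything
// else, including "./stderr", is treated as a path.
inline bool is_standard_stream(const std::string& destination) {
    return destination == "stderr" || destination == "stdout";
}
```

Treating only the bare names as streams keeps the common case short while leaving a file that is literally called stderr reachable through an explicit path.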

This is a user-experience problem

As articulated in the motivation, we consider compiler diagnostics to be a form of documentation: they specifically document what doesn’t constitute a conforming program. This means that we need to prepare information for users that is audience-appropriate, clear, digestible, accurate, and widely-accessible. We intend to work with UX experts to determine what information is genuinely critical, what information is optional, and how this information can be presented by canonical LLVM tools. Given that SARIF is a standard format, users who are dissatisfied with our proposed canonical tools will be free to build alternatives to fit their needs or preferences.

We propose having at least the following information made available separately:

Diagnostic category	Error, warning, remark
Source location
Summary	Similar to current diagnostics, states what the problem is.
Reason	Explains why the diagnostic was generated in a friendly manner.
Context	Relevant source info such as considered overload resolution candidates, template backtraces, etc. These should be structured, rather than appearing as plaintext.
Potential fixes
Reference material	cppreference, C++ standard drafts, etc.
Glossary	Separates identifying type aliases and template parameters from the message.
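One way to picture these separately available pieces is as a plain data structure. The sketch below is hypothetical: none of the names are Clang types, and the actual representation would be whatever SARIF (plus any extensions) allows. The sample values are taken from the text-based mock in the alternative designs section.

```cpp
#include <string>
#include <utility>
#include <vector>

enum class category { error, warning, remark };

struct diagnostic_location {
    std::string file;
    int line = 0;
    int column = 0;
};

// Hypothetical model with one member per row of the table above.
struct structured_diagnostic {
    category kind = category::error;
    diagnostic_location location;
    std::string summary;                          // what the problem is
    std::string reason;                           // friendly explanation of why
    std::vector<std::string> context;             // overload candidates, backtraces, ...
    std::vector<std::string> potential_fixes;
    std::vector<std::string> reference_material;  // links to documentation
    std::vector<std::pair<std::string, std::string>> glossary;  // alias, underlying type
};

// Builds the '0 + x' diagnostic used as the running example.
inline structured_diagnostic example_diagnostic() {
    structured_diagnostic d;
    d.kind = category::error;
    d.location = diagnostic_location{"source.cpp", 15, 0};  // the column is not given in the mock
    d.summary = "Invalid operands to binary expression ('int' and 's')";
    d.reason = "'0 + x' is invalid because we aren't able to find 'operator+(int, s)', "
               "nor can we find a compatible 'operator+' overload by converting the parameters.";
    d.potential_fixes = {"Add an overload for 'operator+(int, s)'",
                         "Add a conversion operator from 's' to 'int'"};
    d.reference_material = {"https://cppreference.llvm.org/w/cpp/language/overload_resolution"};
    return d;
}
```

Because each piece is its own member, a consumer can show only the summary by default and reveal the reason, context, or references on request.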

By separating these out, we can build tools that can produce diagnostics with a digestible format that suits a user’s needs. For example, a web server might generate a web page that lists all the errors with their summaries by default, and then shows more detailed information by selecting a specific error. A server could also have the responsibility of de-duplicating repeated errors. The UX team that we consulted mentioned that syntax highlighting may be a benefit in error messages.
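De-duplication of repeated errors might be as simple as grouping by summary and collecting the locations. The following sketch assumes that two errors are “the same” when their summaries match exactly; a real server would likely need a more careful notion of identity. All names here are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical input: one entry per error as reported by the compiler.
struct reported_error {
    std::string summary;
    std::string file;
    int line;
};

// Hypothetical output: one entry per distinct summary, with every location.
struct coalesced_error {
    std::string summary;
    std::vector<std::string> locations;
};

// Groups errors with identical summaries, keeping first-seen order.
inline std::vector<coalesced_error> coalesce(const std::vector<reported_error>& errors) {
    std::map<std::string, std::size_t> index_of;  // summary -> position in result
    std::vector<coalesced_error> result;
    for (const reported_error& e : errors) {
        auto [it, inserted] = index_of.try_emplace(e.summary, result.size());
        if (inserted) {
            result.push_back(coalesced_error{e.summary, {}});
        }
        result[it->second].locations.push_back(e.file + ":" + std::to_string(e.line));
    }
    return result;
}
```

A web page built on top of this could show each distinct mistake once, with a collapsible list of the places where it occurs.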

The summary is similar to non-note diagnostics that we currently receive, which helps to make the diagnostic fit for purpose. To help improve clarity, summaries should be user-oriented. This can mean changing diagnostics to present information from the user’s perspective, rather than a compiler’s perspective. This may mean that we’ll need to change how we talk about certain things such as instantiation errors and overload resolution failure, which currently provide a technically accurate reason, but not necessarily an easily understandable one.

The reason is a longer description that expands on the summary’s short description. This section attempts to help the user understand why their code is failing as best as possible, and relates the description back to their code. Its language should be inclusive and help the programmer feel as if the compiler is trying to help them, rather than hinder them.

The reference material is a list of likely-relevant documents that are hosted on LLVM-owned servers. While the reason section will provide an explanation of why things went wrong, that section is not the right spot for providing the user with reference-like information. Instead, we offer the most relevant links to cppreference (or equivalent) and the final draft for whichever C++ standard the user is compiling against, in the format of ‘Draft C++ Standard: Contents’. Our survey shows significantly more people finding cppreference documentation to be more useful to them, so we should prioritise this over C++ standard wording. This section is the most likely to go stale, and we will need to have a plan to ensure that the cppreference material remains up to date, as well as citing applied defect reports as necessary.

The context region collects all the “noisy” bits, such as the list of overload candidates and template backtraces.

Finally, the glossary is a collection of type aliases that are used elsewhere in the diagnostic. When the programmer uses std::string, we should be presenting std::string as often as possible, rather than presenting diagnostics that use std::basic_string<char> or string (where string = std::basic_string<char>).

One survey respondent suggested that we “burn compiler time in error cases to generate better diagnostics. Just don’t hit the fast path of correct code”. Despite C++ compile-times being painfully slow, we think this is a suggestion worth exploring, because it may drastically improve the quality of life for developers and result in getting them working binaries faster than messages with less context. We intend to work closely with UX experts to determine what is an acceptable performance hit. One approach may be to reduce the default maximum number of errors encountered, and abort once we reach that ceiling.

Clang web server

This proposal intends to establish a web server as one of the canonical diagnostic servers. This would allow a user to be presented with a radically different UI that would make it easier to organise information in a visual and collapsible format. The server need not be hosted on a machine actually on the web: one can self-host clang-web-diagnosticsd or host it on a network-local machine (this is in fact the encouraged model).

Due to the highly interactive nature of browser interfaces, this will require lots of consultation with UX engineers to ensure that the interface caters to as many community needs as possible.

Choice of language

The way in which we communicate with the user is critical to this effort’s success. How we choose to word things determines how our tools will be perceived. Clang currently prefers to be overly brief in its wording and be impersonal. We would prefer it if Clang instead provided as much information as possible to get the programmer on the right path and felt as if the compiler were a really caring university tutor instead of a blunt machine. We also think Flow’s approach of using first-person plural wording is worth considering, since it adds a personal touch to the compiler, and should hopefully make the programmer feel as if the compiler developer were talking to the programmer, rather than the compiler program.

Clang’s developer documentation says that it uses impersonal language to aid with translation. Our design uses personal and conversational language, and we intend to provide translators with guidance on how we intend our message to be communicated to people: it is then at their discretion how to word the diagnostics so that they’re culturally appropriate for the chosen locale.

Project legacy

While we will be driving the initial effort, we would like for there to be serious community buy-in.

That is: we’ll be contributing the structural changes and other tooling, but will need the community’s assistance in the more fine-grained things such as specific diagnostics, if we’re to complete this in a reasonable timeframe.

Alternative designs

Staying within the existing text framework

The original design for this project was to remain within the existing diagnostics framework that Clang has provided since forever. That is, the original design proposed making fundamental changes to how the front-end presented diagnostics, but still presented them to a text-based console. Because this proposal considers diagnostics to be a form of documentation, the proposal aimed to structure the text output in such a way that it became more digestible. The final result looked something like the following.

========== On source.cpp:15 ==========
------- Error summary -------
Invalid operands to binary expression ('int' and 's')
  15 |    0 + x;
     |    ~ ^ ~
     |    |   |
     |    int s

------- Reason -------
'0 + x' is invalid because we aren't able to find 'operator+(int, s)', nor can we find a compatible 'operator+' overload by converting the parameters.

------- Potential fixes -------
  - Add an overload for 'operator+(int, s)' (recommendation: make it a hidden friend in the body of 's')
  - Add the conversion operator 's::operator int() const'

------- cppreference pages that may provide insight -------
  - https://cppreference.llvm.org/w/cpp/language/operator_arithmetic
  - https://cppreference.llvm.org/w/cpp/language/operators
  - https://cppreference.llvm.org/w/cpp/language/overload_resolution

------- C++20 standard draft -------
  - https://c++20.llvm.org/stable-name-1
  - https://c++20.llvm.org/stable-name-2#1
  - https://c++20.llvm.org/stable-name-3

------- Context -------
If you'd like to see which overloads were considered, you can compile again with '-fdiagnostic-show-overload-candidates'.

While it improves digestibility, the change in diagnostics has significant drawbacks due to remaining in the console. Firstly, it substantially increases the overall amount of text that one has to read, and even though it’s now sectioned, the design has been overfitted for beginners due to its heavy exposition in reasoning and hiding of context by default. Worse, since important context is hidden by default, it makes the overall process worse for intermediate and expert programmers, since they need to recompile with new flags to get the context, which will invalidate build caches (and if they permanently leave them on, then hiding the context doesn’t reduce the verbosity at all).

The reason this design has been abandoned is outlined in the motivation: many of the ideas presented in this section aren’t suitable for a text-based terminal, and would be better off in a more interactive environment.

Open questions

Finally, we need to address the technical impact to the Clang codebase. Richard Smith notes that we haven’t addressed any of the following:

How will this be integrated into the Clang codebase?
How invasive will the changes be?
What are the criteria for determining if this experiment is successful?
How do we plan to maintain this if it’s successful?
How do we plan to roll this back if it’s not?
How much harder does this make it to implement new functionality requiring new diagnostics?

We intend to begin answering these questions within the next fortnight, and provide a preliminary report as a follow-up to this document.

efriedma-quic · May 17, 2022, 11:14pm

MSVC has a much simpler solution for presenting additional information about individual diagnostics: each diagnostic is assigned a unique identifier, and there’s a page in the manual corresponding to each identifier. Not exactly high-tech, but requires almost no software infrastructure. If we’re going to put in a bunch of effort to write additional documentation about each diagnostic, we should make sure that information is accessible even for users stuck looking at a traditional build log.

We do this to some extent already, particularly for typo correction.

cjdb · May 18, 2022, 1:05am

Thanks for your response

MSVC has a much simpler solution for presenting additional information about individual diagnostics: each diagnostic is assigned a unique identifier, and there’s a page in the manual corresponding to each identifier. Not exactly high-tech, but requires almost no software infrastructure.

This effort is about more than just providing more information. It’s also about:

structuring diagnostics so that they’re machine-parsable in a standard manner
presenting diagnostics in a friendly manner
presenting diagnostics in a readable manner
providing diagnostics that are user-friendly rather than compiler dev-friendly
providing a way to integrate with tooling

The MSVC approach can’t tick any of these boxes because it isn’t trying to solve any of these problems. That isn’t to say that the approach of offloading to a hosted webpage is bad, but it isn’t good for the situation because it’s solving something else.

If we’re going to put in a bunch of effort to write additional documentation about each diagnostic, we should make sure that information is accessible even for users stuck looking at a traditional build log.

This should be achievable by setting the flag -femit-diagnostics-as=sarif. That’s going to use SARIF format, but the output will still be to stderr, which is hopefully what you mean by a traditional build log.
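
As a rough illustration of what a build-log consumer could do with that output, here is a minimal Python sketch that turns a SARIF log back into one-line summaries. The sample log is made up and trimmed to the handful of SARIF 2.1.0 fields the code reads; the rule name `err_example`, the file name, and the message are placeholders, not actual Clang output.

```python
import json

# Hypothetical SARIF 2.1.0 log, trimmed to the fields used below.
SAMPLE_SARIF = """
{
  "version": "2.1.0",
  "runs": [{
    "tool": {"driver": {"name": "clang"}},
    "results": [{
      "ruleId": "err_example",
      "level": "error",
      "message": {"text": "invalid operands to binary expression"},
      "locations": [{
        "physicalLocation": {
          "artifactLocation": {"uri": "main.cpp"},
          "region": {"startLine": 12, "startColumn": 7}
        }
      }]
    }]
  }]
}
"""


def sarif_to_lines(sarif_text):
    """Render each SARIF result as 'file:line:col: level: message [ruleId]'."""
    log = json.loads(sarif_text)
    lines = []
    for run in log.get("runs", []):
        for result in run.get("results", []):
            level = result.get("level", "warning")  # SARIF defaults to "warning"
            text = result.get("message", {}).get("text", "")
            rule = result.get("ruleId", "")
            locations = result.get("locations", [])
            if locations:
                phys = locations[0]["physicalLocation"]
                uri = phys["artifactLocation"]["uri"]
                region = phys.get("region", {})
                where = f'{uri}:{region.get("startLine", 0)}:{region.get("startColumn", 0)}'
            else:
                where = "<unknown>"
            suffix = f" [{rule}]" if rule else ""
            lines.append(f"{where}: {level}: {text}{suffix}")
    return lines


for line in sarif_to_lines(SAMPLE_SARIF):
    print(line)
```

Only the first location of each result is used here; a real consumer would probably also want related locations and any help text attached to the rule.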

Getting traditional unstructured text diagnostics to have this extra information will be difficult, because Hyrum’s Law applies to compiler diagnostics. As the alternative design from the original draft showed, a lot of tests bake in assumptions about the text in Clang diagnostics.

While typing this up, a colleague mentioned that they had a similar thought, but also noted that we’ll be in an awkward situation if we ever need to “split” a diagnostic. This plays into Hyrum’s Law too, but I’m not sure how big of an issue it would be.

pogo59 · May 18, 2022, 12:00pm

Are there any worked examples of translated diagnostics? If not, then this goal hasn’t been validated.

The design of Clang diagnostics is not particularly friendly to translators, as there are a fair number of places where English text to be inserted into the message is hard-coded in the source, rather than in the diagnostic definition files.

This tactic also allows offloading the diagnostic text to a separate dictionary, which (a) reduces the size of the compiler, (b) allows multiple dictionaries in different languages, which can be selected at runtime rather than at compiler-build time. Technically, Clang already assigns an enum to each diagnostic; however, the numeric value is not stable, which is a prerequisite for exposing the numeric values to users.
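
A minimal Python sketch of the dictionary idea, assuming stable identifiers already exist. The code `E0001`, both message templates, and the French wording are invented for illustration; nothing here reflects how Clang stores its diagnostics today.

```python
# Hypothetical message catalogs keyed by a stable diagnostic identifier.
CATALOGS = {
    "en": {"E0001": "use of undeclared identifier '{name}'"},
    "fr": {"E0001": "utilisation de l'identifiant non déclaré '{name}'"},
}


def render(code, lang, **args):
    """Look up a diagnostic by stable code, falling back to English."""
    template = CATALOGS.get(lang, {}).get(code)
    if template is None:
        template = CATALOGS["en"][code]
    return template.format(**args)


print(render("E0001", "fr", name="x"))
print(render("E0001", "de", name="x"))  # no German catalog, so English is used
```

Note that the only argument substituted into the template is a program identifier. Any English fragment passed in as an argument would stay untranslated, which is exactly the hard-coded text problem described above.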

AaronBallman · May 18, 2022, 12:12pm

We’re aware, but that doesn’t mean we don’t still have the goal of making translations possible. We document this in a few places:

https://clang.llvm.org/docs/InternalsManual.html#adding-translations-to-clang
“Clang” CFE Internals Manual — Clang 18.0.0git documentation (see the “select” format paragraph)

tschuett · May 18, 2022, 12:26pm

The Swift compiler gives in some cases really detailed information about concepts:
https://github.com/apple/swift/tree/main/userdocs/diagnostics

JVApen · May 19, 2022, 8:03am

If I remember well, there exists a translated branch of GCC (I found GCC in French Translation, not sure if that’s the right one)

I can tell that coming from a country where English ain’t an official language, a way to have translations could be useful in teaching children to program. Not sure if C++ is the right language for that, though it’s a general issue for compilers and interpreters that everything is in English.

banach-space · May 19, 2022, 6:09pm

Thank you for working on this!

Your proposal focuses on the user-facing aspects of the diagnostics. Like Richard, I’m curious about the implementation details that will affect Clang developers:

Looking forward to the follow-up

Also,

I think that there’s going to be a lot of work required to update and to document the diagnostics with relevant links or better wording. Doing that shouldn’t require much compiler (or Clang) knowledge. Perhaps it’s worth reaching out to other communities (e.g. C++ or Objective-C) that could be interested in contributing?

Just my 2p,

-Andrzej

efriedma-quic · May 19, 2022, 6:46pm

Not trying to say you’re not solving a real issue here. I just want to avoid a situation where we write a bunch of additional information about reasons/standards references/etc., and none of that is easily accessible unless you’re using an IDE with SARIF support.

This doesn’t really work if you’re trying to, for example, figure out why your buildbot failed. Unless you’re suggesting that users should pass -femit-diagnostics-as=sarif to all their builds, which might have other issues…

The stable part of traditional diagnostics doesn’t really involve the actual text; it’s mostly just the prefixes. For example, an error starts with “error:”, and is followed by some number of “note:”. We should be able to stuff additional information into notes if we want to, without any significant breakage. I’ve never seen any issue where changing the text of a diagnostic caused regressions. (This is ignoring issues with clang’s regression tests; we can mess with those if we need to.)
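
To make the "prefixes are the stable part" point concrete, here is a small Python sketch that groups notes under the diagnostic they follow, keying only on the `file:line:col: kind:` shape and never on the message text. The sample log is hypothetical, and the regex is a simplification (it would not handle Windows paths containing a drive-letter colon, for instance).

```python
import re

# Matches the conventional 'file:line:col: kind: message' shape.
DIAG_RE = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<kind>error|warning|note): (?P<msg>.*)$"
)

# Hypothetical build-log excerpt.
SAMPLE_LOG = """\
main.cpp:3:9: error: use of undeclared identifier 'cuont'
  return cuont;
         ^~~~~
main.cpp:1:5: note: 'count' declared here
"""


def group_diagnostics(log_text):
    """Attach each note to the most recent error or warning."""
    groups = []
    for raw in log_text.splitlines():
        match = DIAG_RE.match(raw)
        if match is None:
            continue  # source snippets, carets, and unrelated build output
        diag = match.groupdict()
        if diag["kind"] == "note":
            if groups:  # a note with nothing before it is dropped
                groups[-1]["notes"].append(diag["msg"])
        else:
            groups.append({
                "kind": diag["kind"],
                "where": f'{diag["file"]}:{diag["line"]}:{diag["col"]}',
                "msg": diag["msg"],
                "notes": [],
            })
    return groups


for group in group_diagnostics(SAMPLE_LOG):
    print(group["where"], group["kind"], group["msg"], group["notes"])
```

A tool written this way keeps working if extra notes are added, since it never inspects the wording of a message.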

tschuett · May 19, 2022, 7:20pm

Wow. Thank you!
My compilers are doing a lot of ASCII art and then you showed me HTML. Wow.
I don’t like that clang.llvm.org does not host something between reference or tutorial material.
Will there be a way to provide feedback? If you don’t like the diagnostics, file an issue?
It is probably not possible to show different diagnostics to different audiences. For ISO C++ committee members I want different diagnostics than for beginners.

cjdb · May 19, 2022, 8:18pm

Agreed. This is quite a ways down the pipeline, so I’m not too motivated to do this right now, but it sounds like an avenue worth exploring in the long-term.

I was, but I think you’ve raised a good point about buildbots and SARIF potentially being problematic too. When I raised this with Richard yesterday, he expressed concern that Clang error codes might be distinct from Clang forks’ error codes (e.g. AppleClang). I see a few ways around this:

Have Clang prefix error codes with clang- and then get forks to have fork_name- as their prefix. I’m not sure if forks would prefer to curate their own database or just use the canonical LLVM one, which poses problems.
Replace error codes with *.td identifiers (e.g. https://diagnostics.llvm.org/err_operator_overload_post_incdec_must_be_int). It’ll be more text, but it might also be more likely to remain consistent between forks.
Put the diagnostic explanations in the compiler, similar to how rustc does. Then folks can do clang++ -fexplain-${WHICHEVER_IDENTIFIER_WE_SETTLE_ON}.
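
For comparison, the three spellings could look roughly like this from a user's side. This Python sketch is purely illustrative: the host comes from the example URL above, while the four-digit padding and all helper names are assumptions.

```python
from urllib.parse import quote

# Host taken from the example URL above; it is not a live service.
DOCS_BASE = "https://diagnostics.llvm.org/"


def qualified_code(vendor, number):
    """Option 1: a vendor-prefixed numeric code, e.g. 'clang-1234'."""
    return f"{vendor}-{number:04d}"


def docs_url(td_identifier):
    """Option 2: a documentation URL built from a *.td identifier."""
    return DOCS_BASE + quote(td_identifier, safe="")


def explain_flag(identifier):
    """Option 3: the in-compiler explanation flag sketched above."""
    return f"-fexplain-{identifier}"


name = "err_operator_overload_post_incdec_must_be_int"
print(qualified_code("clang", 1234))
print(docs_url(name))
print(explain_flag(name))
```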

Agreed. Providing feedback was something my survey and surrounding discussions surfaced a fair bit, so it’s apparent that users want it. I think for our diagnostics to be truly user-centric, we need to eventually facilitate this.

From a technical standpoint, I’m confident that it’s possible, but it wouldn’t be the responsibility of Clang. That’s the responsibility of whichever tool consumes the SARIF.

The real challenge lies in working out how to express diagnostics that fit into discrete categories such as “beginner”, “intermediate”, “advanced”, “expert”, “implementer”.

efriedma-quic · May 19, 2022, 8:33pm

In addition to worrying about clang vs. forks, we need to worry about clang-14 vs. clang-15.

I think if we use “codes”, we’d have to actually embed the numbers into the *.td files. Autogenerating them would make it hard to keep them stable. (We don’t want clang-14’s ERR1234 to mean something different from clang-15’s ERR1234.)
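
One way to police that would be a check that compares the code tables of two releases and flags any code whose meaning changed. A minimal Python sketch follows; the tables are hypothetical, and only the `err_operator_overload_post_incdec_must_be_int` name comes from this thread.

```python
def find_unstable_codes(old, new):
    """Return codes that map to different diagnostics in two releases.

    `old` and `new` map a numeric code to the diagnostic's *.td name.
    Codes present in only one release are fine; reusing a code is not.
    """
    return sorted(code for code in old.keys() & new.keys() if old[code] != new[code])


# Hypothetical code tables for two releases.
clang_14 = {
    1234: "err_operator_overload_post_incdec_must_be_int",
    1235: "err_hypothetical_a",
}
clang_15 = {
    1234: "err_operator_overload_post_incdec_must_be_int",
    1235: "err_hypothetical_b",  # code reused for a different diagnostic
    1236: "err_hypothetical_c",  # new code, which is fine
}

print(find_unstable_codes(clang_14, clang_15))
```

A check like this only works if retired codes are never handed out again, which is another argument for writing the numbers into the *.td files by hand.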

I don’t have a strong opinion on codes vs. names, but codes are probably easier to type.

tschuett · May 19, 2022, 9:52pm

Did you ever consider keeping something like a database of the diagnostics? Then you could tell the user: you constantly have problems with lambdas; here you will find some help.

jwakely · May 20, 2022, 12:20pm

Surely this is already the case, isn’t it? It’s certainly standard operating procedure for G++. If we’re exiting with an error, we don’t care about the additional time to do extra name lookups, calculate Levenshtein distances, find suggestions for missing headers etc.

Despite C++ compile-times being painfully slow, we think this is a suggestion worth exploring, because it may drastically improve the quality of life for developers and result in getting them working binaries faster than messages with less context. We intend to work closely with UX experts to determine what is an acceptable performance hit.

Unless you’re considering using cloud-based machine learning to analyze the errors, I doubt anything you do will be so slow it’s unacceptable. Users want builds to finish as fast as possible, but if the build fails with an error, you shift from GHz speeds to human speeds. Taking an extra second to produce excellent diagnostics is not noticeable if the developer has to notice the build failed, switch focus to the terminal or window showing the build output, scroll to the first error, start reading it etc. And that’s before even considering the time to interpret and understand the error. If a better error helps the user understand the error in 10 seconds instead of 20, they’re not going to care that it took 2s to show the better error. And 2s is a loooong time for the compiler to do work.
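
For a sense of how cheap this kind of work is, here is the textbook dynamic-programming Levenshtein distance with a tiny suggestion helper on top, in Python. It is not how GCC or Clang implement typo correction; the candidate list and the `max_distance` cut-off are assumptions for the example.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def suggest(typo, candidates, max_distance=2):
    """Return the closest candidate within max_distance, or None."""
    best = min(candidates, key=lambda c: levenshtein(typo, c), default=None)
    if best is not None and levenshtein(typo, best) <= max_distance:
        return best
    return None


print(suggest("cuont", ["count", "counter", "amount"]))
```

The cost is proportional to the product of the two string lengths per candidate, so even thousands of visible names are a small amount of work next to the seconds discussed above.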

jwakely · May 20, 2022, 12:25pm

There’s no branch, the upstream GCC does diagnostic translation for multiple languages by default. Every diagnostic string is extracted from the source and can be replaced by a translated version at run-time. The current French translations are in gcc/po/fr.po

See GCC and the Translation Project for more info.

cjdb · May 20, 2022, 4:29pm

As in preserving prior diagnostics a user got? That’s an interesting idea.

My interpretation of this comment is that that particular respondent feels Clang still has a long way to go on this (we certainly don’t have parity with GCC).

tschuett · May 20, 2022, 4:48pm

Exactly. There is probably more information that you can learn about the user over several clang invocations.

cjdb · May 20, 2022, 5:45pm

It’s definitely not a Clang responsibility, but incorporating it into one of the canonical post-processing tools would be pretty rad.

tschuett · May 20, 2022, 6:35pm

I am mostly interested in the person sitting in front of my screen. You have probably access to a fleet of users.

Xazax-hun · May 20, 2022, 9:14pm

Clang Static Analyzer and Clang have two disjoint diagnostic engines. I believe @NoQ looked into some unifications in the past, not sure what the current state is. I think this effort might be another reason to continue pursuing this goal. There would be many advantages of the unification:

Clang Tidy could emit identical warning messages to the Clang Static Analyzer when it is invoking its checks
Clang itself could emit path traces for warnings, and path trace rendering could be shared between the components
All components would benefit from the features implemented in the common parts

Some components of MSVC, like /analyze, can already produce SARIF output. The MSVC team is already aware of this proposal, moreover, there have been asks from internal users at Microsoft to add SARIF support to all compiler diagnostics.

There are some web-based tools for static analysis results, such as CodeChecker. One important feature of CodeChecker is the ability to diff/baseline results. Baselining can be a useful feature to roll out warnings for large code bases when there are few resources to fix all instances of the warnings in bulk. Baselines can also be used to prevent introducing new instances of those warnings.
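
The core of baselining can be sketched in a few lines of Python. This is a simplification for illustration, not CodeChecker's actual matching logic; the fingerprint fields and the sample warnings are assumptions.

```python
def fingerprint(diag):
    """Identity of a warning for baselining purposes.

    Line numbers are left out on purpose, so that unrelated edits which
    shift code up or down do not make an old warning look new.
    """
    return (diag["rule"], diag["file"], diag["message"])


def new_since_baseline(baseline, current):
    """Return the warnings in `current` that the baseline does not contain."""
    known = {fingerprint(d) for d in baseline}
    return [d for d in current if fingerprint(d) not in known]


# Hypothetical warnings from two runs.
baseline = [
    {"rule": "unused-variable", "file": "a.cpp", "line": 10, "message": "unused variable 'x'"},
]
current = [
    {"rule": "unused-variable", "file": "a.cpp", "line": 14, "message": "unused variable 'x'"},
    {"rule": "unused-variable", "file": "b.cpp", "line": 3, "message": "unused variable 'y'"},
]

for diag in new_since_baseline(baseline, current):
    print(diag["file"], diag["message"])
```

A build could then fail only when this list is non-empty, which lets a warning be enabled before the existing instances are cleaned up.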

I suppose we do not want to add networking code to clang. Sending warnings over the network makes sense but I’d prefer such code to live outside of the compiler.

Is the idea to generate output dynamically, on-demand from the SARIF files as opposed to rendering static HTMLs?

I am not sure if this was ever a problem with compiler warnings, but checkers in the Clang Static Analyzer and Clang Tidy can occasionally get renamed (sometimes as part of moving them to a different package/group). Having stable IDs for the diagnostics could help avoid breaking users when such changes happen.

Alternatively, when the error codes are numeric, certain ranges can be assigned to certain forks. Although that requires a bit more coordination.
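
A small Python sketch of what that coordination might amount to: a shared table of ranges, a lookup, and a check that no two owners overlap. The owners and range sizes are made up.

```python
# Hypothetical allocation of numeric code ranges to upstream and forks.
RANGES = {
    "clang": range(0, 10000),
    "fork_a": range(10000, 20000),
    "fork_b": range(20000, 30000),
}


def owner_of(code, ranges=RANGES):
    """Return which project a numeric code belongs to, or None."""
    for owner, codes in ranges.items():
        if code in codes:
            return owner
    return None


def find_overlaps(ranges):
    """Return neighbouring pairs of owners whose ranges overlap.

    After sorting by start, any overlap shows up between neighbours,
    so checking adjacent pairs is enough to detect a conflict.
    """
    ordered = sorted(ranges.items(), key=lambda item: item[1].start)
    return [
        (a, b)
        for (a, ra), (b, rb) in zip(ordered, ordered[1:])
        if rb.start < ra.stop
    ]


print(owner_of(12345))
print(find_overlaps(RANGES))
```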
