Syntax highlighting of keywords

Hi clangd-dev,

I was wondering whether there’s any value in clangd applying the “primitive” type highlighting to keywords like ‘void’, ‘int’, etc.

  • Any editor handles those with its default syntax highlighting.
  • It’s trivial and does not require actual semantic knowledge.

With that in mind, I wonder whether we should drop this completely and just let the editors handle the keywords?
What do people think?

Yeah. I can see this being useful if lexing got more complicated (e.g. if the committee added string interpolation).

But as it stands, I think we should be doing the simplest thing if performance doesn’t matter, and the minimal thing if it does.

Alternatively, we could highlight everything (including braces and such) to support editors without local highlighting. But I doubt those will ever exist; the flickering would be too bad.

I agree editors handle keyword highlightings very well, but they don’t have sufficient knowledge to highlight type aliases/typedefs for primitive types.

// We want to highlight A, B as primitive types.

using A = int;
typedef int B;
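To spell out why an editor's lexer cannot do this on its own, here is a minimal sketch. The names `C`, `D` and `Wrap` are made up for illustration and are not from clangd. Whether an alias names a primitive type is only known after name lookup, and sometimes only after template instantiation.

```cpp
#include <type_traits>

using A = int;
typedef int B;

// An alias of an alias: a lexer only sees the identifier `A` here.
using C = A;

// Hypothetical helper: the aliased type is known only after instantiation.
template <typename T>
struct Wrap {
    using type = T;
};
using D = Wrap<double>::type;

// The compiler (and so an AST-based highlighter) can resolve all of them.
static_assert(std::is_same<A, int>::value, "A resolves to int");
static_assert(std::is_same<B, int>::value, "B resolves to int");
static_assert(std::is_same<C, int>::value, "C resolves to int");
static_assert(std::is_same<D, double>::value, "D resolves to double");
```

A purely lexical highlighter sees four ordinary identifiers here, so any 'primitive' colouring for them has to come from the semantic side.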

Highlighting typedefs to primitive types in a special manner seems ok, I was referring specifically to highlightings of keywords.

So unless anyone objects, I’ll send a patch to remove highlightings of keywords. Typedefs to simple types will still be highlighted with ‘primitive’ highlighting type.

It would be useful if clangd highlighted context-sensitive keywords, such as "override" and "final" (and, in C++20, likely "import" and "module"). Because of their context-sensitivity, clients can't highlight those as easily as hard keywords.
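A small sketch of what context-sensitivity means in practice. The names `Shape`, `Square` and `sum_identifiers` are invented for this example. `override` and `final` are identifiers with special meaning rather than reserved words, so the same spelling can be a specifier in one place and an ordinary variable in another.

```cpp
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};

// `final` used as a class specifier, `override` as a virt-specifier.
struct Square final : Shape {
    int sides() const override { return 4; }
};

// The same spellings used as plain identifiers; this is valid C++11.
int sum_identifiers() {
    int override = 1;
    int final = 2;
    return override + final;
}
```

A client that always colours these two words as keywords would get the specifiers right and the two local variables wrong.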

They’re rare as identifiers, though, so I don’t see why editors wouldn’t just always highlight them as keywords.

‘import’ and ‘module’ could probably be detected easily too: they’ll probably be at the start of the line (I haven’t seen the actual proposed syntax, though; happy to be corrected).

Does this affect performance in some meaningful way? If not, I don’t understand the motivation to drop primitive keyword highlighting.

I guess it depends how much one values accuracy :) In Eclipse, for example, we decided we did not want to keyword-highlight "override" and "final" if they occur as identifiers, even if that's rare, so we added them to the semantic coloring engine rather than the lexical one. In the LSP model, that would be a server-side highlighting.

Frankly I doubt `final` is tremendously rare as an identifier; it seems similar to `result` as a catch-all variable name.

Keyword highlightings (provided by the editors) are definitely much faster than AST-based ones (which can take minutes in bad cases).
Editor highlightings are also much more reliable, i.e. they never “blink”.

IMO one should definitely prefer the editor-based implementation when possible. The semantic highlightings are always much more “sluggish”.

Given that, I think we should prefer to not interfere with the editor highlightings if we can.

I think we should prefer the failure mode of “editor highlights override as a keyword and clangd will re-highlight as a variable eventually” to “editor does not highlight override until clangd produces highlighting”.
But I can see how other people can have different opinions here.

Not highlighting ‘int’ and ‘void’ in clangd is probably a no-brainer, though.

Just checking: are we talking about primitive highlighting (i.e. keywords only)? I had several problems with C++ keywords from new language standards because they are not in the default syntax files of (Neo)Vim and third-party plugins often miss some of the words. I think it makes sense to have Clangd as the “universal provider” of the updated keyword list because then I don’t have to patch Vim plugins and/or wait for the default syntax files to get updated.

We’re talking exclusively about semantic highlightings (i.e. the highlightings that require full semantic analysis).
Clangd could potentially provide keyword highlightings as well, with different latency trade-offs.

I still feel like getting it right in the editors shouldn’t be too big of a problem, although some things like raw string literals obviously pose a challenge.
Doing it in the editor provides a much more robust experience, e.g. imagine losing all syntax highlighting because clangd crashed.
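As an illustration of the raw string literal case, here is a minimal sketch. The function name `raw_sample` and the delimiter `cpp` are arbitrary choices for this example. Everything between `R"cpp(` and `)cpp"` is literal text, so a highlighter that does not track the delimiter can mistake the `//` for a comment or the inner quotes for string boundaries.

```cpp
#include <string>

// Returns a string whose contents merely look like C++ code.
// The literal ends only at the matching `)cpp"` sequence.
inline std::string raw_sample() {
    return R"cpp(int x = 0; // not a comment, "not" a nested string)cpp";
}
```

Handling this in an editor grammar means matching the closing delimiter against whatever was written after `R"`, which simple regex-based syntax files tend to approximate.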

Why is it a problem if clangd provides extra information, like keyword highlighting? Couldn’t IDEs/editors (any LSP clients) ignore the information if they have more efficient ways to highlight the keywords? For consistency and completeness, it seems worth providing the information.

Having this in clangd and having editors ignore it complicates both clangd and the editors.
If keyword highlightings are something that the editors are doing anyway, I don’t see how also providing them in clangd makes anything less complicated.

Also, please note clangd never provided keyword highlightings. We were highlighting some keywords that referred to builtin types, e.g. int, void, etc. And we never tried to highlight other keywords like ‘final’ and ‘override’.