Landing support for LSP protocol extensions?

Hi there!

I've written a patch [1] to add support to clangd for the "textDocument/superTypes" request, which, given the location of a symbol naming a class type, returns information about its base classes.

This is not part of the LSP specification yet, though it has been proposed for addition [2], and has been implemented in at least one client (Theia [3], not merged yet).

(There is also a "textDocument/subTypes" request in the proposal, which returns information about derived classes. That's more challenging to implement; stay tuned for a separate post about that.)

Is clangd interested in support for protocol extensions like this landing, in any form, prior to their standardization in LSP?

If so, are there any measures that should be taken to reflect that the protocol is not yet standard? For example:

* cquery (another C++ language server [4]) puts non-standard protocols into a "$cquery" namespace, e.g. "$cquery/typeHierarchy" rather than "textDocument/typeHierarchy". clangd could potentially do the same.

* The protocol could be available under the standard name, but behind a command-line flag and not enabled by default.

Any thoughts on this subject are appreciated!

Thanks,
Nate

[1] https://reviews.llvm.org/D56370
[2] https://github.com/Microsoft/vscode-languageserver-node/pull/426
[3] https://github.com/theia-ide/theia/pull/3802
[4] https://github.com/cquery-project/cquery

Hi Nathan,

I don’t think we have a clear idea/policy for extensions, but we should.
I think the most important factors (for all features, standard or not):

  • editor support: how many users will the feature be available to?
  • standardization: is this feature fairly stable, and likely to be adopted by more editors over time?
  • utility: does the feature add a lot of value for users?
  • complexity: is this feature hard to implement in clangd, or constrain future work? does it complicate the protocol a lot?

Regarding supertypes: editor support and standardization are both at a very early stage. On the other hand, it’s a useful feature that’s easy to implement[1]. I wouldn’t be opposed as-is.
One alternative to consider is implementing it as an extension to FindReferences (the supertype → subtype relationship is a type of reference.) This might be easier to standardize, as it seems giving a ref-type mask to FindReferences would enable a bunch of these types of features (and editors could reuse UI with little effort). It’s unclear to me how important showing the whole hierachy is.

Subtypes on the other hand are complex to implement and have costs (increased index size).

Regarding hiding extensions:

  • one concern is stability: we may want to expose features while reserving the right to remove or totally change them later. This isn’t intrinsically tied to standardization - we can commit to supporting extensions. Flags/capabilities etc might make sense here.
  • another is namespace conflicts between our extension and future standard/other extensions. The problem with a clangd namespace is then editors need to know about clangd, not just the extension in general. I’d suggest trying to adopt/encourage shared names for extensions among the LSP community rather than namespacing, myself.

[1] Assuming an AST-only implementation that will fail to find supertypes when the current file sees only a class’s forward declaration.

Hi Sam,

Thanks for your feedback!

Regarding supertypes: editor support and standardization are both at
a very early stage. On the other hand, it's a useful feature that's easy to
implement[1]

[1] Assuming an AST-only implementation that will fail to find
supertypes when the current file sees only a class's forward declaration.

Indeed, that is how my current implementation works.

From a user point of view, it would be nice if it worked on types that are only forward-declared; once we figure out subtypes, we can probably extend supertypes to apply the same approach.

One alternative to consider is implementing it as an extension to
FindReferences (the supertype -> subtype relationship is a type of
reference.) This might be easier to standardize, as it seems giving a
ref-type mask to FindReferences would enable a bunch of these types
of features (and editors could reuse UI with little effort). It's unclear to
me how important showing the whole hierachy is.

I can see modelling subtypes as a type of reference. It's less clear for supertypes: if I have a class with five bases, I want to report five results in my query, but my class's name is only mentioned once.

Even for subtypes, this implementation approach does not seem sufficient for the current proposed API. Annotating references as bases gives me the locations of base-clauses, but the API wants the name of the derived type. I don't think we can reliably get from one to the other without building an AST for the file in question.

As far as the API itself is concerned, there is some logic in formulating the API in terms of references. I believe cquery supports extensions that return subsets of references which are calls, bases, etc. However, it does this in addition to supporting a type hierarchy query, and a type hierarchy view does seem to be popular in editors. (Just among the editors I know, Eclipse, VSCode, and IntelliJ all have one, with Theia soon to follow.)

Regarding hiding extensions:
[...]
I'd suggest trying to adopt/encourage shared names for extensions
among the LSP community rather than namespacing, myself.

That sounds reasonable.

It sounds like, for the time being, I should go ahead and submit my current AST-only superTypes implementation for review, using the proposed standard name, and see what happens. I'll clean up the patch and do so.

Thanks,
Nate

Hi Sam,

Thanks for your feedback!

Regarding supertypes: editor support and standardization are both at
a very early stage. On the other hand, it’s a useful feature that’s easy to
implement[1]

[1] Assuming an AST-only implementation that will fail to find
supertypes when the current file sees only a class’s forward declaration.

Indeed, that is how my current implementation works.

From a user point of view, it would be nice if it worked on types that are only forward-declared; once we figure out subtypes, we can probably extend supertypes to apply the same approach.

One alternative to consider is implementing it as an extension to
FindReferences (the supertype → subtype relationship is a type of
reference.) This might be easier to standardize, as it seems giving a
ref-type mask to FindReferences would enable a bunch of these types
of features (and editors could reuse UI with little effort). It’s unclear to
me how important showing the whole hierachy is.

I can see modelling subtypes as a type of reference. It’s less clear for supertypes: if I have a class with five bases, I want to report five results in my query, but my class’s name is only mentioned once.

Good point - the references wouldn’t mention the original class name, which is confusing.
Thinking about the physical layout of source code, to find (direct) bases today one can just go-to-def and then see the list.
So a reference-style list of bases doesn’t add much value (though a hierarchical one still might).

And thinking more about the interaction - it seems that a type hierarchy that shows both base and derived classes might be most useful. Do you know why the super/sub types are proposed as separate methods?

Even for subtypes, this implementation approach does not seem sufficient for the current proposed API. Annotating references as bases gives me the locations of base-clauses, but the API wants the name of the derived type. I don’t think we can reliably get from one to the other without building an AST for the file in question.

Sorry, I did mean here exposing (an extension of) the reference API, rather than the suggested supertypes API. Not sure it’s a good idea, but it’d be a different model.

As far as the API itself is concerned, there is some logic in formulating the API in terms of references. I believe cquery supports extensions that return subsets of references which are calls, bases, etc. However, it does this in addition to supporting a type hierarchy query, and a type hierarchy view does seem to be popular in editors. (Just among the editors I know, Eclipse, VSCode, and IntelliJ all have one, with Theia soon to follow.)

VSCode in particular seems to get all its structure-awareness from language servers. Do you know where it gets data for type hierarchy? Practically speaking, what VSCode wants seems to drive about 80% of what gets standardized.

Regarding hiding extensions:
[…]
I’d suggest trying to adopt/encourage shared names for extensions
among the LSP community rather than namespacing, myself.

That sounds reasonable.

It sounds like, for the time being, I should go ahead and submit my current AST-only superTypes implementation for review, using the proposed standard name, and see what happens. I’ll clean up the patch and do so.

Sounds good, thanks!

And thinking more about the interaction - it seems that a type hierarchy
that shows both base and derived classes might be most useful. Do
you know why the super/sub types are proposed as separate methods?

All the UI's I've seen indeed show a single hierarchy.

I think the super/sub separation is mostly just a convenience at the protocol level. It makes it easier to query the tree lazily (one level at a time), and also to reuse DocumentSymbol's "children" field to represent the tree.

I don't think there's any conflict between the protocol having two parts, and the UI showing a single tree.

VSCode in particular seems to get all its structure-awareness from
language servers. Do you know where it gets data for type hierarchy?
Practically speaking, what VSCode wants seems to drive about 80% of
what gets standardized.

I had another look, and it looks like the VSCode type hierarchy UI that I was thinking of comes from the vscode-cquery extension [1], which uses a cquery protocol extension to get the data from the server. (I believe the cquery feature predates the standard protocol proposal.)

VSCode itself does not appear to have a generic type hierarchy UI yet; I expect it will get one as standardization of the protocol progresses.

Thanks,
Nate

[1] https://github.com/cquery-project/vscode-cquery/blob/master/src/inheritanceHierarchy.ts

It sounds like, for the time being, I should go ahead and submit my current
AST-only superTypes implementation for review, using the proposed
standard name, and see what happens. I'll clean up the patch and do so.

Sounds good, thanks!

The patch is now posted here: https://reviews.llvm.org/D56370

Thanks,
Nate