Clangd explicit and implicit doc open

As per previous note, I implemented what I call “implicit document opening” for clangd, where the client starts requesting language features without an explicit open request. There is nothing in the LSP protocol to require an open request in order to handle language features. Currently, clangd doesn’t allow language features if the file hasn’t been opened explicitly. It also keeps track of any opened/closed docs, releasing closed docs to make room for an open. The following is a patch of the changes for it, so please look it over, comment. If positive, I can then make a patch request the “normal” way. Several things to note:

  • The master branch has a bunch of changes to clangd that unfortunately causes VS to completely shutdown the server (client expects a bool for codeActionProvider, gets back a CodeActionOptions struct, doesn’t like that). So, in order to test, the changes here are against branch 9.x, but should applied against master in the patch request.

  • The code globs together DraftStore.addDraft() and ClangServer.addDocument() and corresponding removing methods in ClangdLSPServer. There’s no explicit class wrapping the ref count of opens/closes, rather just in ClangLSPServer. See the floowing issue on DraftStore, and why I think it could be removed.

  • On an “implicit open”, the file is read from disk. On “explicit open”, the file is passed to the client from server.

  • I’m not sure why there are duplicate copies of the file in memory (one in DraftStore, which is used for computing whether changes sync, and ClangServer). As far as I can tell, only one copy is actually needed in ClangServer. Instead, the changes from client are applied to the current doc, then that is passed to ClangServer.addDocument(). I don’t understand why the changes itself aren’t passed to ClangServer for efficient incremental parsing and recomputing indexes.

  • I added a test for the change: test/didopen-not-required.test. I noticed several other tests in clangd/t3est that fail. Some tests that hardwire the path. There are tests that do not clean up temporary files used in the test.

  • The ClangdLSPServer.IsOpen map keeps track of what was opened (implicit or explicit). The code keeps ClangdLSPServer.MaxCacheSize open or closed documents in memory, unless the client makes further open request. It releases a closed doc if there is an open request and already MaxCacheSize, first-in-list swap out. If it there are MaxCacheSize docs opened, then it doubles the size of the capacity.

  • When updates are received from the client, it goes about checking whether the document makes sense, presumably when the client sends a bogus computed diff list to apply. Does this occur often? Yikes. When a bogus set of changes is detected from the client, the server goes into a “resync” mode, waiting until a close and explicit open is performed.


diff --git a/clang-tools-extra/clangd/ClangdLSPServer.cpp b/clang-tools-extra/clangd/ClangdLSPServer.cpp
index 5f8b307f721..5be9f5d932c 100644
--- a/clang-tools-extra/clangd/ClangdLSPServer.cpp
+++ b/clang-tools-extra/clangd/ClangdLSPServer.cpp
@@ -25,6 +25,7 @@
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/ScopedPrinter.h"
+#include <fstream>

 namespace clang {
 namespace clangd {
@@ -452,29 +453,26 @@ void ClangdLSPServer::onSync(const NoParams &Params,
 void ClangdLSPServer::onDocumentDidOpen(
     const DidOpenTextDocumentParams &Params) {
   PathRef File = Params.textDocument.uri.file();
-
   const std::string &Contents = Params.textDocument.text;
-
-  DraftMgr.addDraft(File, Contents);
-  Server->addDocument(File, Contents, WantDiagnostics::Yes);
+  trackOpenedDocument(File, Contents);
 }

 void ClangdLSPServer::onDocumentDidChange(
     const DidChangeTextDocumentParams &Params) {
+  PathRef File = Params.textDocument.uri.file();
+  trackUnopenedDocument(File);
   auto WantDiags = WantDiagnostics::Auto;
   if (Params.wantDiagnostics.hasValue())
     WantDiags = Params.wantDiagnostics.getValue() ? WantDiagnostics::Yes
                           : WantDiagnostics::No;
-
-  PathRef File = Params.textDocument.uri.file();
   llvm::Expected<std::string> Contents =
       DraftMgr.updateDraft(File, Params.contentChanges);
   if (!Contents) {
     // If this fails, we are most likely going to be not in sync anymore with
     // the client.  It is better to remove the draft and let further operations
     // fail rather than giving wrong results.
-    DraftMgr.removeDraft(File);
-    Server->removeDocument(File);
+    RequiresExplicitClientOpen.insert(File); // Make sure this file isn't retracked until we resync properly.
+    untrackDocument(File);
     elog("Failed to update {0}: {1}", File, Contents.takeError());
     return;
   }
@@ -594,9 +592,7 @@ void ClangdLSPServer::onRename(const RenameParams &Params,
 void ClangdLSPServer::onDocumentDidClose(
     const DidCloseTextDocumentParams &Params) {
   PathRef File = Params.textDocument.uri.file();
-  DraftMgr.removeDraft(File);
-  Server->removeDocument(File);
-
+  untrackDocument(File);
   {
     std::lock_guard<std::mutex> Lock(FixItsMutex);
     FixItsMap.erase(File);
@@ -613,6 +609,7 @@ void ClangdLSPServer::onDocumentOnTypeFormatting(
     const DocumentOnTypeFormattingParams &Params,
     Callback<std::vector<TextEdit>> Reply) {
   auto File = Params.textDocument.uri.file();
+  trackUnopenedDocument(File);
   auto Code = DraftMgr.getDraft(File);
   if (!Code)
     return Reply(llvm::make_error<LSPError>(
@@ -626,6 +623,7 @@ void ClangdLSPServer::onDocumentRangeFormatting(
     const DocumentRangeFormattingParams &Params,
     Callback<std::vector<TextEdit>> Reply) {
   auto File = Params.textDocument.uri.file();
+  trackUnopenedDocument(File);
   auto Code = DraftMgr.getDraft(File);
   if (!Code)
     return Reply(llvm::make_error<LSPError>(
@@ -643,6 +641,7 @@ void ClangdLSPServer::onDocumentFormatting(
     const DocumentFormattingParams &Params,
     Callback<std::vector<TextEdit>> Reply) {
   auto File = Params.textDocument.uri.file();
+  trackUnopenedDocument(File);
   auto Code = DraftMgr.getDraft(File);
   if (!Code)
     return Reply(llvm::make_error<LSPError>(
@@ -685,6 +684,7 @@ flattenSymbolHierarchy(llvm::ArrayRef<DocumentSymbol> Symbols,

 void ClangdLSPServer::onDocumentSymbol(const DocumentSymbolParams &Params,
                    Callback<llvm::json::Value> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   URIForFile FileURI = Params.textDocument.uri;
   Server->documentSymbols(
       Params.textDocument.uri.file(),
@@ -723,6 +723,7 @@ static llvm::Optional<Command> asCommand(const CodeAction &Action) {
 void ClangdLSPServer::onCodeAction(const CodeActionParams &Params,
                    Callback<llvm::json::Value> Reply) {
   URIForFile File = Params.textDocument.uri;
+  trackUnopenedDocument(File.file());
   auto Code = DraftMgr.getDraft(File.file());
   if (!Code)
     return Reply(llvm::make_error<LSPError>(
@@ -767,6 +768,7 @@ void ClangdLSPServer::onCodeAction(const CodeActionParams &Params,

 void ClangdLSPServer::onCompletion(const CompletionParams &Params,
                    Callback<CompletionList> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   if (!shouldRunCompletion(Params)) {
     // Clients sometimes auto-trigger completions in undesired places (e.g.
     // 'a >^ '), we return empty results in those cases.
@@ -794,6 +796,7 @@ void ClangdLSPServer::onCompletion(const CompletionParams &Params,

 void ClangdLSPServer::onSignatureHelp(const TextDocumentPositionParams &Params,
                       Callback<SignatureHelp> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   Server->signatureHelp(Params.textDocument.uri.file(), Params.position,
             Bind(
                 [this](decltype(Reply) Reply,
@@ -835,6 +838,7 @@ static Location *getToggle(const TextDocumentPositionParams &Point,

 void ClangdLSPServer::onGoToDefinition(const TextDocumentPositionParams &Params,
                    Callback<std::vector<Location>> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   Server->locateSymbolAt(
       Params.textDocument.uri.file(), Params.position,
       Bind(
@@ -856,6 +860,7 @@ void ClangdLSPServer::onGoToDefinition(const TextDocumentPositionParams &Params,
 void ClangdLSPServer::onGoToDeclaration(
     const TextDocumentPositionParams &Params,
     Callback<std::vector<Location>> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   Server->locateSymbolAt(
       Params.textDocument.uri.file(), Params.position,
       Bind(
@@ -886,12 +891,14 @@ void ClangdLSPServer::onSwitchSourceHeader(
 void ClangdLSPServer::onDocumentHighlight(
     const TextDocumentPositionParams &Params,
     Callback<std::vector<DocumentHighlight>> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   Server->findDocumentHighlights(Params.textDocument.uri.file(),
                  Params.position, std::move(Reply));
 }

 void ClangdLSPServer::onHover(const TextDocumentPositionParams &Params,
                   Callback<llvm::Optional<Hover>> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   Server->findHover(Params.textDocument.uri.file(), Params.position,
             Bind(
             [this](decltype(Reply) Reply,
@@ -978,12 +985,14 @@ void ClangdLSPServer::onChangeConfiguration(

 void ClangdLSPServer::onReference(const ReferenceParams &Params,
                   Callback<std::vector<Location>> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   Server->findReferences(Params.textDocument.uri.file(), Params.position,
              CCOpts.Limit, std::move(Reply));
 }

 void ClangdLSPServer::onSymbolInfo(const TextDocumentPositionParams &Params,
                    Callback<std::vector<SymbolDetails>> Reply) {
+  trackUnopenedDocument(Params.textDocument.uri.file());
   Server->symbolInfo(Params.textDocument.uri.file(), Params.position,
              std::move(Reply));
 }
@@ -1147,5 +1156,79 @@ void ClangdLSPServer::reparseOpenedFiles() {
             WantDiagnostics::Auto);
 }

+void ClangdLSPServer::trackOpenedDocument(PathRef File, StringRef Contents) {
+  RequiresExplicitClientOpen.erase(File);
+  auto find = IsOpen.find(File);
+  auto is_open = find != IsOpen.end() ? find->second : false;
+  if (IsOpen.size() >= MaxCacheSize && !is_open) {
+    bool found = false;
+    std::map<std::string, bool>::iterator it;
+    for (it=IsOpen.begin(); it!=IsOpen.end(); ++it) {
+      if (! it->second) {
+        found = true;
+        break;
+      }
+    }
+    if (found) {
+      IsOpen.erase(File);
+      DraftMgr.removeDraft(it->first);
+      Server->removeDocument(it->first);
+    }
+    else {
+      MaxCacheSize *= 2;
+    }
+  }
+  if (is_open) {
+    // Lovely... File read from disk and here comes the client with a
+    // copy. Erase current copy if different, and redo.
+    auto contents = DraftMgr.getDraft(File);
+    std::string draft = contents.getValue();
+    std::string passed = Contents;
+    if ((!contents) && draft != passed) {
+      DraftMgr.addDraft(File, Contents);
+      Server->addDocument(File, Contents, WantDiagnostics::Yes);
+    }
+  }
+  IsOpen[File] = true;
+  DraftMgr.addDraft(File, Contents);
+  Server->addDocument(File, Contents, WantDiagnostics::Yes);
+}
+
+void ClangdLSPServer::trackUnopenedDocument(PathRef File) {
+  if (DraftMgr.getDraft(File) == llvm::None && RequiresExplicitClientOpen.find(File) == RequiresExplicitClientOpen.end()) {
+    std::ifstream ifs(File.data());
+    std::string contents((std::istreambuf_iterator<char>(ifs)),(std::istreambuf_iterator<char>()));
+    auto find = IsOpen.find(File);
+    auto is_open = find != IsOpen.end() ? find->second : false;
+    if (IsOpen.size() >= MaxCacheSize && !is_open) {
+      bool found = false;
+      std::map<std::string, bool>::iterator it;
+      for (it=IsOpen.begin(); it!=IsOpen.end(); ++it) {
+        if (! it->second) {
+          found = true;
+          break;
+        }
+      }
+      if (found) {
+        IsOpen.erase(File);
+        DraftMgr.removeDraft(it->first);
+        Server->removeDocument(it->first);
+      }
+      else {
+        MaxCacheSize *= 2;
+      }
+    }
+    IsOpen[File] = true;
+    DraftMgr.addDraft(File, contents);
+    Server->addDocument(File, contents, WantDiagnostics::Yes);
+  }
+}
+
+void ClangdLSPServer::untrackDocument(PathRef File) {
+  auto find = IsOpen.find(File);
+  if (find == IsOpen.end()) return;
+  IsOpen[File] = false;
+}
+
 } // namespace clangd
 } // namespace clang
diff --git a/clang-tools-extra/clangd/ClangdLSPServer.h b/clang-tools-extra/clangd/ClangdLSPServer.h
index 1c37750a185..b1a7abfe6d3 100644
--- a/clang-tools-extra/clangd/ClangdLSPServer.h
+++ b/clang-tools-extra/clangd/ClangdLSPServer.h
@@ -20,6 +20,7 @@
 #include "clang/Tooling/Core/Replacement.h"
 #include "llvm/ADT/Optional.h"
 #include <memory>
+#include <set>

 namespace clang {
 namespace clangd {
@@ -127,6 +128,14 @@ private:
   void publishDiagnostics(const URIForFile &File,
               std::vector<clangd::Diagnostic> Diagnostics);

+  /// Add tracking of a document for file open from client.
+  void trackOpenedDocument(PathRef File, StringRef Contents);
+  /// Add tracking if a document has not be opened from client.
+  void trackUnopenedDocument(PathRef File);
+  /// Decrement use count, and remove if no uses and cache should be
+  /// cleared.
+  void untrackDocument(PathRef File);
+
   /// Used to indicate that the 'shutdown' request was received from the
   /// Language Server client.
   bool ShutdownRequestReceived = false;
@@ -180,6 +189,13 @@ private:
   ClangdServer::Options ClangdServerOpts;
   llvm::Optional<ClangdServer> Server;
   llvm::Optional<OffsetEncoding> NegotiatedOffsetEncoding;
+
+  // List of files that require explicit open.
+  std::set<std::string> RequiresExplicitClientOpen;
+  // Explicit or implicit open of a document, so we can
+  // manage DraftStore and ClangServer.
+  std::map<std::string, bool> IsOpen;
+  int MaxCacheSize = 2;
 };
 } // namespace clangd
 } // namespace clang
diff --git a/clang-tools-extra/clangd/test/didopen-not-required.test b/clang-tools-extra/clangd/test/didopen-not-required.test
new file mode 100644
index 00000000000..65f53a55126
--- /dev/null
+++ b/clang-tools-extra/clangd/test/didopen-not-required.test
@@ -0,0 +1,35 @@
+# Test that we can set choose a configuration/build directly in the initialize
+# request.
+
+# RUN: rm -rf %t.dir2 && mkdir -p %t.dir2
+# RUN: echo 'void foo(); int main() { foo(); }' > %t.dir2/main.cpp
+# RUN: echo '[{"directory": "%/t.dir2", "command": "c++ main.cpp", "file": "INPUT_DIR/main.cpp"}]' > %t.dir2/compile_commands.json
+
+# RUN: sed -e "s|INPUT_DIR|%/t.dir2|g" %s > %t.test.1
+
+# On Windows, we need the URI in didOpen to look like "uri":"file:///C:/..."
+# (with the extra slash in the front), so we add it here.
+# RUN: sed -E -e "s|file://|file:///|g" %t.test.1 > %t.test
+
+# RUN: clangd -lit-test < %t.test | FileCheck -strict-whitespace %t.test
+{"jsonrpc":"2.0","id":0,"method":"initialize","params":{}}
+---
+{"jsonrpc":"2.0","id":1,"method":"textDocument/symbolInfo","params":{"textDocument":{"uri":"file://INPUT_DIR/main.cpp"},"position":{"line":0,"character":26}}}
+#      CHECK:  "id": 1,
+# CHECK-NEXT:  "jsonrpc": "2.0",
+# CHECK-NEXT:    "result": [
+# CHECK-NEXT:      {
+# CHECK-NEXT:        "containerName": null,
+# CHECK-NEXT:        "id": "CA2EBE44A1D76D2A",
+# CHECK-NEXT:        "name": "foo",
+# CHECK-NEXT:        "usr": "c:@F@foo#"
+# CHECK-NEXT:      }
+# CHECK-NEXT:    ]
+# CHECK-NEXT:}
+---
+{"jsonrpc":"2.0","id":3,"method":"shutdown"}
+---
+{"jsonrpc":"2.0","method":"exit"}
+
+# RUN: rm -rf %t.dir2
+# RUN: rm -rf %t.dir2 %t.test %t.test.1

I’m sorry if I wasn’t clear in the original email thread, just wanted to re-iterate my view on this issue.
Although protocol does not require didOpen, clangd code is simpler if we assume didOpen is required.

This assumption works with major clients we care about, namely VSCode and YouCompleteMe.

I would rather avoid doing this at all and declare that clangd only works with clients that send didOpen/didClose requests. Supporting every aspect of the LSP spec or supporting every possible LSP client will take a lot of resources and increase complexity of our code, I don’t think we should be aiming for this.

Instead, I would suggest raising this issues with the authors of the corresponding LSP client. If they feel clangd can be useful for their ecosystem, they should be able to make a simple change to send didOpen and didClose on their side.