clangd, completion in header files

Hello list,

thank you for all the work on clangd. I tried it for the first time
with VSCode and I'm really impressed how useful it is just out of the
box.

I did, however, encounter a problem with code completion in header files
and I would like to know if it's a bug in clangd or a problem in my
setup. I work on a C++ project which has headers with the .h extension, and
clangd incorrectly assumes they are C code: it emits (among
other things) an "unknown type name 'namespace'" diagnostic for
the namespace definition.

I ran clangd with -input-mirror-file for a while and I can see
textDocument/didOpen calls for the .h headers with "languageId" set to
"cpp". If I rename a header to .hpp, the problem disappears. I also
tried configuring a custom file extension association for C++ in VSCode
and that worked as well. So I believe something must be wrong specifically
with .h (i.e. clangd seems not to respect languageId for .h files).

Could you please guide me on how to diagnose the problem better so I can
eventually file a bug report?

I also wonder how completion is supposed to work in headers in
general. Are there heuristics to guess the compilation flags,
given that headers are not in the compilation database (at least when
it is generated by cmake)?

Regards,

Jan

+Sam and Ilya

Thanks for trying things out, and sorry for the bad header-file experience.

> Hello list,
>
> thank you for all the work on clangd. I tried it for the first time
> with VSCode and I’m really impressed how useful it is just out of the
> box.
>
> I however encountered a problem with code completion in header files
> and I would like to know if it’s a bug in clangd or a problem in my
> setup.

Yeah, this is a known missing feature - guessing the right compile flags for headers.
If you have a compilation database (e.g. compile_commands.json) that provides compile commands for headers, then clangd works as expected. However most build systems don’t provide this information. (Bazel, for one, does).

When there’s no compile command provided, we fall back to ‘clang $filename’. Clang treats .h files as C.
But it’s also missing a bunch of other information: include paths, defines etc that are likely required to get useful results.
So I don’t think just diverging from clang here will actually help many projects (feel free to try this out by editing compile_commands.json - if this works for you it’d be good to know).

What can we do then? A few ideas:

  • we can preprocess all the files from compile_commands.json on startup (https://reviews.llvm.org/D41911 is the start of this). But we can’t guarantee we get to the file you care about in time, so behavior will be erratic.
  • we can pick a compile command arbitrarily and take its flags, which will get include paths and defines right if they’re uniform across the project
  • we can just refuse to provide any diagnostics/completions where we don’t have a known-good set of flags.

I’ve dumped these thoughts into https://bugs.llvm.org/show_bug.cgi?id=36899 - I also want to solicit some thoughts on this problem at the BoF session at the next LLVM dev meeting.

I’d like to find time to work on this in the next quarter, but I’m not certain - if anyone is interested in attacking this problem, let me know!

> I work on a C++ project which has headers with .h extension and
> clangd incorrectly assumes it’s a C code because it emits (besides
> other things) “unknown type name ‘namespace’” diagnostics message for
> the namespace definition.
>
> I run clangd -input-mirror-file for a while and I can see
> textDocument/didOpen calls for the .h headers with “languageId” set to
> “cpp”. If I rename a header to .hpp, the problem disappears. I also
> tried configuring custom file extension association for C++ in VSCode
> and it worked as well. So I believe there must be something wrong just
> with .h (i.e. clangd seems not to respect languageId for .h files).
>
> Please, can you guide me how to diagnose the problem better so I can
> eventually fill a bug report?

I think you’ve diagnosed it pretty well, but you can get a bit of extra information from the logs clangd writes to stderr. In VSCode, these are visible in the “output” pane, under “clang language server”. This will include the command used to build each file.

> I also wonder how the completion is supposed to work in headers in
> general. Is there some heuristics to guess the compilation flags
> because the headers are not in the compilation database (at least when
> generated by cmake)?
>
> Regards,
>
> Jan

Cheers, Sam

Another trick that we apply in KDevelop:

- Try to find the *.cpp file for the header based on some file naming
heuristics and use that instead.

This works well in the majority of cases, but fails for system headers,
header-only utilities, and headers you just started to write where the
accompanying *.cpp file is still empty.
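
To make the naming heuristic concrete, here is a minimal sketch in Python. It is
not KDevelop's actual code: the function name, the extension list and the
"prefer the same directory" tie-break are all assumptions made for illustration.

```python
from pathlib import Path

# Source-file extensions to consider; this list is an assumption.
SOURCE_EXTENSIONS = (".cpp", ".cc", ".cxx", ".c")


def find_companion_source(header, candidates):
    """Return the candidate source file whose stem matches the header's stem.

    Candidates in the header's own directory are preferred.
    Returns None when nothing matches (e.g. header-only utilities).
    """
    header = Path(header)
    matches = [
        Path(c)
        for c in candidates
        if Path(c).stem == header.stem and Path(c).suffix in SOURCE_EXTENSIONS
    ]
    if not matches:
        return None
    # Stable sort: same-directory matches first, input order otherwise.
    matches.sort(key=lambda p: p.parent != header.parent)
    return matches[0]
```

A caller would pass the list of files known to the compilation database as
`candidates` and reuse the flags of whatever file comes back. A `None` result
corresponds to the failure cases listed above.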

Bye

Hi,

In a YCM extension, we have the following heuristics to get compile
flags for a header:
- Try to find a TU for the header in the compile DB by using the
basename of the header.
- If still not found then try to browse the DB to find one TU which
uses the directory of our header as an include path (-I, -isystem).
- If still not found then we try to get the first sibling TU in the
directory of the header file.
- If still not found then fall back to the first entry in the compile
DB and use its flags.
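
As a rough illustration of the four-step fallback above, here is a sketch in
Python. It is not the extension's real code: the function names and the returned
reason strings are hypothetical, and it assumes entries shaped like
compile_commands.json records ("directory", "file", and either "arguments" or
"command").

```python
import os
import shlex


def entry_args(entry):
    """Return the argument list of a compilation database entry."""
    if "arguments" in entry:
        return list(entry["arguments"])
    return shlex.split(entry["command"])


def include_dirs(arguments, directory):
    """Collect -I / -isystem directories, resolved against `directory`."""
    dirs = []
    i = 0
    while i < len(arguments):
        arg = arguments[i]
        path = None
        if arg in ("-I", "-isystem") and i + 1 < len(arguments):
            path = arguments[i + 1]  # separate form: -I dir
            i += 1
        elif arg.startswith("-isystem") and len(arg) > len("-isystem"):
            path = arg[len("-isystem"):]  # joined form: -isystemdir
        elif arg.startswith("-I") and len(arg) > 2:
            path = arg[2:]  # joined form: -Idir
        if path is not None:
            dirs.append(os.path.normpath(os.path.join(directory, path)))
        i += 1
    return dirs


def guess_entry_for_header(header, entries):
    """Return (entry, reason) following the four fallback steps in order."""
    header = os.path.normpath(header)
    header_dir = os.path.dirname(header)
    header_stem = os.path.splitext(os.path.basename(header))[0]

    def source_path(entry):
        return os.path.normpath(os.path.join(entry["directory"], entry["file"]))

    # 1. A TU with the same basename as the header.
    for e in entries:
        if os.path.splitext(os.path.basename(e["file"]))[0] == header_stem:
            return e, "basename"
    # 2. A TU that uses the header's directory as an include path.
    for e in entries:
        if header_dir in include_dirs(entry_args(e), e["directory"]):
            return e, "include-path"
    # 3. The first sibling TU in the header's directory.
    for e in entries:
        if os.path.dirname(source_path(e)) == header_dir:
            return e, "sibling"
    # 4. Fall back to the first entry in the database.
    if entries:
        return entries[0], "first-entry"
    return None, "none"
```

Each step is a linear scan of the database, which is consistent with the
heuristic getting slow on very large projects.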

In the last 3 years it has been working remarkably well for us (7
users). It handles nicely the cases when we add a new header.
It may be a bit slow on very large projects and it does not handle
multiplatform builds.
I hope you find some of these ideas helpful. The extension is available here:

Cheers,
Gabor

Thanks Milian and Gábor,
I’d been a bit reluctant about using filename heuristics but both your experience and concrete suggestions are compelling.
This has the side benefit that it could reasonably fit inside the CompilationDatabase abstraction, and if we had a high quality implementation we could even make this behavior the default for compile_commands.json processing in all tools. (Headers etc shouldn’t be enumerated of course, but if a tool explicitly asks how to compile one…)
I’ll try this idea out and see how it works, unless someone gets to it first.
Cheers, Sam

There is another place where such a heuristic may be relevant.

A project may build multiple executables, but there is no way to tell
which file a global function call (or variable reference) resolves to.
(Linking options are not specified in the JSON compilation database, and
I imagine specifying them would be infeasible.)

I personally prefer listing all definitions sharing the same Clang USR,
but there could be heuristics to find the most likely file that defines
the symbol and rank it first in the textDocument/definition response list.

I wonder if, as a fallback, `clang -x objective-c++ $filename` would be more generally useful... it puts Clang into "accept almost everything" mode. This is the mode LLDB uses for expression evaluation IIRC.

That’s a good point. Somehow I forgot obj-c++ was a thing… We should switch this.

One silly downside though - I find just “-x c++” won’t actually give useful results on real projects.
Does this match others’ experience?
The “Unknown type name ‘namespace’” error I tend to get from the current behavior is at least predictable and recognizable. If we were parsing as obj-c++ but missing other flags, problems could be more subtle.

But we could of course do even better here - if the CDB could report when a command is fallback/inferred, clangd could insert a warning at the first line of the file. “No compile flags found, using ‘clang -x objective-c++’” or so.
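
A minimal sketch of that idea, in Python for brevity rather than clangd's C++.
The function names and the dict-based "database" are hypothetical. The
diagnostic follows the LSP Diagnostic layout (zero-based positions,
severity 2 = Warning), and the message text is the one suggested above.

```python
def fallback_command(filename):
    """Command used when the database has no entry for `filename`."""
    return ["clang", "-x", "objective-c++", filename]


def command_for(filename, database):
    """Return (command, inferred) for `filename`.

    `database` is a hypothetical dict mapping file names to argument lists.
    `inferred` is True when the fallback command had to be used.
    """
    if filename in database:
        return database[filename], False
    return fallback_command(filename), True


def fallback_warning(command):
    """Build an LSP-style warning pinned to the first line of the file."""
    shown = " ".join(command[:-1])  # drop the trailing file name
    return {
        "range": {
            "start": {"line": 0, "character": 0},
            "end": {"line": 0, "character": 0},
        },
        "severity": 2,  # Warning
        "message": "No compile flags found, using '" + shown + "'",
    }
```

The point is only that the lookup reports whether the command was inferred, so
the caller can decide to attach the warning.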

What do you think?

I think warning that we don’t know the right compile flags is a great idea.

Thank you all for the responses.

> Yeah, this is a known missing feature - guessing the right compile flags for
> headers.
> If you have a compilation database (e.g. compile_commands.json) that
> provides compile commands for headers, then clangd works as expected.
> However most build systems don't provide this information. (Bazel, for one,
> does).

I see. That makes perfect sense. We use cmake which doesn't export
compile commands for headers unfortunately.

> When there's no compile command provided, we fall back to 'clang $filename'.
> Clang treats .h files as C.
> But it's also missing a bunch of other information: include paths, defines
> etc that are likely required to get useful results.
> So I don't think just diverging from clang here will actually help many
> projects (feel free to try this out by editing compile_commands.json - if
> this works for you it'd be good to know).

I have tried adding a compile_commands.json entry for a header with
flags matching the cpp file. The header was then recognized as C++
(the earlier errors on keywords like namespace were gone), but
something was still wrong. Some STL declarations were not found,
std::shared_ptr for instance. Running the compiler command from the
terminal showed 'clang-6.0: warning: treating 'c-header' input as
'c++-header' when in C++ mode, this behavior is deprecated
[-Wdeprecated]', which made me try adding '-x c++'. That fixed the
remaining problem, but it's strange, as the code was already recognized
as C++ and the command contains an explicit -std=c++14 option.
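
For anyone who wants to repeat the experiment, here is a small Python sketch
that derives a header entry from an existing source entry. The function name is
made up, and it assumes the entry uses the "arguments" list form of
compile_commands.json. It leaves any -o option untouched.

```python
import copy


def make_header_entry(source_entry, header):
    """Copy a source file's entry, pointing it at `header` and adding -x c++."""
    entry = copy.deepcopy(source_entry)
    # Replace the source file argument with the header path.
    args = [header if a == source_entry["file"] else a for a in entry["arguments"]]
    # -x only affects inputs that follow it, so put it right after the compiler.
    entry["arguments"] = [args[0], "-x", "c++"] + args[1:]
    entry["file"] = header
    return entry
```

Appending the returned dict to the list loaded from compile_commands.json and
writing the file back reproduces the manual edit described above.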

> What can we do then? A few ideas:

There are some good suggestions. I'm sure you will find some good
default and I'm looking forward to trying it out. I can just confirm that
the heuristic used by YouCompleteMe is quite nice because that's what
I've been using so far. It's not optimal but it works reasonably well.

Cheers,

Jan

Just landed r329582, which does some guessing when there’s no compile command available.
Please try it out!

Related changes I’d still like to make:

  • parse *.h as obj-c++ when there’s no compilation database at all (out for review)
  • inject a diagnostic when we’re using a fallback/guessed command for diagnostics