I’m using clangd on Windows/Linux to index my company’s project. The project is organized (in terms of cpp files and header files) in an, um… unorthodox way, so we rely heavily on precompiled headers.
The structure is layered and looks roughly as follows:
Layer 1: a bunch of headers that all are included into one stdafx.h on that level
Layer 2: first includes stdafx.h from Layer 1 and then also includes a bunch of own headers into own stdafx.h
Layer 3: includes stdafx.h from Layer 2 and then… you get the idea.
So while building, we create the precompiled header for the Nth layer once and re-use it via the -include-pch flag when creating the PCH for the (N+1)th layer. Overall this gives acceptable full-rebuild times on a single machine.
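For concreteness, the chained build described above can be sketched as the following script. It only constructs the compiler command lines and does not run them. The layer directory names, the `.pch` output naming and the helper `chained_pch_commands` are hypothetical; `-x c++-header`, `-include-pch` and `-o` are regular clang options.

```python
from pathlib import PurePosixPath


def chained_pch_commands(layers, compiler="clang++"):
    """Return one argv list per layer; layer N+1 reuses the PCH of layer N.

    `layers` is an ordered list of directory names (hypothetical layout:
    each directory holds its own stdafx.h).
    """
    commands = []
    previous_pch = None
    for layer in layers:
        header = PurePosixPath(layer) / "stdafx.h"
        pch = PurePosixPath(layer) / "stdafx.h.pch"
        argv = [compiler, "-x", "c++-header", str(header)]
        if previous_pch is not None:
            # Chain onto the PCH produced for the layer below.
            argv += ["-include-pch", str(previous_pch)]
        argv += ["-o", str(pch)]
        commands.append(argv)
        previous_pch = pch
    return commands


if __name__ == "__main__":
    for argv in chained_pch_commands(["layer1", "layer2", "layer3"]):
        print(" ".join(argv))
```

Only the first layer is compiled from scratch; every later command passes the previous layer's output through -include-pch.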
Now, in order to make clangd work on our project, I had to ‘hack’ it a bit so that it understands certain source file dependencies and correctly infers compilation commands for particular cpp/header files in our case. It all works as expected. However, performance is now somewhat of an issue, and I believe it could be addressed if I could ‘teach’ clangd to deal with PCH the same way we do during the build process.
If someone could point me to where the relevant code in clangd is (e.g. for building/re-using the preamble), that would be nice. And maybe some hints/advice along the way.
You can find the relevant pieces in:
That being said:
Compiler is the common infra used by all components of clangd that invoke clang (the other 3 in this case). It also drops any PCH-related flags coming from the compile flags, because the PCH format is not stable and in theory PCHs cannot even be reused between different versions of clang. Hence, unless the clang you use to generate those PCHs and the clangd you use in your editor are the same version, you can’t really re-use them.
We turn it off for a good reason: people use clangd even when they are not building their project with clang, and this was causing lots of crashes.
We already do something similar with Preamble and ParsedAST to improve performance for currently active files: we build the preamble for a file only when it is invalidated (e.g. you change the contents of a header, or edit the preamble section of the current file). Hence I assume the performance issue you mention is not really about the latency of current file operations (?).
As for background indexing, unfortunately this is worse. We don’t really re-use much between the indexing of different translation units, since the process is meant to be incremental. But this causes trouble on projects with infeasible build times. If that is indeed the problematic component, you can either try to hack up clangd to somehow make use of preamble-like information across translation units during indexing, or use our centralized indexing/serving solution, remote-index.
thanks for the answer!
the performance issue you mention is not really about the latency of current file operations
no, not really. I’d say it’s because the preamble in our case gets expensive to re-build if it’s further down the hierarchy, as it includes basically everything from the previous layers. So I’d really like to take advantage of clang’s -include-pch option.
I understand the logic behind throwing away any PCH-related flags. However, in our case I don’t intend to actually re-use the PCHs from our builds, but rather to ‘teach’ clangd to understand our hierarchy of PCHs, specifically the dependencies between them, so that re-building a particular preamble would not necessarily require re-building the preambles it includes when those have not actually changed. And yes, in this case the background index would be my first target, I suppose.
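A minimal sketch of that invalidation rule, assuming each layer’s preamble depends only on its own headers plus the preamble of the layer directly below it. This is a toy model, not clangd code; `layers_to_rebuild` and the layer names are made up for illustration.

```python
def layers_to_rebuild(layers, changed):
    """Given layers ordered bottom-up and the set of layers whose own
    headers changed, return the layers whose preamble must be rebuilt.

    A layer is rebuilt if its own headers changed or if any layer below
    it was rebuilt; everything below the first change is reused.
    """
    rebuild = []
    dirty = False
    for layer in layers:
        if layer in changed:
            dirty = True
        if dirty:
            rebuild.append(layer)
    return rebuild
```

With three layers, a change in layer3 rebuilds only layer3, while a change in layer1 rebuilds all three.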
Interestingly all of this is explained in this rather unexpected location: Precompiled Header and Modules Internals — Clang 16.0.0git documentation
It seems chained precompiled headers were explicitly introduced for this use case, including the preamble mechanism mentioned above.
clangd is probably a bit over-conservative here (does it even remove forced includes?).
Yes, it shouldn’t try to re-use existing PCHs if the clang version that created them is different (but isn’t that detectable from the file header or something?).
But if the version matches, maybe? Or it could transparently rebase the relative .pch paths to its .cache path and create its own hierarchy of PCHs.