Hi,
Bruno (CCed), Duncan (CCed) and I have been exploring whether we can migrate some of our clients to explicit modules. As part of this work, Duncan and I developed a prototype dependency scanning service (clang-scan-deps) that computes the set of file dependencies for a particular compiler invocation, using the optimizations outlined below. On one of our machines, the tool makes non-modular dependency scanning up to 10 times faster for particular workloads (e.g. the llc target, 1542 C++ files) compared to parallel invocations of clang with -Eonly. We are still in the early stages of proper modules support, but our initial crude prototype achieves a speedup of up to 4x when run on the first 1000 files from clang’s compilation database for a build of LLVM with modules turned on.
We still run the full Clang preprocessor. Here’s what we do to reduce its workload:
- Minimize sources by stripping away unused tokens. We keep only the interesting PP directives (#define, #if, #include, etc.), i.e. those that might impact the set of dependencies.
- Assume the filesystem is immutable for one run of the service, and cache the files and their minimized contents in memory in a global cache.
- Skip over excluded preprocessor ranges by advancing the lexer's buffer pointer past them instead of lexing the skipped tokens.
We intend to upstream this service in the coming months. We would also like to integrate it into Clangd, as part of our migration to Clangd, to help us determine a good compilation command for a header file from a set of known compilation invocations.
I posted a very rough WIP patch on Phabricator (https://reviews.llvm.org/D53354). It’s based on LLVM checkout r343343. Please take a look if you’re interested.
Duncan, Bruno and I will be at the LLVM dev meeting. We are interested in discussing this prototype and collecting feedback from anyone who might be interested in this work.
Thanks,
Alex