As discussed in the review for #147630, the first implementation step of the proposal (1. Checking for Module Presence) duplicates much of the logic from DependencyDirectivesScanner.cpp. Including DependencyDirectivesScanner.h would require linking the driver against clangLex.
The second step (2. Dependency Scanning) requires using Clang’s dependency scanning tooling. This adds dependencies on the following libraries:
To address this, we suggest four possible solutions:
Link the driver against all listed dependencies.
Move the dependency scan and/or check for C++20 module usage to -cc1, and have the driver spawn a separate process.
Make it configurable at build time whether the driver supports driver-managed module builds.
Use a callback in the driver to perform the check/scan, so it only works if you link a newly added dependency scanning library at a higher level and pass in the right arguments.
I’m not super comfortable with the complexity that would add to our library layering. For example, the driver is what’s responsible for setting up the diagnostics engine to be used by the compiler, so if the driver needs to use the lexer and the lexer decides it wants to emit a diagnostic, we’ve got a chicken-and-egg problem unless everything is held just right.
I think I would expect something more like option 4; the driver gets a particular command to do the depdendency scanning and so it calls into a different tool to do that work (basically, it’s similar to the driver being used to drive the linker instead of the compiler frontend). However, this isn’t a strongly held opinion, so I’m definitely interested to hear what others think.
Just to clarify, are you referring to option 2?
For option 4, the idea is to include the dependency scanning library higher up, such as in clang/tools/driver/driver.cpp, where many of the required libraries are already linked:
From there, we could pass a callback to the Driver to handle the dependency scan.
Any logic related to driver-managed module builds that doesn’t require additional library linking would still live in clang/lib/Driver/.
I’d like to make another argument for linking the libraries in some form.
The first step in implementing the RFC is to add a check to determine whether driver managed module builds should be implicitly enabled, without adding overhead to compilations that don’t use C++ named modules. This is done by skipping to the first line that might contain a C++ named module directive and lexing only that line, and currently requires linking clangLex.
Since the check should not add any significant overhead, using -cc1 to avoid linking clangLex is not possible for this. (The current approach adds negligible overhead; benchmarks).
We either have to link clangLex or duplicate some lexing logic.
For the actual dependency scan, using -cc1 would require serializing the dependency scan results, and in total, add even more overhead to driver-managed module builds.
Regarding the diagnostics specifically, I am able to emit diagnostics created by the scanning worker’s lexer using the driver’s diagnostic engine, even though it is more complex than emitting regular diagnostics. It works by storing them in a format similar to ASTUnit::StandaloneDiagnostic.
Here’s how I do it on my current dev branch: Definition: StandaloneDiagnostic, Reporting the StandaloneDiagostics from the Driver
I think I lean towards option 1. There are vanishingly few applications of Driver-as-a-library that don’t use the rest of Clang. Please keep the coupling with dependency scanning loose, but I think given the feature that needs to be implemented, I see why these deps are going to be necessary.
I think having jobs create jobs or any of the other options mentioned creates more complexity than these library dependencies.
A group of us (including the OP) discussed this at my office hours today and the group position was that this is reasonable to move forward with so long as the lexer can be correct by-construction or have sufficient assertions to help us catch any sort of “holding it wrong” problems where the lexer would need information the driver is responsible which isn’t set up yet. (Things like file system paths, diagnostic options, enabled features, etc.)
Eventually we think the driver will need to be reworked to have more of a pipeline of jobs, more akin to how the Swift driver works. But that’s a bigger project for another day.