RFC: refactoring libclangDriver/libclangFrontend to share with Flang


This is a refined design of how the new Flang driver will re-use the code currently available in Clang without depending on Clang (long-term goal). This was initially discussed in [1]. Based on the feedback and after a few weeks of prototyping [2], we are proposing a much smaller set of changes. Below is a detailed summary of the design and how it will affect Clang. Your input is much appreciated!

* Make libclangDriver independent of Clang by:
    ** Creating a higher-level, reduced interface over DiagnosticsEngine for compiler drivers to use that does not require Clang's SourceManager
    ** Lifting the TableGen backend for DiagnosticDriverKinds
* Move libclangDriver (together with the TableGen backend for DiagnosticDriverKinds) out of Clang

# *THINGS TO RE-USE*: libclangDriver
The Flang driver (i.e. "flang"), like the Clang driver (i.e. "clang"), will be implemented in terms of libclangDriver. libclangDriver can already distinguish between various driver modes [3], including a dedicated mode for Flang: clang::driver::Driver::FlangMode. We already use this mechanism (via ParsedClangName [4]) to start the driver in Driver::FlangMode (this is done inside "flang").
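To illustrate the idea of picking a driver mode from the executable name, here is a minimal, self-contained sketch. It is not the libclangDriver implementation: `inferDriverMode`, `SuffixMode` and the small suffix table are hypothetical simplifications, and the real logic also handles target prefixes and version suffixes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical table entry: an executable-name suffix and the mode it implies.
struct SuffixMode {
  const char *suffix;
  const char *mode;
};

// Returns the driver mode implied by the executable name, or an empty string
// if no known suffix matches. Simplified for illustration only.
std::string inferDriverMode(const std::string &programPath) {
  assert(!programPath.empty() && "argv[0] should not be empty");

  // Keep only the file name component of the path.
  const std::size_t slash = programPath.find_last_of("/\\");
  const std::string name =
      slash == std::string::npos ? programPath : programPath.substr(slash + 1);

  // Assumed mapping; the mode strings are illustrative.
  const SuffixMode table[] = {
      {"clang++", "g++"},
      {"clang", "gcc"},
      {"flang", "flang"},
  };

  for (const SuffixMode &entry : table) {
    const std::string suffix = entry.suffix;
    // Match when the file name ends with the suffix.
    if (name.size() >= suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0)
      return entry.mode;
  }
  return "";
}
```

With this sketch, an executable installed as "flang" would start in the Flang mode without any extra command-line flag, which is the behaviour the paragraph above relies on.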

One of the key tasks of the driver is to parse the command line options and translate them into Actions. This part seems re-usable as is.

Next, based on the generated Actions the driver creates jobs (i.e. instances of clang::driver::Command). At this point (assuming that the driver is in
* a particular ToolChain/Tool is selected (for the preprocess phase [5], the ToolChain is implemented in Flang.cpp [6] and the selected tool is simply "flang -fc1")
* compiler driver options (e.g. options for "flang") are translated into options for the selected tool (e.g. options for "flang -fc1" in the preprocess phase)

The required top-level logic for this is already available in libclangDriver [6]. Any new logic that will apply only to Flang will be implemented in clang::driver::tools::Flang.
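The option translation step can be pictured with a short sketch. This is only an illustration under stated assumptions: `Phase`, `startsWith` and `buildFrontendArgs` are hypothetical names, the set of forwarded options is made up for the example, and the real code would operate on libclangDriver's argument lists rather than plain strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical phases, loosely modelled on the driver's notion of phases.
enum class Phase { Preprocess, Compile };

// True if `s` begins with `prefix`.
bool startsWith(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Build the argument list of a "flang -fc1" job from the arguments that were
// given to the compiler driver ("flang").
std::vector<std::string>
buildFrontendArgs(Phase phase, const std::vector<std::string> &driverArgs,
                  const std::string &input) {
  assert(!input.empty() && "a job needs an input file");

  std::vector<std::string> out = {"-fc1"};

  // The selected phase becomes an explicit frontend action. Other phases
  // would add their own action flag here (omitted in this sketch).
  if (phase == Phase::Preprocess)
    out.push_back("-E");

  // Forward only the options the frontend is assumed to understand; in this
  // sketch that is the preprocessor options -D, -U and -I.
  for (const std::string &arg : driverArgs) {
    if (startsWith(arg, "-D") || startsWith(arg, "-U") || startsWith(arg, "-I"))
      out.push_back(arg);
  }

  out.push_back(input);
  return out;
}
```

For example, a driver invocation with `-DFOO=1 -v -Iinc` on `a.f90` in the preprocess phase would yield the frontend arguments `-fc1 -E -DFOO=1 -Iinc a.f90` in this sketch, with the driver-only `-v` dropped.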

# *THINGS NOT TO RE-USE*: libclangFrontend
Once a job representing a call to the Flang frontend driver is constructed and a dedicated subprocess is created, the Flang frontend driver (i.e. "flang -fc1") takes care of the rest. "flang -fc1" will be:
* a separate entity (akin to "clang -cc1" [7])
* independent of libclangFrontend (and "clang -cc1")
* implemented in terms of libflangFrontend (this library will be part of the Flang subproject)

So far the implementation of "flang -fc1" and libflangFrontend (available in our fork [2]) has been heavily inspired by "clang -cc1" and libclangFrontend, but otherwise is written from scratch. The Flang frontend driver is unlikely to re-use any code from Clang's frontend driver at this stage. This seems consistent with what people suggested in the past (in particular, see this reply from Richard Smith [8]).

Clang's SourceManager is only really needed by DiagnosticsEngine, but given libclangDriver's limited usage of DiagnosticsEngine, we should be able to remove the dependency on SourceManager completely. From what we can see, libclangDriver doesn't really need it.

The end goal is to have a Flang compiler driver implemented in terms of libclangDriver that does not depend on Clang. This means extracting libclangDriver from Clang and moving it to a separate sub-project. To this end we have to make sure that libclangDriver no longer depends on Clang. This is the list of dependencies that we have identified:
* DiagnosticsEngine (+ DiagnosticOptions + DiagnosticIDs + DiagnosticConsumer)
* TableGen backend for generating error/warning definitions for DiagnosticsEngine

Although this list is short (perhaps we missed something?), it contains some rather complex and pervasive Clang classes that belong in libclangBasic. Fortunately, libclangDriver uses these classes to a rather limited extent.

DiagnosticsEngine is used by the driver to print warnings and errors about mistakes in the options supplied by the user. This is rather basic usage compared to reporting errors/warnings generated by the parser or semantic analysis (e.g. we don't care about specific locations in files, macro expansions, etc). We propose creating a thin layer above DiagnosticsEngine to satisfy the dependencies of libclangDriver. This seems feasible and shouldn't be too disruptive.
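To make the idea of a thin layer more concrete, here is one possible shape for it, written as a self-contained sketch. Everything in it is an assumption for illustration: `DriverDiagnostics`, `Severity` and the `Sink` callback are hypothetical names, and in a real implementation the sink would forward to DiagnosticsEngine. The point is only that the driver-facing interface needs a severity, a message and its arguments, but no source locations and hence no SourceManager.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

enum class Severity { Warning, Error };

// Hypothetical driver-facing diagnostics interface with no source locations.
class DriverDiagnostics {
public:
  // Receives fully formatted messages; a real sink might wrap DiagnosticsEngine.
  using Sink = std::function<void(Severity, const std::string &)>;

  explicit DriverDiagnostics(Sink sink) : sink_(std::move(sink)) {
    assert(sink_ && "a sink is required");
  }

  // Report a message, replacing %0, %1, ... with the given arguments.
  // Simplification: assumes fewer than ten arguments.
  void report(Severity severity, std::string format,
              const std::vector<std::string> &args) {
    for (std::size_t i = 0; i < args.size(); ++i) {
      const std::string placeholder = "%" + std::to_string(i);
      std::size_t pos = 0;
      while ((pos = format.find(placeholder, pos)) != std::string::npos) {
        format.replace(pos, placeholder.size(), args[i]);
        pos += args[i].size(); // skip past the inserted text
      }
    }
    if (severity == Severity::Error)
      ++numErrors_;
    sink_(severity, format);
  }

  unsigned numErrors() const { return numErrors_; }

private:
  Sink sink_;
  unsigned numErrors_ = 0;
};
```

A driver would then report, say, an unknown option by calling `report(Severity::Error, "unknown argument: '%0'", {"-fbogus"})`, without ever constructing a source location.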

The TableGen backend is required to generate DiagnosticDriverKinds.inc, i.e. the libclangDriver specific errors/warnings. Moving the corresponding TableGen backend out of Clang (together with libclangDriver) seems like the most straightforward approach to this. Any frontend specific diagnostic definitions should remain in Clang. Any use of these within libclangDriver can be dealt with on a case-by-case basis.

To handle Flang options we propose to:
* Use ClangFlags [9] to identify Flang options (we will add a dedicated enum for Flang, e.g. FlangOption)
* Tweak Driver::PrintHelp [10] to only display the appropriate options depending on the driver mode
* Add new Flang options for libclangDriver to the main DriverOptTable [11] table, perhaps via a separate *.td file

We think this has the benefit of being simple and extending existing interfaces. It may be worth investigating a way to make this scale out a bit more - cf. [12] - and we propose that as a future enhancement. We should emphasise that currently libclangDriver creates only one instance of DriverOptTable [11] that holds all available options. In our design this table will hold options for both Clang and Flang.
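The combination of a shared option table, a per-option flag and mode-dependent help output can be sketched as follows. This is a simplified illustration, not the existing ClangFlags or DriverOptTable code: `OptionFlags`, `OptionInfo`, `DriverMode` and `visibleOptions` are hypothetical, as are the flag values and the example option names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical visibility flags in the style of ClangFlags.
enum OptionFlags : unsigned {
  ClangOption = 1u << 0, // option is meaningful for "clang"
  FlangOption = 1u << 1, // option is meaningful for "flang"
};

enum class DriverMode { Clang, Flang };

// One row of a single option table shared by both drivers.
struct OptionInfo {
  std::string name;
  unsigned flags;
};

// Select the options that the help output should list for the given mode.
std::vector<std::string> visibleOptions(const std::vector<OptionInfo> &table,
                                        DriverMode mode) {
  const unsigned wanted =
      mode == DriverMode::Flang ? FlangOption : ClangOption;

  std::vector<std::string> names;
  for (const OptionInfo &opt : table) {
    assert(!opt.name.empty() && "options must be named");
    if (opt.flags & wanted)
      names.push_back(opt.name);
  }
  return names;
}
```

An option shared by both drivers simply carries both flags, so a single table can serve both drivers while each help output stays focused on the relevant subset.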

Flang will re-use many of the options already available via libclangDriver. C and C++ specific options are also relevant. A common pattern in HPC apps is mixed C++ and Fortran use in the same source base. In such mixed-source cases, it is useful for the compiler driver to be able to handle both at the same time. We will also add some new options, but it's unlikely to be a long list. Taking gfortran as a reference, the new options would be a very small fraction of what libclangDriver already supports.

The proposed changes (summarized at the top) are relatively small and will only affect libclangDriver. We would like to start upstreaming our patches into Flang at the same time as lifting libclangDriver out of Clang into a separate project. This means submitting some patches into Clang while libclangDriver is still part of Clang. If the overall plan sounds sensible, we will shortly prepare a separate, more detailed RFC that focuses on the usage of DiagnosticsEngine in libclangDriver.

All input appreciated.

Thanks for reading.

Andrzej Warzynski
On behalf of the Arm Fortran Team

[1] http://lists.llvm.org/pipermail/llvm-dev/2020-June/141994.html
[2] https://github.com/banach-space/llvm-project
[3] llvm-project/Driver.h at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[4] llvm-project/ToolChain.h at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[5] llvm-project/Phases.h at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[6] llvm-project/Flang.cpp at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[7] llvm-project/cc1_main.cpp at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[8] http://lists.llvm.org/pipermail/llvm-dev/2020-June/142024.html
[9] llvm-project/Options.h at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[10] llvm-project/Driver.cpp at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[11] llvm-project/DriverOptions.cpp at cbb3571b0df5a0948602aa4d2b913b64270143ff · llvm/llvm-project · GitHub
[12] http://lists.llvm.org/pipermail/llvm-dev/2020-July/143745.html

Thanks for the writeup, this seems like a good direction to me. Do you have a concrete proposal for where in the LLVM project the Driver will move to?

Hi Richard,

> Do you have a concrete proposal for where in the LLVM project the Driver will move to?

When we discussed this last time, I proposed [1] to move this new library to a dedicated subproject: `frontend-support`. However, since then the scope of this has been narrowed down.

I think that first we need to make libclangDriver independent of any Clang libraries. Once that's done, moving the new library to a dedicated sub-project should be relatively easy. The actual name/location could be discussed then, with a much shorter RFC :) Until then everything stays in Clang.

Btw, do you have any preferences?


[1] http://lists.llvm.org/pipermail/llvm-dev/2020-June/142186.html