[F18/AST] Using clang tooling with f18

[Reposted with the correct clang mailing list -- sorry about the
duplicate...]

Hi all,

We've been having a bit of discussion over on flang-dev and wanted to
bring in clang people to comment/brainstorm. Here is the original post
that kicked this off on flang-dev:

  I was hoping f18 would lower to something akin to clang's AST.
  Obviously clang's AST doesn't directly apply to Fortran but perhaps
  some kind of common interface could exist so that clang tools could
  work with Fortran codes. It would be great to have things like the
  clang static analyzer and clang-doc for Fortran.

  Some tools will be language-specific of course but it seems like
  Fortran and C-family languages share enough common concepts that some
  tooling could work with both, given a common interface.
  Language-specific tools would work with a more language-specific
  interface.

  My impression from the presentation is that there's a lot more that
  could be shared with clang. The messaging system and command-line
  options infrastructure should be shareable, for example. Maybe
  there's already work being done in these areas to make f18 a
  first-class LLVM project.

Folks raised some concerns/areas to explore:

- Can we represent various Fortran constructs with additions to the
  clang AST? For example:

  * Implied DO loops
  * Array syntax
  * I/O statements (FORMAT, READ, NAMELIST, etc.)
  * Array declarations (DIMENSION, etc.)
  * ...

- Can clang's infrastructure handle various Fortran oddities like
  non-reserved keywords and the ability to redefine constants?

  These may primarily be "dusty deck" issues and perhaps for tooling
  purposes a 98% solution is ok. F18 still needs to fully handle them,
  of course.

- Can we modify clang's AST and surrounding infrastructure to re-use
  bits of clang tooling for f18 or should we create some kind of common
  tooling interface for clang/f18 tooling that can also support
  language-specific bits?

- Can this be forward-looking for tooling for other languages (Rust,
  Chapel, Go, etc.)?

We pretty quickly came to a point where we needed input from clang
folks, so here we are. :)

                                 -David

I know absolutely nothing about the AST, but why not just use LLVM's IR?

F18 currently lowers to LLVM IR, but that is too low-level for the kind of tooling we
want to have. Clang's tooling works on the AST because it is high-level enough to
express source language concepts and reason about them.

For example, one can't easily write a source-level documentation tool that works
with LLVM IR. The same goes for tools like code formatters, syntax highlighters
and anything else that wants to analyze and/or modify source.

                                 -David

Hi David,

There was a different Flang project before Nvidia’s involvement, which successfully used and extended Clang ASTs. The code is still on GitHub, along with an updated fork (which builds with trunk). It implements a fair share of the Fortran features that you have mentioned using (what is essentially) Clang’s infrastructure.

IMO, it is possible to use Clang ASTs, but also there is a lot of other code and patterns in Clang that would be useful for a Fortran frontend - error handling, driver, testing, etc. It would make sense to take those things from Clang rather than reinvent them in F18.

Best,
Petr

  There was a different Flang project before Nvidia's involvement, which
  successfully used and extended Clang ASTs. The code is still on
  GitHub, along with an updated fork (which builds with trunk). It
  implements a fair share of the Fortran features that you have
  mentioned using (what is essentially) Clang's infrastructure.

Thanks for the pointer! Hopefully we can learn some things from that
project. It looks like it is still active (last commit three days ago),
but it lacks Fortran 95 support.

  IMO, it is possible to use Clang ASTs, but also there is a lot of
  other code and patterns in Clang that would be useful for a Fortran
  frontend - error handling, driver, testing, etc. It would make sense
  to take those things from Clang rather than reinvent them in F18.

I completely agree. The goal is to have f18 become an "official" LLVM
project. In that sense it will be the second frontend to LLVM (I know
about llgo but I don't think there was any attempt at reuse). There's a
tension between making rapid progress on f18 and reusing clang
components. Not having worked extensively in either codebase, I don't
have a good sense of how to resolve that tension.

                             -David

    There was a different Flang project before Nvidia's involvement, which
    successfully used and extended Clang ASTs. The code is still on
    GitHub, along with an updated fork (which builds with trunk). It
    implements a fair share of the Fortran features that you have
    mentioned using (what is essentially) Clang's infrastructure.

  Thanks for the pointer! Hopefully we can learn some things from that
  project. It looks like it is still active (last commit three days ago),
  but it lacks Fortran 95 support.

You are welcome! Feel free to ask questions or participate. Yes, it is behind (as was the GSoC project that started it), though it is making progress.

    IMO, it is possible to use Clang ASTs, but also there is a lot of
    other code and patterns in Clang that would be useful for a Fortran
    frontend - error handling, driver, testing, etc. It would make sense
    to take those things from Clang rather than reinvent them in F18.

  I completely agree. The goal is to have f18 become an "official" LLVM
  project. In that sense it will be the second frontend to LLVM (I know
  about llgo but I don't think there was any attempt at reuse). There's a
  tension between making rapid progress on f18 and reusing clang
  components. Not having worked extensively in either codebase, I don't
  have a good sense of how to resolve that tension.

Flang (the new one) also suffers from this, but it is understandable, since it is based on the PGI Fortran compiler, which is much older than Clang. To me, the fact that f18 also took a completely different route is a bit of a puzzle. I am not an LLVM insider, but I still doubt that multiple unrelated implementations of the same tooling would be acceptable.

-Petr