RFC for f18+runtimes in LLVM

Hi, everyone,

As you may know, NVIDIA has developed an open-source Fortran frontend for LLVM (http://flang-compiler.org), which consists of the flang frontend itself along with the corresponding Fortran runtime library. The existing frontend’s code is mostly written in C and, while it is a production-quality implementation, it does not follow modern software-engineering practices.

Our long-standing desire is that our work forms the basis of a Fortran frontend that is part of the LLVM project and developed within the LLVM community. In addition, we would like the frontend to be useful in the many ways that Clang is useful: not just as a frontend, but also for static analysis and tooling.

Recognizing that the current flang’s code base will not meet these needs, we started a ground-up rewrite of the frontend in modern C++. This effort is called f18.

At this point, we have documented and implemented a healthy subset of the compiler for symbol tables and scoping, name resolution, USE statements and module files, constant representation, constant folding and much of declaration, label and expression semantics. The parser handles all of Fortran 2018 and OpenMP 4.5 and implements a Fortran-aware preprocessor. The Fortran control flow graph (CFG) is in review now. We continue to update other documentation, such as the style guide and runtime descriptor design.

Before this effort proceeds much further, we’d like to contribute f18, and the existing Fortran runtime libraries, to the LLVM project. The Fortran frontend would become a proper LLVM subproject, design discussions would take place on an associated LLVM mailing list, code reviews would use Phabricator, and so on.

We’re committed to developing LLVM’s Fortran frontend for years to come, and together with other members of the LLVM community (e.g., ARM, US Dept of Energy) would like to do so as part of the LLVM project.

  • The Fortran runtime library will be moved into a subdirectory of this new LLVM subproject, or compiler-rt, or its own subproject (based on feedback here).

  • The current f18 code will be committed to the new LLVM subproject. The f18 code is a set of libraries that implements the Fortran compiler. Today, there’s a rudimentary driver; however, we propose to adapt the clang driver to accept Fortran source and invoke f18 as appropriate. We would also adapt the f18 libraries to fit into the existing llvm build systems and testing infrastructure.

The f18 compiler source code complies with most of LLVM’s coding guidelines; however, the code uses several C++17 features. We’ve documented our use of C++17 here:

https://github.com/flang-compiler/f18/blob/master/documentation/C++17.md

In particular, the parse tree and the lowered forms of expressions and variables are defined in terms of C++17 std::variant. Most of the compiler uses C++17 std::visit to walk these data structures.

It’s possible to reimplement the most important functionality of std::variant as a subset class, say llvm::variant; however, variant gets its power from the C++17 features generic lambdas and parameter pack expansion on “using”. Without these C++17 features, use of variant would be impractical.
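For readers unfamiliar with the idiom, here is a minimal sketch of the kind of std::variant / std::visit usage described above. It is not taken from the f18 sources: the node types (IntLiteral, RealLiteral, Name), the Overloaded helper, and the Describe function are all hypothetical names chosen for illustration. Note that generic lambdas themselves date from C++14; it is the pack expansion in the using-declaration and the deduction guide that need C++17.

```cpp
#include <string>
#include <variant>

// Hypothetical parse-tree node types (assumptions, not f18's real classes).
struct IntLiteral { int value; };
struct RealLiteral { double value; };
struct Name { std::string text; };

using Expr = std::variant<IntLiteral, RealLiteral, Name>;

// Inherits operator() from every lambda passed in. The pack expansion in the
// using-declaration is the C++17 "parameter pack expansion on using".
template <typename... Ts> struct Overloaded : Ts... {
  using Ts::operator()...;
};
// C++17 deduction guide, so Overloaded{...} needs no explicit template args.
template <typename... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Walks one node with std::visit; every lambda returns std::string.
std::string Describe(const Expr &expr) {
  return std::visit(
      Overloaded{
          [](const IntLiteral &x) { return "int " + std::to_string(x.value); },
          [](const Name &x) { return "name " + x.text; },
          // Generic lambda: handles any alternative not matched above.
          [](const auto &) { return std::string{"other"}; },
      },
      expr);
}
```

Without the pack expansion, Overloaded would have to be written as a recursive template, which is roughly the extra machinery a C++11 or C++14 substitute would need.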

Our thinking when we started was that llvm would adopt C++17 before mid-2020, which lines up with our projected completion date. If we were to adopt C++11 or C++14, we would likely create substitutes for these classes, certainly at a cost of calendar time and perhaps type safety and notational convenience. One of our principles is to take advantage of the standard library as much as possible, so casual readers will better understand our code and so we avoid the time and bugs associated with writing class libraries.

Our request would be to get a waiver for the C++11 requirement based on the fact that we’re skating to where the puck will be. In the meantime, because F18 only exists as a stand-alone program, early adopters would still have a useful parser and analyzer for Fortran.

Also, as part of the ongoing flang effort, NVIDIA has developed a library of scalar, vector, and masked math functions. This library should be useful for autovectorization, and OpenMP SIMD, in general. We’ll create another RFC to discuss this contribution.

Please let us know if you have any feedback or concerns.

  • Steve

+1. The reasons why we haven’t yet switched LLVM to c++17 do not apply to a brand new frontend, since it’s effectively all about not disrupting current users, and a brand new frontend does not yet have those.

So, IMO, there’s no reason f18 cannot have more stringent requirements at the time of its introduction, and then going forward it should follow along with the rest of the project’s minimum toolchain requirements, after those eventually meet and surpass f18’s initial requirements.

James Y Knight via llvm-dev <llvm-dev@lists.llvm.org> writes:

  • The current f18 code will be committed to the new LLVM subproject. The f18 code is a set of libraries that implements the Fortran compiler.

Awesome. This is an important aspect of the design of LLVM projects IMO: they build their functionality primarily as re-usable libraries, and then expose that in useful command line utilities.

The f18 compiler source code complies with most of LLVM’s coding guidelines; however, the code uses several C++17 features. We’ve documented our use of C++17 here:

https://github.com/flang-compiler/f18/blob/master/documentation/C++17.md

In particular, the parse tree and the lowered forms of expressions and variables are defined in terms of C++17 std::variant. Most of the compiler uses C++17 std::visit to walk these data structures.

It’s possible to reimplement the most important functionality of std::variant as a subset class, say llvm::variant; however, variant gets its power from the C++17 features generic lambdas and parameter pack expansion on “using”. Without these C++17 features, use of variant would be impractical.

Our thinking when we started was that llvm would adopt C++17 before mid-2020, which lines up with our projected completion date. If we were to adopt C++11 or C++14, we would likely create substitutes for these classes, certainly at a cost of calendar time and perhaps type safety and notational convenience. One of our principles is to take advantage of the standard library as much as possible, so casual readers will better understand our code and so we avoid the time and bugs associated with writing class libraries.

Our request would be to get a waiver for the C++11 requirement based on the fact that we’re skating to where the puck will be. In the meantime, because F18 only exists as a stand-alone program, early adopters would still have a useful parser and analyzer for Fortran.

Hold on, either it is a collection of libraries or it is a stand-alone program. It can’t really be both?

Generally, I think the idea that diverging from the rest of the project here is low-cost for a subproject isn’t supported by experience with other projects.

Notably, it has a strong tendency to create tension. You want some ADT or support library in LLVM to work well with your C++17 code. But it is C++11. Every time this has been done in the past, the result has been that generically useful tools and libraries get added to the subproject rather than to LLVM as a whole.

So FWIW, I’d be really opposed to this. Instead, I think that F18 should have rich libraries, and develop them exactly the same way as the rest of LLVM.

We’re getting close to switching to C++14, so maybe due to timing, you could merge F18 when that happens?

Ultimately, I think you either need to raise the LLVM base language version or lower the F18 one so that they match when merged IMO. Anything else I think will hamper integration with the larger project.

FWIW, I think if the LLVM project will not accept C++17 code soon, then f18 should detect C++17 support and if it’s not there, use local implementations. Cray once released a library back when some compilers did not yet support C++11, so I detected if C++11 support existed and used my own implementations when it wasn’t present. For some things, it was trivial, for others, really annoying…but it worked. It would be lovely to see f18 in the new LLVM repo sooner than later, but obviously if f18 merges in after C++14 is allowed, then less work is required to duplicate std:: functionality. It’s a tradeoff.
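A minimal sketch of that kind of detect-and-fall-back approach, assuming the feature in question is std::optional. The compat namespace, the Optional name, and ParseDigit are hypothetical, and the fallback is deliberately tiny (it only supports default-constructible types); a real substitute would need far more care.

```cpp
#if __cplusplus >= 201703L
// C++17 or later: use the standard library directly.
// (MSVC only reports this value when built with /Zc:__cplusplus.)
#include <optional>
namespace compat {
template <typename T> using Optional = std::optional<T>;
} // namespace compat
#else
// Older standard: a small local stand-in exposing the same two calls.
namespace compat {
template <typename T> class Optional {
public:
  Optional() = default;
  Optional(const T &value) : value_(value), hasValue_(true) {}
  bool has_value() const { return hasValue_; }
  const T &value() const { return value_; }

private:
  T value_{};
  bool hasValue_{false};
};
} // namespace compat
#endif

// Client code is written once against compat::Optional.
compat::Optional<int> ParseDigit(char c) {
  if (c >= '0' && c <= '9') {
    return c - '0';
  }
  return {};
}
```

The cost mentioned above shows up in the #else branch: each standard class used this way needs its own stand-in that has to be written, tested, and eventually deleted.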

-Troy

Notably, it has a strong tendency to create tension. You want some ADT or support library in LLVM to work well with your C++17 code. But it is C++11. Every time this has been done in the past, the result has been that generically useful tools and libraries get added to the subproject rather than to LLVM as a whole.

If there are such features that ought to be added to the support libraries in LLVM for better C++17 support, then they can indeed be added, with an appropriate #ifdef on language version, no?

The primary reason that I think it makes sense to allow f18 to require C++17 is that this will be a temporary divergence – the rest of the LLVM project will certainly move to require C++17 as well at some point relatively soon. Any C++17-specific improvements made to llvm common libraries will be useful for the rest of LLVM too.

I don’t know what the expected timeline of f18 completion is, nor would I like to predict exactly how long it’ll be before LLVM might start requiring C++17, but it certainly seems possible that LLVM might be ready to require C++17 before f18 is even finished.

So FWIW, I’d be really opposed to this. Instead, I think that F18 should have rich libraries, and develop them exactly the same way as the rest of LLVM.

We’re getting close to switching to C++14, so maybe due to timing, you could merge F18 when that happens?

Ultimately, I think you either need to raise the LLVM base language version or lower the F18 one so that they match when merged IMO. Anything else I think will hamper integration with the larger project.

Even if a decision is made to rewrite parts of the code in order to not rely on C++17 features, I don’t think that the initial import of the project ought to be tied to that task being completed. More generally, I think the prerequisite to merging it should be having an agreed-upon target state and an understood path on how to reach that state, rather than the code actually being in that state already. Merging sooner is generally better than waiting and merging later.

I went and looked at its repo out of curiosity, and I could not tell which way it is meant to be. Unless I am missing something, the f18 sources have no CMake install directives, and the docs do not describe whether or how it would be used as a set of libraries. Just wondering out loud, are there any plans for documenting that?

Cheers,
Petr

Chandler Carruth via llvm-dev <llvm-dev@lists.llvm.org> writes:

Notably, it has a strong tendency to create tension. You want some ADT
or support library in LLVM to work well with your C++17 code. But it
is C++11. Every time this has been done in the past, the result has
been that generically useful tools and libraries get added to the
subproject rather than to LLVM as a whole.

Building on this, there was some discussion months ago on flang-dev
about f18 tooling and whether it could integrate with existing clang
libraries. Whatever became of that? I recall some references to
earlier iterations of flang that successfully extended clang's AST to
handle Fortran and thereby leveraged much of the existing great clang
tooling infrastructure.

My sense of f18 is that it is very much a "ground up" implementation and
doesn't make use of any of the existing clang infrastructure. It would
be a real shame to have two completely separate projects that provide
many of the same services. It seems like a good idea to have a
conversation about f18 tooling before we add f18 as a subproject and
have to support two independent tooling efforts.

                               -David

lld used C++11 before the rest of LLVM switched over without issue.

  • Michael Spencer

I don’t 100% agree – we did end up with a bunch of support library components in LLD that had to be migrated back to LLVM. =/ The story with LLDB had more issues.

It may have been small enough and limited in time enough to not become a large problem for LLD, but it still isn’t something I’d like to repeat.

If we had some super concrete timeframe for when the rest of LLVM would switch to C++17 (again, we’ve only even discussed C++14 so far!), that might help. But currently, I think this is going to cause divergence without benefit.

Notably, it has a strong tendency to create tension. You want some ADT or support library in LLVM to work well with your C++17 code. But it is C++11. Every time this has been done in the past, the result has been that generically useful tools and libraries get added to the subproject rather than to LLVM as a whole.

If there are such features that ought to be added to the support libraries in LLVM for better C++17 support, then they can indeed be added, with an appropriate #ifdef on language version, no?

The primary reason that I think it makes sense to allow f18 to require C++17 is that this will be a temporary divergence – the rest of the LLVM project will certainly move to require C++17 as well at some point relatively soon. Any C++17-specific improvements made to llvm common libraries will be useful for the rest of LLVM too.

How would we even test these changes on different platforms? Our current bots won’t test it, I don’t want two sets of bots, and upgrading all of the bots seems like most of the work to get to C++17.

I don’t know what the expected timeline of f18 completion is, nor would I like to predict exactly how long it’ll be before LLVM might start requiring C++17, but it certainly seems possible that LLVM might be ready to require C++17 before f18 is even finished.

If we have a good reason to believe that’s the case, fine. But currently, my suspicion is that F18’s timeframe is much more definite.

So FWIW, I’d be really opposed to this. Instead, I think that F18 should have rich libraries, and develop them exactly the same way as the rest of LLVM.

We’re getting close to switching to C++14, so maybe due to timing, you could merge F18 when that happens?

Ultimately, I think you either need to raise the LLVM base language version or lower the F18 one so that they match when merged IMO. Anything else I think will hamper integration with the larger project.

Even if a decision is made to rewrite parts of the code in order to not rely on C++17 features, I don’t think that the initial import of the project ought to be tied to that task being completed. More generally, I think the prerequisite to merging it should be having an agreed-upon target state and an understood path on how to reach that state, rather than the code actually being in that state already. Merging sooner is generally better than waiting and merging later.

Sure, I’m happy for the work to match the rest of LLVM to happen “in tree”, but until then I think it should be disabled-by-default in the CMake build and not expected that folks update it when changing APIs etc. Similar to what we do w/ experimental targets.

lld used C++11 before the rest of LLVM switched over without issue.

I don’t 100% agree – we did end up with a bunch of support library components in LLD that had to be migrated back to LLVM. =/ The story with LLDB had more issues.

The support code that ended up in lld instead of libSupport ended up there mostly because it wasn’t viewed as useful to the rest of LLVM, not because of language version. I wasn’t aware of any LLDB issues as I don’t follow it that closely.

  • Michael Spencer

Developing F18 in-tree but off by default, like with experimental targets, seems fine to me. It’s experimental, and I don’t expect it to get tested by bots (unless those bots do extra work). Then they can use C++17 all they want, and when they’re close to done we’ll have clarity on whether LLVM would actually move to C++17. If not, they’d have to downgrade to C++14; if LLVM is ready (or almost ready), then F18 is fine (they can even pitch in and help move to C++17 by then).

This is especially true if, for now, they don’t use any LLVM infrastructure (including ADT). As the project matures in-tree I’d expect them to try to use ADT more. That might break them, and then we can have a discussion: make ADT work well with C++14 and C++17, or force F18 to do extra work (which they already do!). If anything, having the F18 folks use ADT might ease the migration to C++17.

We're committed to developing LLVM's Fortran frontend for years to come, and together with other members of the LLVM community (e.g., ARM, US Dept of Energy) would like to do so as part of the LLVM project.

This is super exciting Stephen, congratulations to you and everyone working on f18. I’m very excited to see this happening and am thrilled about the approach you are taking.

The f18 compiler source code complies with most of LLVM's coding guidelines; however, the code uses several C++17 features. We've documented our use of C++17 here:

  https://github.com/flang-compiler/f18/blob/master/documentation/C++17.md

Our request would be to get a waiver for the C++11 requirement based on the fact that we're skating to where the puck will be. In the meantime, because F18 only exists as a stand-alone program, early adopters would still have a useful parser and analyzer for Fortran.

I personally see no problem or concerns with this at all. This is a new project and the worst case is that f18 comes up with some really cool stuff that the rest of the LLVM project would love to share, but that can’t be done until it is refactored to not use C++17. If/when that comes up, we can deal with it on demand. I don’t see any particular reason to block f18 from joining the project because of that speculative concern.

-Chris

Following up on my earlier email. If there is a commitment to checking in f18 already, feel free to disregard it. I took a closer look at the sources and want to share some of the findings in case anyone is interested. Disclosure: I contribute to Fort (fort-compiler.org), which is the fork of the front-end David Greene mentioned.

From Stephen’s announcement:

At this point, we have documented and implemented a healthy subset of the compiler for symbol tables and scoping, name resolution, USE statements and module files, constant representation, constant folding and much of declaration, label and expression semantics. The parser handles all of Fortran 2018 and OpenMP 4.5 and implements a Fortran-aware preprocessor. The Fortran control flow graph (CFG) is in review now. We continue to update other documentation, such as the style guide and runtime descriptor design.

Currently it looks like only the parser is partially implemented in f18; there is no code generator (via LLVM or otherwise) and, obviously, no object output. For that reason, and due to the condition of its test suite, it is impossible to reliably assess the state of Fortran 2018 support (though it does look like a fair amount of effort went into it). The state of OpenMP support actually got me a bit puzzled, more on that below.

As I understand the announcement, f18 is intended to be used or merged with Flang sources at some point, but that still does not explain how it would integrate with LLVM, since Flang does not seem to invoke LLVM directly either (it used to produce LLVM IR as text files). Because of this, it is likely that its code generator component would have to be written from scratch. It is also unclear if and how it would provide the library API which has been announced.

A bit about the test suite – I looked at the Fortran (regression) part of it (as opposed to unit tests, which hopefully are a relatively simple affair). Maybe nitpicking, but despite the “handles all of OpenMP 4.5” statement in the announcement there seem to be only two references to OpenMP in tests. Most of the regression tests are challenging to understand – some list all expected output upfront, and some of the expected output is not particularly human-friendly. Maybe I am used to Clang’s test suite, but it is unclear to me what each file is testing. Also, regression testing relies on a set of shell scripts to do some of the output checking.

My worry here is that it would actually take years to develop f18 into a working compiler, in which case there might be other options worth considering for a Fortran front-end. In my opinion (and this may be a matter of personal preference) a healthier subset of the compiler would be more of an end-to-end subset of it – something that can be tested as a full product while it is being developed. And then there is also the argument for reusing Clang tooling, which David Greene keeps making, though that idea does not seem to get a lot of interest.

Best,

Petr

Petr Penzin via llvm-dev <llvm-dev@lists.llvm.org> writes:

My worry here is that it would actually take years to develop f18 into
a working compiler, in which case there might be other options worth
considering for a Fortran front-end. In my opinion (and this may be a
matter of personal preference) a healthier subset of the compiler
would be more of an end-to-end subset of it -- something that can be
tested as a full product while it is being developed. And then there
is also the argument for reusing Clang tooling, which David Greene
keeps making, though that idea does not seem to get a lot of interest.

Petr, can you describe a bit about how tooling works in Fort, so we all
have a better idea of the challenges involved? I see the sources have a
layout similar to clang. Is it reusing any clang code directly?

What is the status of Fort? What standard(s) does it support and what
is planned for future standard support? Does it support a working
subset of Fortran? Does it provide a Fortran runtime library or does it
use existing runtimes? Any OpenMP support?

Anything else we should know about Fort?

                         -David

This RFC started a good discussion and I’d like to hear responses from its author to all of the points that have been made so far.

FWIW, I’m also in favor of reusing as much from Clang as practical. In fact, with the combined repo now, it might make sense to factor out some common front end code that both a Clang and any Flang (f18 or Fort) would use, for maintainability as well as to avoid the perceived strangeness of a Fortran front end relying on a C front end.

-Troy

What parts of the frontend do you think could be shared? At least the following seem
to be reusable:

- The diagnostics infrastructure
- IdentifierTable
- The file manager and source location infrastructure

Bruno

Bruno Ricci via llvm-dev <llvm-dev@lists.llvm.org> writes:

What parts of the frontend do you think could be shared? At least the following seem
to be reusable:

- The diagnostics infrastructure
- IdentifierTable
- The file manager and source location infrastructure

I would like to see the possibility of clang's AST being enhanced to
handle Fortran. Then I believe a lot of the clang tooling (clang-query,
etc.) could be more easily made to understand Fortran. I have no doubt
there is a significant amount of work to do this. That's why I asked
Petr about the experience of the Fort team. Is it more work to add the
necessary bits to clang's AST and its tooling, or is it more work to
construct a new AST ground-up and then construct all the tooling around
that ground-up? I suspect the latter is more work to achieve the same
quality as in clang, though depending on the ultimate goal for a Fortran
frontend, it may be less (say we don't want all of the clang-style
tooling for Fortran, for example).

Maybe even a lot of codegen could be reused, I don't know. I am hardly
a clang expert. :-) OpenMP lowering could maybe also be reused, though
moving that to LLVM (as I've seen discussed) would then naturally reuse
it.

                                -David

Following up on my earlier email. If there is a commitment to checking in f18 already, feel free to disregard it. I took a closer look at the sources and want to share some of the findings in case anyone is interested. Disclosure: I contribute to Fort (http://fort-compiler.org/), which is the fork of the front-end David Greene mentioned.

From Stephen's announcement:
At this point, we have documented and implemented a healthy subset of the compiler for symbol tables and scoping, name resolution, USE statements and module files, constant representation, constant folding and much of declaration, label and expression semantics. The parser handles all of Fortran 2018 and OpenMP 4.5 and implements a Fortran-aware preprocessor. The Fortran control flow graph (CFG) is in review now. We continue to update other documentation, such as the style guide and runtime descriptor design.
Currently it looks like only the parser is partially implemented in f18; there is no code generator (via LLVM or otherwise) and, obviously, no object output. For that reason, and due to the condition of its test suite, it is impossible to reliably assess the state of Fortran 2018 support (though it does look like a fair amount of effort went into it). The state of OpenMP support actually got me a bit puzzled, more on that below.

As I understand the announcement, f18 is intended to be used or merged with Flang sources at some point, but that still does not explain how it would integrate with LLVM, since Flang does not seem to invoke LLVM directly either (it used to produce LLVM IR as text files). Because of this, it is likely that its code generator component would have to be written from scratch. It is also unclear if and how it would provide the library API which has been announced.

A bit about the test suite -- I looked at the Fortran (regression) part of it (as opposed to unit tests, which hopefully are a relatively simple affair). Maybe nitpicking, but despite the "handles all of OpenMP 4.5" statement in the announcement there seem to be only two references to OpenMP in the tests. Most of the regression tests are challenging to understand -- some list all expected output upfront, and some of the expected output is not particularly human-friendly. Maybe I am used to Clang's test suite, but it is unclear to me what each file is testing. Also, regression testing relies on a set of shell scripts to do some of the output checking.

My worry here is that it would actually take years to develop f18 into a working compiler,

I believe that your assessment below says more about the state of the regression tests in f18 than the capabilities of the code. f18 can currently parse, and unparse, a lot of Fortran code (+OpenMP). From talking to Steve and company, this part of the current codebase is well tested. Based on the current rate of development and the experience of the team, I don't expect it to take many years to develop f18 into a working compiler.

in which case there might be other options worth considering for a Fortran front-end.

As a community, we should always consider all reasonable options. You should certainly feel free to make specific suggestions.

In my opinion (and this may be a matter of personal preference) a healthier subset of the compiler would be more of an end-to-end subset of it -- something that can be tested as a full product while it is being developed.

We always aim to commit functionality in independently-testable units. If the parser is implemented first, and that is accompanied by parser tests, that's a perfectly-reasonable approach. One reason why I've encouraged NVIDIA to start this RFC process as soon as possible is because we want to have community feedback on the design and the implementation (including the tests), and moreover, broaden the set of contributors. Maximizing that feedback and the pool of contributors essentially requires that the development happen within the community - it needs to be a community project.

That community, of course, includes all of us. If you would like to contribute to the design and/or implementation of a Fortran frontend within the LLVM project, you certainly should. As you've been working on a Fortran frontend project, I definitely encourage you to do so. All code contributed to LLVM is subject to code review by the community, both before and after it is committed.

And then there is also the argument for reusing Clang tooling, which David Greene keeps making, though that idea does not seem to get a lot of interest.

I disagree. There's been a lot of interest in modeling Flang's tooling after Clang's infrastructure, and refactoring for direct reuse where possible. In general, refactoring for code reuse is our default - developing similar functionality without reusing existing, related code is what, in general, requires specific justification.

-Hal

Best,

Petr