RFC:

pat · December 10, 2020, 11:38pm

This RFC aims to address challenges and concerns (e.g., disjoint development activities) in providing a functional code base for Flang via the main branch of the llvm-project.

Background

When Flang was upstreamed to llvm-project, only the front end (preprocessor, parser, and semantic analysis) went into the main repo. This retained the commit history for the project, and post hoc reviews of the upstreamed code were expected. Work on the front end has continued in llvm-project using the smaller, incremental commits that are normal LLVM practice. The middle end, which lowers parse trees to the MLIR dialect for the Fortran intermediate representation (FIR), was not initially upstreamed and has been actively developed in a “fir-dev” fork:

https://github.com/flang-compiler/f18-llvm-project/tree/fir-dev

This codebase depends heavily on MLIR, both for Fortran IR (FIR) and for a planned OpenMP dialect, and on LLVM for the final lowering. Active work across all these components, together with efforts to implement a working Fortran 77 front end on a timely schedule, has expanded the “fir-dev” capabilities well beyond the codebase as it currently exists in the main branch.

To date, the community has struggled to reach an agreement on how best to merge the critical functionality in “fir-dev” into the main branch. Unfortunately, the goals of smaller commits and timely delivery of a working Fortran front end in the main llvm-project branch have been at odds. Furthermore, as time has passed, this bifurcation has become detrimental to both Flang and established community practices. Therefore, we would like to propose the following approach to reduce the risks associated with the growing divergence across the codebase.

Proposed Merge Strategy

The proposed strategy aims to provide a timely push of “fir-dev” capabilities into Flang’s main branch. Although not ideal, we feel it addresses the schedule risks and serves the overarching goal of bringing the community back together to work on a unified codebase. The proposal is to upstream groups of functionality with a revised history. One can imagine staging a short chain of commits in git (a handful at most). The entire chain can be pushed in one go, creating a “history”, or the group can be split into a few individual commits.

One potential grouping of functionality could be:

FIR dialect and code generation,
Optimization passes, and
Lowering to FIR.

We welcome other suggestions for these groupings.

Unlike previous proposals, we would like to encourage timeliness by avoiding any initial source code modifications beyond organizing the merging of code into these functional groups. We think there are two potential approaches to upstreaming this code into the main branch. The first would handle each group independently, as a single commit or a few smaller commits. It may be difficult to completely test each of these individual commits until the overall upstream process is completed; however, we will, at the very least, make sure the code still builds appropriately.

Alternatively, we could eliminate the three functional groups and roll up all the “fir-dev” changes into a single commit. Within the broader LLVM effort, there have been cases where very large, cross-cutting changes have been committed in one shot to avoid breaking functionality. Landing the middle stages of Flang in this way would serve a similar goal.

The primary advantage of the second (“everything at once”) approach is that it would deliver a working (F77) implementation of Flang sooner and avoid the risk of a much longer process with continued bifurcation of efforts across repositories.

We encourage the community’s thoughts on these two paths, any alternative approaches, etc.

A series of important steps needs to happen as part of incorporating fir-dev into the main repo:

We need to make certain that all build-bot functionality is sound and doesn’t negatively impact community-wide LLVM regression checks. Our proposed approach would be to complete all this testing (by hand-running buildbots) before finally landing the fir-dev merged code upstream.
We need to make certain that those who have contributed to fir-dev have their contributions maintained in the git history. This may require that we temporarily disable LLVM’s commit mailer to avoid an onslaught of email traffic across the community. We need to understand how best to achieve this. One alternative, if there is no clean way to retain history, would be to start a contributors list as part of the project.
It would be valuable to identify a small number of people from across the community (2-4 seems reasonable) to help coordinate and oversee this process – it is likely too much for a single person.
As a final step of the proposed process, the “fir-dev” repository will be archived and all future development activities will use the llvm-project main repository.
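For the contributors-list alternative mentioned in the steps above, a minimal sketch of how such a list might be derived from commit authorship is shown here. It is an illustration only, not a tool the project has adopted: the function names are hypothetical, and the revision range in the usage note is an assumption about how the fork relates to the main branch.

```python
import subprocess
from collections import Counter


def tally_authors(log_lines):
    """Count commits per author from lines shaped like 'Name <email>'."""
    counts = Counter()
    for line in log_lines:
        author = line.strip()
        if author:  # skip blank lines in the log output
            counts[author] += 1
    return counts


def format_contributors(counts):
    """Return 'author: count' lines, most commits first, ties broken by name."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{author}: {n}" for author, n in ordered]


def read_authors(repo_path, rev_range):
    """Ask git for one 'Name <email>' line per commit in rev_range.

    %aN and %aE are git's author name and email placeholders; both
    respect a .mailmap file if the repository has one.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%aN <%aE>", rev_range],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

Assuming a local checkout in which both branches are available, something like `format_contributors(tally_authors(read_authors(".", "main..fir-dev")))` would list the authors of commits that exist only on the fork. The range `main..fir-dev` is a guess at the branch layout and would need adjusting to the actual repositories.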

We believe that with this process, and with the fir-dev code successfully merged, the first functional Fortran 77 front end and middle stage(s) can be completed. This would give the community a starting point for refactoring, adding new capabilities, and exploring additional opportunities for contributing to Flang and leveraging the broader LLVM codebase.

We look forward to your feedback, suggestions, additions, etc.

Thanks,

—Pat McCormick, Los Alamos National Laboratory
—Steve Scalpone, NVIDIA

mehdi_amini · December 10, 2020, 11:54pm

Hi,

I have strong concerns about having a “code drop” for the FIR components and anything that has a dependency on MLIR.
I’d like any such component to be upstreamed individually (each pass in a single review, for example) and reviewed on Phabricator appropriately.

Thanks,

pat · December 14, 2020, 5:25pm

Thanks Mehdi. Efforts are underway for a plan/response to your feedback.

Please note that in replying to Mehdi, I’m also addressing my mistake on the original RFC’s subject line. My apologies for the oversight — hopefully this makes the RFC details a bit more clear.

—Pat

rovka · December 15, 2020, 10:33am

Hi Pat,

It’s great to finally see this happening! Having development split between two repos makes for a great barrier to entry for newcomers to the project, so the sooner it is addressed the better.

On the other hand, I second Mehdi’s concerns with regard to one huge code drop. Splitting into at least the 3 categories that you mention in your email is probably a must. From there, you can probably find some kind of middle ground. For instance, in areas where you might expect feedback from the larger community (e.g. MLIR-related stuff), it makes sense to split into smaller patches. For other areas, if you feel that the content is only understandable to the flang community and has already gone through a good round of review when merged into fir-dev, then you can probably send larger patches and people can discuss in Phabricator if they think anything needs further splitting.

I’m currently maintaining 4 buildbots that build flang on aarch64, so please keep me in the loop for pre-commit testing of the patches (however many they may be).

As a slightly orthogonal topic, it would also help to migrate any known issues/bug reports from the fir-dev fork to their proper upstream place. I’m not sure if there’s a way to automate this (or if it’s even necessary for the number of issues open there), and it can surely be handled after the code migration is complete. But it’s good to have it on the radar.

Cheers,
Diana
