[RFC] Enabling the HLFIR lowering by default

Hello,

I would like to discuss enabling HLFIR lowering by default in Flang soon. High level MLIR operations for Flang were proposed by @jeanPerier in [RFC] Add higher level operations in FIR and ⚙ D134285 [flang][RFC] Adding higher level FIR ops to ease expression lowering. Accompanied by a new FIR value type (hlfir.expr), the new representation (HLFIR) allowed simplifying existing lowering PFT to MLIR, and also helped supporting Fortran features that were hard to support with the direct PFT-to-FIR lowering.

For example, CPU2006/465.tonto and CPU2017/621.wrf_s,628.pop2_s benchmarks are now successfully compiling with the HLFIR lowering, while hitting unimplemeted TODOs with the FIR lowering. Some polymorphic types cases are also supported only with the HLFIR lowering. In our internal testing we see about 600 new tests passing with the HLFIR lowering, while the amount of tests that work with the FIR lowering and do not work with the HLFIR lowering is 21.

On the performance side, the HLFIR lowering also provides benefits such as allocating much smaller temporary arrays (e.g. in Polyhedron/test_fpu2 and CPU2017/527.cam), and applying more complex post-lowering “conflict” analysis for LHS and RHS parts of the array assignments (e.g. see @tblah’s work in ⚙ D157107 [flang][hlfir] add an optimized bufferization pass).

With all the work that has been put into the HLFIR lowering, I think we are almost ready to make the HLFIR lowering the default, but I would like to set several short-term goals that we have to achieve first. Please let me know if you want to add something in the list or something looks off to you.

C1: Keep Flang lowering covered with LIT tests as much as possible without delaying HLFIR enabling too much.

  1. The HLFIR lowering is covered by LIT tests in flang/test/Lower/HLFIR. While there are fewer tests than for the FIR lowering (in flang/test/Lower), I think we may switch to the HLFIR lowering without migrating all the tests. The complete migration from flang/test/Lower to flang/test/Lower/HLFIR may be done after the switch. The quality of LIT testing will degrade in the menatime, but not much, given that the HLFIR lowering has decent amount of LIT tests as well.
  2. I would like to adjust flang/test/Lower/OpenACC tests to work with the HLFIR lowering (e.g. as done in ⚙ D157560 [flang][hlfir] Propagate acc.declare from hlfir.declare to fir.declare. for a single test). I consider this a blocker for switching to the HLFIR lowering, and I will be working on this.
  3. I suppose, teams actively working on OpenMP will want to do the same for flang/test/Lower/OpenMP tests, so that the testing coverage does not degrade too much. I would consider this a blocker as well, but I have no resources to work on that. I would greatly appreciate help with this.

C2: Keep end-to-end testing in “good shape”:

  1. As already mentioned, SPEC CPU 2000, 2006 and 2017 benchmarks compile and run successfully (w/o OpenMP) with the HLFIR lowering. We just need to keep it the same until we switch to the HLFIR lowering.
  2. Some tests from llvm-test-suite only work with the HLFIR lowering (https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/EnabledFilesHLFIR.cmake), but some tests do not work either hitting a TODO or showing incorrect behavior in runtime (https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/DisabledFilesHLFIR.cmake). I would say that the disabled tests might be resolved after the switch to the HLFIR lowering, so we can just keep llvm-test-suite lists untouched. If you think that some of the disabled tests must be fixed before the switch, please let me know and I will work on them.
  3. I do not have information about the SPEC CPU benchmarks or other tests that use OpenMP and won’t work with the HLFIR lowering. If you think we need to add some of them as blockers for the switch, please let me know.
    • Many thanks to @kiranchandramohan who has already started fixing bugs with HLFIR lowering for OpenMP!

P1: Keep performance on par with the FIR lowering (preliminary performance data (multiple tabs): HLFIR-perf.xlsx - Google Sheets).

  1. I would like to keep SPEC CPU2006 and CPU2017 ref data set 1 copy performance on par with the FIR lowering, meaning we either achieve the same performance with the HLFIR lowering right away, or have exact explanation of the performance regressions not exceeding 25% and agree on the plans for fixing each of them. @tblah is pretty close on resolving some part of the huge regression of 548.exchange2 (other benchmarks will be improved as well). After that we should be able to do additional analysis and define our next steps. Please let me know if switching to the HLFIR lowering with pending performance regressions is not acceptable for your team.
  2. I would like to have Polyhedron performance issues addressed the same way as SPEC CPU above (i.e. regressions of more than 25% regressions are not acceptable). Basically, we have to resolve inefficiencies in temporary arrays creation and array assignments. These issues will affect majority of Fortran programs, so switching to the HLFIR lowering with slowing down the rest of the world does not sound right.

These are all the checkboxes that I want to check before we switch. Please share your thoughts. Thank you!

Slava

5 Likes

Thanks @szakharin for sharing this RFC. Switch to HLFIR is exciting! Thanks to you, @tblah and @jeanPerier and others (@cabreraam, Jacob, …) who participated in the HLFIR work.

From our side the blockers are:
1.
We would like Spec2017 rate benchmarks to not degrade overall with the switch. At the moment, we see the following (with -Ofast -flto -mcpu=native) compared to FIR lowering as per results collected by @tblah. I think once the 554.roms benchmark is fixed, we are good to go.

Benchmark Slowdown Comment
503.bwaves 1.00
521.wrf 1.16
527.cam4 0.71
548.exchange2 1.18
549.fotonik3d 1.01
554.roms 1.56 Loop-versioning pass is not working with HLFIR lowering
Geometric Mean 1.07
  1. BLAS/LAPACK should continue to work. We are in the process of checking where we are with this.

For OpenMP support we would like:

  1. all OpenMP 1.1 features to continue to work after the switch. Currently there are issues with reduction, atomic, threadprivate support for common blocks.
  2. No regressions with gfortran-testsuite, NPB, spec-2017 speed, spec-omp-2012, SNAP
  3. Reasonable LIT test coverage.
  4. Elementary GPU offloading to continue to work.

Thanks for sharing these Kiran. My numbers were measured with ⚙ D157107 [flang][hlfir] add an optimized bufferization pass applied.

1 Like

Thank you for sharing your requirements, @kiranchandramohan! If you do not mind, I will add them into the list in the description, so that we have all the pieces together.

Can you please clarify the requirement for “no regressions with gfortran-testsuite”? Is it a hard one or you will be okay with switching to HLFIR lowering having the failures from https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/DisabledFilesHLFIR.cmake?

1 Like

+1

I was talking about the OpenMP tests here. These should be enabled. There are only a few of them now and the second list below is already empty.
OpenMP Appendix : https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/gomp/appendix-a/DisabledFilesHLFIR.cmake
OpenMP : https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/gomp/DisabledFilesHLFIR.cmake

1 Like