Status of Flang's Optimization

Hello.

I’m interested in Flang’s optimization and would like to join its development.
I’m trying to catch up on the status of the development, but I can’t find some of the information I’m looking for.
So let me ask some questions here.

  • About Plan
    • Do you have roadmaps/milestones of the development of Flang’s optimization?
      (e.g. optimizations for Fortran-specific features, array/loop optimizations)
      • Are there any features you’re working on?
      • Are there any target applications for Flang?
  • About Design
    • Where do you think optimizations such as loop vectorization should mainly be performed: in LLVM or in MLIR (FIR/HLFIR)?

Hello @yus3710-fj,

Nice to see your interest in Flang’s optimization.

There has been some work on, and interest in, optimizations for Flang. This was primarily driven by a few benchmark suites (such as SPEC, PolyBench, and SNAP). As far as SPEC CPU 2017 is concerned, the biggest outliers are 527.cam4, 549.fotonik3d, and 548.exchange2. The cam4 issue is likely to be fixed by HLFIR (through better insertion and removal of temporary arrays for array expressions), the fotonik3d issue by passing more alias information to LLVM, and the exchange2 issue by function specialization in LLVM (D145819 [FuncSpec] Increase the maximum number of times the specializer can run.).
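To illustrate what "temp arrays for array expressions" refers to, here is a minimal C sketch, not Flang code. It emulates the Fortran assignment a(2:n) = a(1:n-1) + 1.0, where the right-hand side must be fully evaluated before the left-hand side is modified. The function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Conservative version: evaluate the whole right-hand side into a
   temporary, then copy it into place. Returns 0 on success, -1 if the
   temporary could not be allocated. */
int shift_add_with_temp(double *a, size_t n) {
    assert(a != NULL);
    if (n < 2)
        return 0;
    double *tmp = malloc((n - 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (size_t i = 0; i + 1 < n; ++i)
        tmp[i] = a[i] + 1.0;
    memcpy(a + 1, tmp, (n - 1) * sizeof *tmp);
    free(tmp);
    return 0;
}

/* Same result without a temporary: iterating backwards reads each
   a[i - 1] before it is overwritten. */
void shift_add_in_place(double *a, size_t n) {
    assert(a != NULL);
    for (size_t i = n; i-- > 1; )
        a[i] = a[i - 1] + 1.0;
}
```

A naive forward loop over the same array would read values it has already overwritten, which is why a compiler either needs a temporary or has to prove that another loop order is safe.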

At the moment, vectorization transformations are performed by LLVM. If there are cases where LLVM does not have enough information to vectorize, and that information is available in the FIR/HLFIR layers, then we can think about performing vectorization at the MLIR level.
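As a rough illustration of why alias information matters for vectorization, here is a small C sketch (again, not Flang or LLVM code; the function names are hypothetical). Fortran dummy arguments are, broadly speaking, not allowed to alias when one of them is modified, which is similar in spirit to C's restrict qualifier.

```c
#include <assert.h>
#include <stddef.h>

/* No aliasing promise: dst and src may overlap, so the compiler has to
   preserve the sequential order of loads and stores (or guard a
   vectorized loop with a runtime overlap check). */
void add_one_may_alias(double *dst, const double *src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] + 1.0;
}

/* restrict promises that dst and src do not overlap, so the iterations
   are independent and the loop can be vectorized without extra checks.
   Calling this with overlapping arrays is undefined behavior. */
void add_one_no_alias(double *restrict dst, const double *restrict src,
                      size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] + 1.0;
}
```

Calling add_one_may_alias(a + 1, a, 4) on a zero-filled array of five elements creates a loop-carried dependence: each iteration reads the value the previous one wrote. That is the case a compiler must assume when it has no alias information.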

In general, if you believe there is a transformation that will help Flang’s performance, you can write up a post on Discourse and then proceed to implement it if there are no serious reservations.

Reference:
Discussions

Performance Analysis


Thank you for your reply.

I checked the discussions and related issues.
Those are very helpful for me.

I’ll consider how I can contribute to Flang’s optimization.
(I’m thinking of looking into performance issues that have already been found.) Sorry, I misunderstood what the performance issues say.


The Rust compiler team had a meeting about which optimizations should happen in the Rust compiler and which should be delegated to LLVM. The gist was roughly: if an optimization can benefit from knowledge of Rust semantics, do it in the Rust compiler.

As Fortran is about loops and arrays, there might be opportunities for optimizations in that area that LLVM cannot do.


Hello.
Thank you for your comments.

At this moment, I’d like to join the implementation of alias analysis in FIR because it would be important for transformations in FIR.
I see that we’re waiting for “full restrict” in LLVM, but I don’t know the current state of alias analysis in FIR.
Please share more information about it if you have any.

Furthermore, we at Fujitsu would like to work actively with you on Flang’s optimization, but it seems that Flang’s optimization is not a high priority now.
I’ll see you in the next Technical Call, where I’d like to discuss how we can work on it.
I’m afraid that I’m not an expert in Fortran or Flang at this point, but I’m happy to collaborate with you.


@szakharin or @Renaud-K could help with the status of alias analysis in FIR and other Flang optimization work.

You could also try if you are interested:


Hello @yus3710-fj,

Roughly, the “alias analysis” in Flang consists of two parts:

@jeanPerier, @tblah and I have been working on enabling HLFIR, and we are now at the point where the Polyhedron, CPU2000, CPU2006 and CPU2017 benchmarks compile and pass with the test data sets (except for 628.pop2). Besides functional support for some Fortran features that are not (and will not be) available with FIR lowering, HLFIR is our path toward generating efficient MLIR. So the next step here would be to analyse HLFIR vs FIR performance and make sure we produce the same or faster code with HLFIR. My initial measurements show that we are far from that, so there is quite a bit of work in investigating the benchmarks and classifying the issues. It could be that a big portion of these issues will be resolved by improving the MLIR-level alias analysis. So you may consider investing your time in performance analysis of the benchmarks as well.

Thank you for your interest in improving Flang!


Thank you for your comments, and I apologize for the late response.
I’m thinking of working on MLIR-level alias analysis for FIR/HLFIR and performance analysis.

I’m afraid I’m not familiar with FIR/HLFIR, so I think the first thing I must do is study the concepts and implementation of FIR/HLFIR until I understand them.
After that, I’d like to join the work on improving Flang in earnest.


If you are looking for something to start with in HLFIR, you can try adding an intrinsic as an operation in HLFIR, maybe something like MAXVAL (see MAXVAL in The GNU Fortran Compiler documentation). You may use one of the existing intrinsic operations for reference, for example D152521 [flang][hlfir] Add hlfir.count intrinsic.
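For anyone unfamiliar with the intrinsic, the following C sketch shows the reference semantics of a rank-1 MAXVAL(array, mask) on double-precision values. It is only an illustration of what the operation computes, not how Flang or HLFIR implements it; the function name maxval_1d is hypothetical, and NaN handling is left out.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest element of array[0..n-1] for which mask[i] is true.
   A NULL mask means every element participates. If no element is
   selected (including n == 0), the result is the most negative finite
   value of the type, which is what Fortran specifies for MAXVAL. */
double maxval_1d(const double *array, const bool *mask, size_t n) {
    assert(array != NULL || n == 0);
    double result = -DBL_MAX;
    for (size_t i = 0; i < n; ++i) {
        if (mask != NULL && !mask[i])
            continue; /* element excluded by the mask */
        if (array[i] > result)
            result = array[i];
    }
    return result;
}
```

For example, for the values 3.0, -1.0, 7.5, 2.0 the unmasked result is 7.5, and masking out the third element gives 3.0.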


Thank you for the information. I’ll try it.

Hello.

I have started investigating HLFIR performance, and as a first step I measured the performance of TSVC.
There was no problem in terms of alias analysis, but there was a weird performance issue. The assembly code of the innermost loop is the same as with FIR lowering, but performance drops by 10% when HLFIR lowering is enabled.

I’ll share the information after organizing it.


I’ve created a new topic for TSVC.