Hi all,
A few months back we gave a presentation [1] on the OpenMP design for F18 during the Flang/F18 technical call, and also sent a summary mail [2] and a set of walkthroughs [3] to the mailing list. We received some feedback and have incorporated it into the design. This mail summarises the design and the status of the OpenMP implementation in F18, from lowering of the parse tree to LLVM IR generation. (Note that semantic and structural checks are not covered; refer to Gary’s biweekly mail for their status.)
The proposed design can be seen in slide 10 of the presentation [1]. The design summary is as follows.
i) The design uses the following two components.
a) MLIR [4]: Necessary since MLIR has already been chosen as the framework for building the Fortran IR (FIR) for Flang. By using MLIR for OpenMP we have a common representation for OpenMP and Fortran constructs in the MLIR framework and thereby take advantage of optimisations and avoid black boxes.
b) OpenMP IRBuilder [5]: For sharing of OpenMP codegen with Clang.
ii) Current and Proposed Flow
a) The current sequential code flow in Flang (slide 5 of the presentation [1]) can be summarised as follows:
[Fortran code] → Parser → [AST] → Lowering → [FIR MLIR] → Conversion → [LLVM MLIR] → Translation → [LLVM IR]
b) The modified flow with OpenMP (slide 10) lowers the AST to a mix of the FIR and OpenMP dialects. These are then optimised and finally converted to a mix of the OpenMP and LLVM MLIR dialects. The mix is translated to LLVM IR using the existing translation library for LLVM MLIR and the OpenMP IRBuilder, which is currently under construction.
[Fortran code] → Parser → [AST] → Lowering → [FIR + OpenMP MLIR] → Conversion → [LLVM + OpenMP MLIR] → Translation (Use OpenMP IRBuilder) → [LLVM IR]
c) The MLIR infrastructure provides many optimisation passes for loops, and it is desirable to take advantage of some of them. The LLVM infrastructure also provides several optimisations, so it is an open question where each optimisation should be carried out. We will choose the framework only after some experimentation. If experiments show that an OpenMP construct or clause (e.g. collapse) can be handled fully in MLIR and that MLIR is the best place to do it, then we will not use the OpenMP IRBuilder for it.
iii) OpenMP MLIR dialect
The proposed plan involves writing an MLIR dialect for OpenMP. Operations of the dialect will be a mix of coarse-grained and fine-grained, e.g. coarse: omp.parallel, omp.target; fine: omp.flush. Operations in MLIR can have regions, hence there is no need for outlining at the MLIR level. While the detailed design of the dialect is TBD, the next section provides links to walkthrough examples which summarise the full flow as well as the use of MLIR operations for OpenMP directives and of attributes for representing clauses which are constant. The proposed plan involves a) lowering the F18 AST with OpenMP directly to a mix of the OpenMP and FIR dialects, and b) finally converting this to a mix of the OpenMP and LLVM dialects. This requires that the OpenMP dialect can coexist and interoperate with other dialects. The design is also intended to be modular so that other frontends (C/C++) can reuse the OpenMP dialect in the future.
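To make the region and coexistence points concrete, here is a toy Python model, not MLIR itself. The Op class, the num_threads attribute, the callee name and the rename-based conversion are all assumptions made for illustration; a real FIR to LLVM conversion is far more than a rename. The sketch only shows that the parallel body stays nested inside omp.parallel (no outlined function) and that converting the FIR operations leaves the OpenMP operations untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Toy operation: dialect-qualified name, constant attributes, nested regions."""
    name: str
    attrs: dict = field(default_factory=dict)
    regions: list = field(default_factory=list)  # each region is a list of Op

    @property
    def dialect(self):
        return self.name.split(".", 1)[0]


def convert_fir_to_llvm(op):
    """Rename fir.* ops to llvm.* (illustrative only); omp.* ops are kept as they are."""
    if op.dialect == "fir":
        new_name = "llvm." + op.name.split(".", 1)[1]
    else:
        new_name = op.name
    new_regions = [[convert_fir_to_llvm(inner) for inner in region]
                   for region in op.regions]
    return Op(new_name, dict(op.attrs), new_regions)


def dialects_used(op):
    """Collect every dialect that appears in op or in its nested regions."""
    found = {op.dialect}
    for region in op.regions:
        for inner in region:
            found |= dialects_used(inner)
    return found


# After lowering: the body of the parallel construct is a region of omp.parallel.
# "num_threads" stands in for a clause with a constant value (hypothetical attribute).
lowered = Op(
    "omp.parallel",
    attrs={"num_threads": 4},
    regions=[[Op("fir.call", attrs={"callee": "work"}), Op("omp.barrier")]],
)

# After conversion: the same structure, now mixing the OpenMP and LLVM dialects.
converted = convert_fir_to_llvm(lowered)
```

In this model the translation step would be the first point at which omp.parallel needs an outlined function, which is where the OpenMP IRBuilder comes in.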
iv) Examples
A few walkthroughs have been sent before [3]. They illustrate, each with an example, the flow for a few constructs and clauses (parallel, target, collapse, simd). The parallel and target constructs will use the OpenMP IRBuilder for lowering to LLVM IR. MLIR offers infrastructure for loop transformations, hence for the collapse clause the transformation is done inside the MLIR framework. While both LLVM and MLIR offer infrastructure for vectorisation, the LLVM vectoriser is more mature and hence LLVM is preferred for simd. For details refer to the links [3].
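As a reminder of what the collapse transformation amounts to, the following Python sketch shows the index arithmetic for collapsing a two-level loop nest into a single loop. It assumes a rectangular nest with zero-based, unit-stride loops, and the function names are made up for illustration; it is not how an MLIR pass is written, only the arithmetic such a pass has to reproduce.

```python
def nested_iterations(n, m):
    """Iteration order of the original two-level loop nest."""
    return [(i, j) for i in range(n) for j in range(m)]


def collapsed_iterations(n, m):
    """Single loop over n * m iterations, recovering (i, j) from the flat index."""
    result = []
    for k in range(n * m):
        i, j = divmod(k, m)  # i = k // m, j = k % m
        result.append((i, j))
    return result
```

The single loop exposes n * m iterations to the worksharing schedule instead of n, which is the point of the clause.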
v) Progress
a) OpenMP MLIR
→ Sent a few walkthroughs which illustrate the flow from AST to LLVM IR.
→ First patch [6] which registers the OpenMP dialect with MLIR has been submitted.
→ Design of a minimal dialect with a single construct (barrier) is in progress.
b) OpenMP IRBuilder
→ Johannes Doerfert has a series of patches [7] introducing preliminary support for the OpenMP IRBuilder; these are either approved or under review. The initial set adds support for the parallel and barrier constructs.
→ Others (Roger Ferrer, Kiran) have tried it for other constructs like taskwait and flush.
vi) Next Steps
→ Send the design to the MLIR mailing list to get approval and to enable code review.
→ Implement the accepted plan on a construct-by-construct basis, starting with the parallel construct.