Thanks, Eric, for the clarification.
Also, I am sharing this write-up of the flow through the compiler for an OpenMP construct. The first part (Proposed plan) follows the presentation. The second part (Modified plan) incorporates Eric's feedback to lower the F18 AST to a mix of the OpenMP and FIR dialects.
I. Proposed plan
1. Example OpenMP code
!$omp parallel
c = a + b
!$omp end parallel
2. Parse tree (relevant section copied from -fdebug-dump-parse-tree)
| | ExecutionPartConstruct -> ExecutableConstruct -> OpenMPConstruct -> OpenMPBlockConstruct
| | | OmpBlockDirective -> Directive = Parallel
| | | OmpClauseList ->
| | | Block
| | | | ExecutionPartConstruct -> ExecutableConstruct -> ActionStmt -> AssignmentStmt
| | | | | Variable -> Designator -> DataRef -> Name = 'c'
| | | | | Expr -> Add
| | | | | | Expr -> Designator -> DataRef -> Name = 'a'
| | | | | | Expr -> Designator -> DataRef -> Name = 'b'
| | | OmpEndBlockDirective -> OmpBlockDirective -> Directive = Parallel
3. The first lowering will be to the FIR dialect, which has a pass-through operation for OpenMP. This operation has a nested region that contains the code influenced by the OpenMP directive. The contained region will have other FIR (or standard dialect) operations.
Mlir.region(…) {
%1 = fir.x(…) …
%20 = fir.omp attribute:parallel {
%1 = addf %2, %3 : f32
}
%21 =
…
}
4. The next lowering will be to the OpenMP and LLVM dialects. The OpenMP dialect has an operation called parallel with a nested region of code. The nested region will have LLVM dialect operations.
Mlir.region(…) {
%1 = llvm.xyz(…) …
%20 = omp.parallel {
%1 = llvm.fadd %2, %3 : !llvm.float
}
%21 =
…
}
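To make the region-preserving nature of this lowering concrete, here is a minimal Python sketch that rewrites the FIR-level form into the OpenMP + LLVM dialect form. It is a toy model and not MLIR's API: the `Op` class, the `lower` function, the `directive` attribute key, and the `RENAMES` table are all hypothetical names chosen for illustration. Only the `addf` to `llvm.fadd` mapping and the `fir.omp` to `omp.parallel` change are taken from the snippets above.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Toy stand-in for an MLIR operation; not the real MLIR API."""
    name: str
    attrs: dict = field(default_factory=dict)
    region: list = field(default_factory=list)  # nested ops, empty if none


# Hypothetical rename table; only addf -> llvm.fadd comes from the snippets.
RENAMES = {"addf": "llvm.fadd"}


def lower(op: Op) -> Op:
    """Rewrite one op and, recursively, every op in its nested region."""
    if op.name == "fir.omp" and op.attrs.get("directive") == "parallel":
        # The pass-through op becomes the OpenMP dialect's parallel op,
        # and its region is kept as a region (nothing is outlined yet).
        return Op("omp.parallel", region=[lower(inner) for inner in op.region])
    new_name = RENAMES.get(op.name, op.name)
    return Op(new_name, dict(op.attrs), [lower(inner) for inner in op.region])


before = Op("fir.omp", {"directive": "parallel"}, [Op("addf")])
after = lower(before)
print(after.name, [inner.name for inner in after.region])
# prints: omp.parallel ['llvm.fadd']
```

The point of the sketch is that the nesting survives the lowering: the parallel construct still owns its region, and only the operations inside it change dialect.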
5. The next conversion will be to LLVM IR. Here the OpenMP dialect will be lowered using the OpenMP IRBuilder and the translation library of the LLVM dialect. The IRBuilder will see that there is a region under the OpenMP construct omp.parallel. It will collect all the basic blocks inside that region and then generate outlined code from those basic blocks. Suitable calls to the OpenMP runtime API will be inserted.
define @outlined_parallel_fn(…)
{
…
%1 = fadd float %2, %3
…
}
define @xyz(…)
{
%1 = alloca float
…
call __kmpc_fork_call(…, outlined_parallel_fn, …)
}
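The outlining step can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the OpenMP IRBuilder: the real builder works on LLVM basic blocks, whereas this sketch works on flat lists of made-up `Op` objects. The `Op` class, the `outline` function, and the numeric suffix on the outlined function name are hypothetical; `__kmpc_fork_call` is the fork entry point of the LLVM OpenMP runtime.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Toy stand-in for an operation; not the real MLIR or LLVM API."""
    name: str
    operands: list = field(default_factory=list)
    region: list = field(default_factory=list)


def outline(func_name, body):
    """Move each omp.parallel region into its own function.

    Returns a dict mapping function names to op lists. The naming scheme
    for outlined functions is made up for this sketch.
    """
    functions = {func_name: []}
    for op in body:
        if op.name == "omp.parallel":
            outlined = f"outlined_parallel_fn.{len(functions) - 1}"
            functions[outlined] = list(op.region)  # region body moves out
            # The construct itself is replaced by a runtime fork call
            # that receives the outlined function.
            call = Op("call", operands=["__kmpc_fork_call", outlined])
            functions[func_name].append(call)
        else:
            functions[func_name].append(op)
    return functions


module = outline("xyz", [Op("alloca"), Op("omp.parallel", region=[Op("fadd")])])
for name, ops in module.items():
    print(name, [op.name for op in ops])
# prints:
# xyz ['alloca', 'call']
# outlined_parallel_fn.0 ['fadd']
```

After the transformation the enclosing function no longer contains a region; the former region body lives in a separate function that the runtime invokes on each thread.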
II. Modified plan
The differences are only in steps 3 and 4. Other steps remain the same.
3. The first lowering will be to a mix of the FIR and OpenMP dialects. The OpenMP dialect has an operation called parallel with a nested region of code. The nested region will have FIR (and standard dialect) operations.
Mlir.region(…) {
%1 = fir.x(…) …
%20 = omp.parallel {
%1 = addf %2, %3 : f32
}
%21 =
…
}
4. The next lowering will be to the OpenMP and LLVM dialects.
Mlir.region(…) {
%1 = llvm.xyz(…) …
%20 = omp.parallel {
%1 = llvm.fadd %2, %3 : !llvm.float
}
%21 =
…
}
Thanks,
Kiran