Summary of F18 OpenMP design

Hi Kiran,

Thank you for the detailed explanation and investigation of the OpenMP framework for lowering and code-generation. It all looks good to me, in particular the introduction of an OpenMP MLIR dialect and the placement of OpenMP IRBuilder in the overall scheme.

Brian’s proposal for OpenMP symbol tables says that explicit symbol table entries will be created for entities that have different attributes inside of OpenMP regions. E.g. if there’s an X in an outer scope that becomes thread private, then there’s an additional symbol table entry in the OpenMP region scope for that symbol. That new symbol has the additional OpenMP attributes and a link back to the original entity in the outer scope. We think this organization will simplify outlining.
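A minimal sketch of that organization, to make the link-back idea concrete. The `Symbol`, `Scope`, `declareOmp`, and `ultimate` names are hypothetical and chosen only for illustration; they are not F18's actual classes or Brian's proposed interface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical types for illustration only; not F18's real symbol table.
enum class OmpAttr { None, Private, ThreadPrivate };

struct Symbol {
  std::string name;
  OmpAttr ompAttr = OmpAttr::None;  // attribute that applies inside the region
  const Symbol *host = nullptr;     // link back to the entity in the outer scope
};

struct Scope {
  Scope *parent = nullptr;
  std::map<std::string, Symbol> symbols;

  // Look a name up in this scope first, then in the enclosing scopes.
  const Symbol *lookup(const std::string &name) const {
    for (const Scope *s = this; s != nullptr; s = s->parent) {
      auto it = s->symbols.find(name);
      if (it != s->symbols.end()) return &it->second;
    }
    return nullptr;
  }

  // Create a region-local entry for an outer entity whose attributes
  // differ inside this OpenMP region.
  const Symbol *declareOmp(const std::string &name, OmpAttr attr) {
    assert(parent && "an OpenMP region scope needs an enclosing scope");
    const Symbol *outer = parent->lookup(name);
    assert(outer && "the entity must already exist in an outer scope");
    symbols[name] = Symbol{name, attr, outer};
    return &symbols[name];
  }
};

// Follow the host links back to the original entity.
const Symbol *ultimate(const Symbol *s) {
  while (s != nullptr && s->host != nullptr) s = s->host;
  return s;
}
```

With this shape, a lookup of X inside the region finds the region-local entry and its OpenMP attribute, while the `host` link still identifies the original X. That is the information an outliner would need to decide how X is passed to the outlined function.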

I don’t recall much discussion about outlining. Do you think OpenMP IRBuilder will drive the decisions about when and what to outline? After outlining, will the compiler be able to run the outlined code back through the MLIR+ transformations and optimizations? Are there situations where the outlined code presents more opportunities for optimization?

- Steve

Hello Steve,

Thanks for your mail.

Yes, we will use the OpenMP IRBuilder to outline. I mentioned this briefly in the parallel construct example (included below for your reference). Note that using the OpenMP IRBuilder will be the last step before LLVM IR generation. After this point there is no going back to MLIR, so any transformation or optimisation in MLIR has to be done before this step. This can involve constant propagation and other code motion to reduce the bad effects that outlining brings, but this is new code that will have to be written.

Example 1: Parallel construct

  1. Example OpenMP code

     !$omp parallel
     c = a + b
     !$omp end parallel

  2. Parse tree (omitted)

  3. The first lowering will be to a mix of FIR dialect and OpenMP dialect. The OpenMP dialect has an operation called parallel with a nested region of code. The nested region will have FIR (and standard dialect) operations.

     Mlir.region(…) {
       %1 = fir.x(…)
       %20 = omp.parallel {
         %1 = addf %2, %3 : f32
       }
       %21 =
     }

  4. The next lowering will be to the OpenMP and LLVM dialects.

     Mlir.region(…) {
       %1 = llvm.xyz(…)
       %20 = omp.parallel {
         %1 = llvm.fadd %2, %3 : !llvm.float
       }
       %21 =
     }

  5. The next conversion will be to LLVM IR. Here the OpenMP dialect will be lowered using the OpenMP IRBuilder and the translation library of the LLVM dialect. The IRBuilder will see that there is a region under the OpenMP construct omp.parallel. It will collect all the basic blocks inside that region and then generate outlined code using those basic blocks. Suitable calls to the OpenMP runtime API will be inserted.

     define @outlined_parallel_fn(…)
     {
       %1 = fadd float %2, %3
     }

     define @xyz(…)
     {
       %1 = alloca float
       call @__kmpc_fork_call(…, @outlined_parallel_fn, …)
     }
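To make the effect of outlining concrete, here is a rough C++ analogue of the generated code. It is only a sketch under assumptions: `Captured`, `OutlinedFn`, and `fork_call` are hypothetical stand-ins, `fork_call` is not the real runtime entry point and does not reproduce its signature, and it runs the team members one after another where a real runtime would run them concurrently.

```cpp
// Hypothetical stand-ins for illustration; not the OpenMP runtime API.
struct Captured {      // variables the region shares with its caller
  const float *a;
  const float *b;
  float *c;
};

using OutlinedFn = void (*)(int threadId, void *shared);

// The body of the parallel region, moved into its own function.
// Shared variables are now reached through pointers, which is one
// reason optimisations are easier before outlining than after.
void outlined_parallel_fn(int /*threadId*/, void *shared) {
  auto *cap = static_cast<Captured *>(shared);
  *cap->c = *cap->a + *cap->b;  // c = a + b
}

// Stand-in for the runtime fork call: invokes the outlined function
// once per team member (sequentially here, for simplicity).
void fork_call(int numThreads, OutlinedFn fn, void *shared) {
  for (int t = 0; t < numThreads; ++t) fn(t, shared);
}

// The original function after outlining: it packs the shared
// variables and hands the outlined function to the runtime.
float xyz(float a, float b, int numThreads) {
  float c = 0.0f;
  Captured cap{&a, &b, &c};
  fork_call(numThreads, outlined_parallel_fn, &cap);
  return c;
}
```

For example, `xyz(1.5f, 2.25f, 4)` returns `3.75f`. If `a` and `b` were known constants, propagating them into the region before outlining would remove two of the three captured pointers, which is the kind of MLIR-level clean-up mentioned above.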

Regards,
Kiran