What is the motivation behind how OpenMP constructs are lowered to FIR? For those constructs that contain executable code (like loops or critical), we seem to create some skeleton in Lower/OpenMP.cpp, then fill it out in Lower/Bridge.cpp, e.g.
Why not have the OpenMPConstruct be lowered entirely (edit: inside of genOMPConstruct), instead of doing it in parts? Was there a reason for that, or did it just happen this way?
Because the Lower/OpenMP.cpp file is mainly dealing with the creation of OpenMP operations from the OpenMP dialect. Construct that have region needs to use the FIR lowering to lower the Fortran code inside those region.
Right, but why doesnāt the lowering code in OpenMP.cpp use FIR lowering from Bridge? The OMP and non-OMP code is intertwined, so it would make sense to lower the body of a block from an OpenMP construct that contains that block.
Iām not sure I get what you mean here. The Fortran code contained in an OpenMP construct is lowered by the Bridge. It is possible that the code in Lower/OpenMP.cpp produces some FIR operations but thatās OpenMP specific. The OpenACC lowering has the same design.
Right, but why doesnāt the lowering code in OpenMP.cpp use FIR lowering from Bridge?
Visible in OpenMP.cpp (and other dialect-specific files), yes.
That would help avoid the saving/restoring/resetting of the insertion point. It would also provide a natural context for lowering nested constructs (specifically in target, which requires special handling of host symbols, for example).
That would help avoid the saving/restoring/resetting of the insertion point.
You would do that anyway because you need to move the builder inside the region when you generate operation with region and then put the builder after the op once the region has been generated.
For this particular case, I think this is needed because privatization is done during lowering and it has no representation in the dialect. In OpenACC we donāt have such stuff.
Anyway I donāt have a strong opinion on this but if you want to change this you probably need to open a RFC and get feedback from the people contributing to the OpenMP lowering.
Basically, if we create anything inside the region then we will have to set/reset the insertion point. For eg. I think omp.target operations are implemented as isolated from above with block arguments. The
block arguments are the new bindings for symbols and have hlfir.declares generated for them in the region. If our representation for privatisation is to use block arguments in the region of privatisation they will have the same issue I think. For privatisation specifically, we will move to some higher level representation for tasks.
@kparzysz I donāt know whether you are facing any specific issue or this was a question just for information.
I want to change it to do what I described aboveāfor each construct, generate the whole thing recursively, instead doing it one part at the time. I think it would make the code much clearer, but I wanted to see if there are any compelling reasons for the current approach.
Iām actually working on dealing with firstprivate inside of target, and the privatization inside DataSharingProcessor implementation doesnāt handle it right (at least in my simple testcase), specifically due to the āisolation from aboveā.
One reason is to keep the lowerings separate and to disallow OpenMP code generation performing generation of Fortran constructs.
What is your proposal here?
ā Move genOmp functions to FirConverter ?
ā Derive from the FirConverter to create an OpenMPFIRConverter? And create the right converter based on whether the -fopenmp flag is set?
ā Or something else?
Will recursive generation fix the issue fully for you? Or will you have to make large scale changes to the DataSharingProcessor? Since the āisolated from aboveā situation leads to the usage of block-arguments, you could also consider representing this in the dialect. Trivial types (scalars, static-arrays) can be allocaād later while interfacing with the OpenMPIRBuilder, others will need more information which can be based on the OpenACC recipe generation or wrappers to createHostAssociateVarClone and copyHostAssociateVar procedures.
Iāve thought of the second approach, i.e. converter subclass, or perhaps some other mechanism to allow combining additional converters in the future.
I donāt have a specific proposal yetāmy first goal here is to understand why the code is written this way to make sure Iām not missing something that is not evident in the code. In the first sentence I quoted above you wrote that disallowing OpenMP lowering to generate Fortran constructs was intentionalāwhat was the motivation for this?
The way I see generating code for omp target is something like this (roughly):
genOpenMPTarget(OpenMPConstruct &omp) {
localSymbols.pushScope(IsolatedFromAbove); // indicate that it's isolated
process-directives();
genOp<omp::TargetOp>(...)
genFIR(eval); // generate the contents recursively
localSymbols.popScope();
}
If we overload symbol handling (lookup, cloning, etc.) in the extended converter, we could generate appropriate code on demand, instead of having to pre-collect lists of symbols. The handling would then do the right thing for the context (i.e. generating symbol clones across target boundary, vs. on the same host).
The (meta) benefit of doing lowering ārecursivelyā is that the code organization follows the structure of the IR being lowered.
That follows from the intention to keep the OpenMP lowering separate from the Fortran lowering. If we think about the other implementation options which are lumping all OpenMP codegen into FIRConverter, or having further derived classes that need to be selected, I think the current organization is a good choice.
I will tag @schweitz to check whether the reasons are different.
We have FirOpBuilder that extends mlir::OpBuilder. At the same time weāre keeping the building of FIR separate from building the OpenMP extensions to it. Itās the same principle in both cases.
True, but weāre lowering a post-processed Fortran parse tree into [HL]FIR. The OpenMP lowering still relies on functions/types from both, so itās not āpureā in that sense.
What I want is to call the FIR builder from the OMP lowering code. The FIR generation would still be encapsulated in the FirOpBuilder, it would just be called from OpenMP.cpp.
FWIW we have no issue with handling firstpriavte for OpenACC with the currrent lowering and this is mainly due because we have a representation of it in the dialect so the heavy lifting is not done during lowering. So you might want to explore this path too.