This post aims to start a discussion on the design and implementation of the tile and unroll directives in the OpenMP dialect. The loop transformation constructs are specified in Section 9 of the OpenMP 5.2 specification document.
After an initial discussion with the Flang OpenMP community (thanks @kiranchandramohan from Arm), I came up with the following definition (only for tile; the same can be done for unroll):
def TileLoopOp : OpenMP_Op<"tile", [AttrSizedOperandSegments,
                   AutomaticAllocationScope, RecursiveSideEffects,
                   AllTypesMatch<["lowerBound", "upperBound", "step"]>,
                   ReductionClauseInterface]> {
  let summary = "tile construct";
  let description = [{
  }];
  let arguments = (ins Variadic<IntLikeType>:$lowerBound,
                       Variadic<IntLikeType>:$upperBound,
                       Variadic<IntLikeType>:$step,
                       UnitAttr:$inclusive,
                       Variadic<IntLikeType>:$tilesize);
  let regions = (region AnyRegion:$region);
  let assemblyFormat = [{
    `size` $tilesize
    `for` custom<LoopControl>($region, $lowerBound, $upperBound, $step,
                              type($step), $inclusive) attr-dict
  }];
  let hasVerifier = 1;
}
However, some cases need to be considered before the definition above can be finalized. One such case (from @kiranchandramohan) stacks tile on top of unroll (Fortran code):
!$omp tile
!$omp unroll
do i=1,100
end do
A second case stacks two tile directives on the same loop nest (C code):
#pragma omp tile sizes(4, 4)
#pragma omp tile sizes(5, 16)
for (int i = 0; i < 100; i++)
  for (int j = 0; j < 128; ++j)
    A[i][j] = i * 1000 + j;
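To make the nesting question concrete, here is a hand-written C sketch of what the two stacked tile directives could expand to. It assumes the inner directive (sizes 5 and 16) is applied first and the outer directive (sizes 4 and 4) then tiles the two tile loops that the inner one generates. All function and variable names are hypothetical, and this is only one reading of the semantics, not what any compiler actually emits.

```c
#include <string.h>

#define N 100
#define M 128

static int min_int(int a, int b) { return a < b ? a : b; }

/* Reference: the original, untransformed loop nest. */
static void fill_reference(int A[N][M]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; ++j)
            A[i][j] = i * 1000 + j;
}

/* Hand-expanded sketch (an assumption, not compiler output):
 * inner tile sizes(5, 16) is applied first, then outer tile sizes(4, 4)
 * tiles the two tile loops produced by the inner directive.
 * Returns the number of element writes performed. */
static int fill_tiled(int A[N][M]) {
    const int TI = 5, TJ = 16;           /* inner tile sizes */
    const int OI = 4, OJ = 4;            /* outer tile sizes */
    const int ni = (N + TI - 1) / TI;    /* number of inner tiles along i */
    const int nj = (M + TJ - 1) / TJ;    /* number of inner tiles along j */
    int writes = 0;

    /* Loops generated by the outer directive: tiles of tiles. */
    for (int ii0 = 0; ii0 < ni; ii0 += OI)
        for (int jj0 = 0; jj0 < nj; jj0 += OJ)
            for (int ii = ii0; ii < min_int(ii0 + OI, ni); ii++)
                for (int jj = jj0; jj < min_int(jj0 + OJ, nj); jj++)
                    /* Intra-tile loops generated by the inner directive. */
                    for (int i = ii * TI; i < min_int((ii + 1) * TI, N); i++)
                        for (int j = jj * TJ; j < min_int((jj + 1) * TJ, M); j++) {
                            A[i][j] = i * 1000 + j;
                            writes++;
                        }
    return writes;
}

/* Returns 1 if the tiled nest writes every element exactly as the
 * reference nest does, with one write per element. */
static int tilings_match(void) {
    static int ref[N][M];
    static int tiled[N][M];
    memset(tiled, 0xFF, sizeof tiled);   /* poison so missed writes show up */
    fill_reference(ref);
    int writes = fill_tiled(tiled);
    return writes == N * M && memcmp(ref, tiled, sizeof ref) == 0;
}
```

The expansion turns two source loops into six, and the outer directive only ever sees loops that the inner directive created. That is the dependency the op definition has to be able to express, whichever representation is chosen.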
Should the nesting be handled through a parent-child relationship (ParentOpType)? I also looked into other dialects: the Transform dialect supports loop transformations, but I am not sure whether we can benefit from that support here.