One of my coworkers and I have revisions pending that have been sitting around for a while. I’d like to bring them to folks’ attention here, especially since it’s quite possible that automatic reviewer assignment picked the wrong people.
I know folks might be unavailable or less available because of the holidays, so I’m fine with the reviews happening, say, early next year. If that’s what does happen, I hope this post will serve as a reminder then.
The larger of these revisions is ⚙ D139739 [mlir] [tosa] Add a pass that partitions TOSA code into kernels that consist of a conv2d or similar anchor op and adjacent elementwise ops., the TOSA partitioner. This adds a pass that takes a TOSA model and splits it up into kernels that consist of a convolution or matmul (the set of anchor ops is configurable) and surrounding elementwise operations. This partitioning allows compilers to group parts of a model into kernels that can then be sent into a further compilation flow, and we believe that other MLIR users will find it useful.
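To make the intent concrete, here is a rough sketch of the kind of grouping the pass performs. The IR, function names, and the exact way the partition is outlined and called below are illustrative assumptions on my part, not the pass’s actual output format; see the revision for the real interface.

```mlir
// Before (illustrative): a matmul anchor with an adjacent elementwise op.
func.func @model(%a: tensor<1x16x32xf32>, %b: tensor<1x32x8xf32>) -> tensor<1x16x8xf32> {
  %0 = "tosa.matmul"(%a, %b) : (tensor<1x16x32xf32>, tensor<1x32x8xf32>) -> tensor<1x16x8xf32>
  %1 = "tosa.abs"(%0) : (tensor<1x16x8xf32>) -> tensor<1x16x8xf32>
  return %1 : tensor<1x16x8xf32>
}

// After partitioning (conceptually): the anchor and its elementwise neighbor
// end up together in a kernel function, and the original function calls it.
func.func private @model_kernel_0(%a: tensor<1x16x32xf32>, %b: tensor<1x32x8xf32>) -> tensor<1x16x8xf32> {
  %0 = "tosa.matmul"(%a, %b) : (tensor<1x16x32xf32>, tensor<1x32x8xf32>) -> tensor<1x16x8xf32>
  %1 = "tosa.abs"(%0) : (tensor<1x16x8xf32>) -> tensor<1x16x8xf32>
  return %1 : tensor<1x16x8xf32>
}
func.func @model(%a: tensor<1x16x32xf32>, %b: tensor<1x32x8xf32>) -> tensor<1x16x8xf32> {
  %0 = func.call @model_kernel_0(%a, %b) : (tensor<1x16x32xf32>, tensor<1x32x8xf32>) -> tensor<1x16x8xf32>
  return %0 : tensor<1x16x8xf32>
}
```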
The other set of revisions is ⚙ D139865 [mlir][GPU] Add known_block_size and known_grid_size to gpu.func and ⚙ D139866 [mlir][ROCDL] Translate known block size attributes to ROCDL, which allow annotating GPU functions with the fixed block and/or grid dimensions they will be launched with, when those are known. For example, if a kernel is formed by outlining the body of a gpu.launch that had constant block and grid size operands, we can attach those constants to the outlined kernel and use that information during optimization. The second revision threads the values of gpu.known_block_size and gpu.known_grid_size through to the AMDGPU backend, setting attributes that allow the compiler to optimize based on the known sizes and that, in the block size case, ensure the specified block sizes are actually used. I expect this to produce minor optimizations, especially if the other GPUTo* passes are updated to account for the size information.