This Thursday, October 6th (9am California Time, 16:00 UTC), @krzysz00 will present the coordinate transformation system used by AMD’s high-performance GPU code generator at the core of the rocMLIR project. The project currently focuses on convolution and matrix multiplication kernels.
The core of the system is a variation on affine maps that includes additional metadata about their semantics. This allows rocMLIR to represent concepts that cannot be expressed with affine maps, such as implicit padding, and, conversely, enables simplified static analysis of maps in situations such as bounds check elimination and vectorization. In addition to presenting these variant maps, they will describe the transforming_for looping construct and how it enables more efficient code generation through improved loop unrolling.
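To make the padding point concrete, here is a minimal Python sketch of the general idea. It is not rocMLIR's actual representation: the `Pad` class, its fields, and `load_padded` are hypothetical names chosen for illustration. The index arithmetic of padding is affine, but "this coordinate reads as zero" is extra information that a plain affine map cannot carry. Tagging the transform with its kind also makes the bounds-check question easy to answer statically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pad:
    """Hypothetical transform: index in the padded view -> underlying index."""
    left: int   # number of implicit zeros before the data
    right: int  # number of implicit zeros after the data
    size: int   # length of the underlying (unpadded) dimension

    def apply(self, i):
        # The arithmetic is affine; the validity flag is the part that
        # an affine map alone has no way to express.
        j = i - self.left
        return j, 0 <= j < self.size

    def needs_bounds_check(self):
        # Because the transform kind and its parameters are known,
        # a zero-width pad can be proven to never go out of bounds.
        return self.left > 0 or self.right > 0


def load_padded(data, pad, i):
    """Read element i of the padded view; padded positions yield 0."""
    j, valid = pad.apply(i)
    if pad.needs_bounds_check() and not valid:
        return 0
    return data[j]


data = [10, 20, 30]
padded_view = [load_padded(data, Pad(1, 2, 3), i) for i in range(6)]
```

With one element of padding on the left and two on the right, `padded_view` is `[0, 10, 20, 30, 0, 0]`, and a `Pad(0, 0, 3)` reports that no bounds check is needed.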
Their aim is both to highlight code patterns that do not appear to be easily expressible in MLIR’s core dialects and to get feedback about infrastructure in the core that they may be failing to take advantage of.
Meeting ID: 851 5109 0498