This Thursday (9am California Time, 16:00 UTC), Navdeep Katel (Indian Institute of Science and PolyMage Labs) will present early results about high-performance code generation for GPU tensor cores using MLIR.
As usual the information to join the meeting:
+1 218-301-8485 PIN: 255 745#
I’ll also update this thread with slides and recording after the meeting.
- 2021-08-26: High Performance GPU Tensor Core Code Generation for Matmul Using MLIR ; slides - recording
It seems the recording link is a private video as of now. I might be getting overzealous about sharing it with others, so sorry if I’m jumping the gun.
Thanks for reporting, I fixed the recording visibility now!
This report describes very exciting work enabling MLIR to support tensor cores.
I have a question about how to reproduce the matmul experiment from this work, especially the tensor core part.
Should I run the experiment using the latest MLIR or the MLIRX repo?
Please use the latest MLIR tree from the official git repo. The GPU wmma ops needed (and some more with generalizations) are available therein. Not all of the passes used in the report are upstream but you should be able to experiment and run through some examples for JIT-based execution.
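For reference, the upstream GPU dialect exposes wmma-style ops roughly along these lines. This is only an illustrative sketch (the SSA names, shapes, and a 16x16x16 f16 tile are assumptions on my part); the exact op syntax and attributes should be checked against the GPU dialect documentation in the tree you build:

```mlir
// Load 16x16 f16 operand tiles, multiply-accumulate, and store the result.
// %A, %B, %C are memrefs and %c0 a constant index defined elsewhere.
%a = gpu.subgroup_mma_load_matrix %A[%c0, %c0] {leadDimension = 16 : index}
       : memref<16x16xf16> -> !gpu.mma_matrix<16x16xf16, "AOp">
%b = gpu.subgroup_mma_load_matrix %B[%c0, %c0] {leadDimension = 16 : index}
       : memref<16x16xf16> -> !gpu.mma_matrix<16x16xf16, "BOp">
%c = gpu.subgroup_mma_load_matrix %C[%c0, %c0] {leadDimension = 16 : index}
       : memref<16x16xf16> -> !gpu.mma_matrix<16x16xf16, "COp">
%d = gpu.subgroup_mma_compute %a, %b, %c
       : !gpu.mma_matrix<16x16xf16, "AOp">, !gpu.mma_matrix<16x16xf16, "BOp">
       -> !gpu.mma_matrix<16x16xf16, "COp">
gpu.subgroup_mma_store_matrix %d, %C[%c0, %c0] {leadDimension = 16 : index}
       : !gpu.mma_matrix<16x16xf16, "COp">, memref<16x16xf16>
```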
Thank you very much for the quick reply.
One more question, about the 2-level loop tiling in this work.
If I use the latest MLIR repo to run the matmul-with-tensor-cores experiment, how should I implement the 2-level loop tiling?
I have thought of two methods. The first is to use the affine-loop-tile pass with the tile sizes. The second is, like your hop paper, to construct a matmul_tiled.mlir that has already been tiled. Which one should I choose?
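For what it’s worth, the first approach can be sketched by running the upstream affine tiling pass twice, with outer and inner tile sizes. The file names and tile sizes below are purely hypothetical, and the exact option spelling may differ across MLIR versions (check `mlir-opt --help`):

```
# Outer tiling followed by a second, inner tiling (sizes are illustrative).
mlir-opt matmul.mlir \
    -affine-loop-tile="tile-sizes=128,128,64" \
    -affine-loop-tile="tile-sizes=32,32,32" \
    -o matmul_tiled.mlir
```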
I have the same question. In the report, it seems that all the transformations were written in MLIR. However, I haven’t found the .mlir file that contains those transformations. Did I miss something?