Sparse Tensors in MLIR

Hi Arun,

Note that there is already a lowering to library calls (mainly cuSPARSE, but any external sparse library in principle). The MLIR sparsifier has two ways of generating GPU code, which we call “codegen” and “libgen”. Both paths can be found in SparseGPUCodegen.cpp:

“codegen”: populateSparseGPUCodegenPatterns
“libgen”: populateSparseGPULibgenPatterns

Both paths are there for experimentation: the direct codegen path converts outermost parallel loops into GPU kernels, and the libgen path converts common operations (SpMV, SpMM, SpGEMM, 2:4 SpMM, SDDMM) into GPU-specific ops that are later lowered into library calls. You can find that lowering in, e.g., GPUToLLVMConversion.cpp, with wrappers in CudaRuntimeWrappers.cpp.

You can enable the libgen path by setting num-threads to zero, as follows (for any nonzero value, the codegen path is taken):

--sparse-gpu-codegen="num-threads=0"
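
To make the flag concrete, here is a minimal sketch of how one might try the libgen path on a small SpMV kernel. The file name spmv.mlir, the function name @spmv, and the CSR encoding are assumptions chosen for illustration, and the encoding syntax varies across MLIR versions. The extra --linalg-generalize-named-ops pass is also an addition: as far as I know the libgen rewrite matches generic linalg ops, so the named op is generalized first. Whether the rewrite actually fires depends on your MLIR build.

```shell
# Write a small SpMV kernel with a CSR-encoded matrix.
# (File name, function name, and encoding are illustrative assumptions.)
cat > spmv.mlir <<'EOF'
#CSR = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed) }>

func.func @spmv(%A: tensor<?x?xf64, #CSR>,
                %x: tensor<?xf64>,
                %y: tensor<?xf64>) -> tensor<?xf64> {
  %0 = linalg.matvec
         ins(%A, %x : tensor<?x?xf64, #CSR>, tensor<?xf64>)
         outs(%y : tensor<?xf64>) -> tensor<?xf64>
  return %0 : tensor<?xf64>
}
EOF

# num-threads=0 selects the libgen path; any nonzero value selects codegen.
# Only invoke the tool if it is actually on PATH.
if command -v mlir-opt >/dev/null 2>&1; then
  mlir-opt spmv.mlir \
    --linalg-generalize-named-ops \
    --sparse-gpu-codegen="num-threads=0" \
    || echo "mlir-opt rejected the input; check your MLIR version" >&2
fi
```

If the rewrite applies, the output should contain GPU dialect sparse ops in place of the linalg op; changing num-threads to a nonzero value exercises the codegen path on the same input instead.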

You can find more information and examples in this prior GPU posting.

Overall, I think the libgen path performs much better, since the codegen path never really took off. You are of course welcome to improve either the codegen or the libgen path (or both), but please don’t add libgen to codegen, since that would break the underlying design.