MLIR Support for Sparse Tensors

I have been asked offline to give a bit more detail on how to use the sparse compiler in its current state (note that many refinements are still ongoing!).

First, you will need to patch in the pending revision, or wait until it has been submitted.

Then, define your kernel using the Linalg dialect as if all tensors are dense, but annotate the ones that are sparse. For a very simple sum-reduction, this would require the following trait.

#trait_sum_reduce = {
  indexing_maps = [
    affine_map<(i,j) -> (i,j)>, // A
    affine_map<(i,j) -> ()>     // x (out)
  ],
  sparse = [
    [ "D", "S" ], // A
    [          ]  // x
  ],
  iterator_types = ["reduction", "reduction"],
  doc = "x += A(i,j)"
}

This informs the sparse compiler to treat matrix A with a dense outer dimension and a sparse inner dimension. Dense-sparse effectively yields a CSR format; the other possibilities are sparse-sparse for a DCSR format, sparse-dense for a dense-rows-only format, and dense-dense for a completely dense matrix (permuting the dimensions is still TBD). A DCSR variant is sketched right after the kernel below.

Then define the kernel itself, where the dimensions of the tensor are not known at compile time (indicated by “?x?xf64”). Also, since MLIR does not support a sparse tensor type natively yet, some glue is required to feed the argument back into tensor land.

!SparseTensor = type !llvm.ptr<i8>

  // Sum-reduces all elements of sparse matrix A into the scalar x.
  func @kernel_sum_reduce(%argA: !SparseTensor,
                          %argx: tensor<f64>) -> tensor<f64> {
    // Glue: cast the opaque pointer back into tensor land.
    %arga = linalg.sparse_tensor %argA : !SparseTensor into tensor<?x?xf64>
    %0 = linalg.generic #trait_sum_reduce
      ins(%arga: tensor<?x?xf64>)
      outs(%argx: tensor<f64>) {
      ^bb(%a: f64, %x: f64):
        %sum = addf %x, %a : f64
        linalg.yield %sum : f64
    } -> tensor<f64>
    return %0 : tensor<f64>
  }
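
As mentioned above, the other storage formats merely require different annotations. For illustration, a DCSR-style format would annotate both dimensions of A as sparse, leaving everything else in the trait unchanged:

  sparse = [
    [ "S", "S" ], // A (sparse-sparse yields DCSR)
    [          ]  // x
  ],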

Then, in the calling function, first set up the sparse matrix storage scheme from file as follows. The annotations passed in the array must match the trait above (so dense-sparse in this case). In the future this will of course be matched automatically, but lacking a full sparse type system, this kind of glue is still required.

!Filename     = type !llvm.ptr<i8>

    %c0 = constant 0 : index
    %c1 = constant 1 : index
    %c2 = constant 2 : index

    // Annotate the two dimensions as dense-sparse (CSR), matching the trait.
    %annotations = alloc(%c2) : memref<?xi1>
    %sparse = constant true
    %dense  = constant false
    store %dense, %annotations[%c0] : memref<?xi1>
    store %sparse, %annotations[%c1] : memref<?xi1>

    // Get the full filename from the TENSOR0 environment variable and
    // read the file into the sparse storage scheme (both library calls).
    %fileName = call @getTensorFilename(%c0) : (index) -> (!Filename)

    %a = call @newSparseTensor(%fileName, %annotations)
        : (!Filename, memref<?xi1>) -> (!SparseTensor)

The value %a contains an opaque pointer that can be passed into the sparse kernel.

    %0 = call @kernel_sum_reduce(%a, %x)
      : (!SparseTensor, tensor<f64>) -> tensor<f64>
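
To verify the result, the returned tensor can be bufferized and printed. The following is a minimal sketch, assuming the standard and vector dialect ops of the same vintage as the snippets above (the exact op spellings may differ in your revision):

    // Print the resulting scalar for verification.
    %m = tensor_to_memref %0 : memref<f64>
    %v = load %m[] : memref<f64>
    vector.print %v : f64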

Then you can invoke the sparse compiler with bufferization and lowering to LLVM as follows (more flags anyone? :-).

mlir-opt sparse_kernel.mlir \
  --test-sparsification="lower" \
  --convert-linalg-to-loops \
  --func-bufferize --tensor-constant-bufferize --tensor-bufferize --finalizing-bufferize \
  --convert-scf-to-std --convert-vector-to-llvm --convert-std-to-llvm > sparse.ll

And run this with the JIT compiler on a CPU as follows.

TENSOR0="<full path to tensor file in Matrix Market or FROSTT format>" \
mlir-cpu-runner -e entry -entry-point-result=void sparse.ll
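
For reference, a Matrix Market file in coordinate format starts with a header line, followed by a size line (rows, columns, number of nonzeros) and one line per nonzero element, using 1-based indices. A minimal 3x3 example with two nonzeros:

%%MatrixMarket matrix coordinate real general
3 3 2
1 1 1.0
3 2 4.0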

That’s it. For comparison, here are the running times of this kernel on fidap011, a 16,614x16,614 Matrix Market sparse matrix with 1,091,362 nonzero elements, for the four possible ways of annotating the matrix dimensions as dense/sparse (again, permuting the dimensions TACO-style is still TBD).

annotation      runtime (microsecs)
dense-dense          282,603
sparse-dense         282,734
dense-sparse           1,052     (huge drop due to exploiting sparsity)
sparse-sparse          1,074     (no more benefit, all rows are filled)

I hope this brief explanation is useful (you can find the example in the revision above).
