Some more 2022 news, as announced at the end of the Sparse Tensors in MLIR thread: after revision D117850, MLIR’s sparse compiler reference implementation now supports full sparse tensor I/O for proper testing and comparison (on a target with a file system this means file I/O, but the ops are kept a bit more general to support other kinds of I/O in the future).
Consider, for example, the following PyTACO program.
import pytaco as pt

# CSR format: rows stored dense, columns stored compressed.
csr = pt.format([pt.dense, pt.compressed], [0,1])
...
A = pt.read('A.mtx', csr)                     # read sparse matrix A
B = pt.read('B.mtx', csr)                     # read sparse matrix B
C = pt.tensor((A.shape[0], B.shape[1]), csr)  # declare sparse output matrix
i, j, k = pt.get_index_vars(3)
C[i,j] = A[i,k] * B[k,j]                      # sparse matrix multiplication
pt.write("C.tns", C)                          # write result back to file
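As a side note, the very same program can also be run through the in-tree MLIR-PyTACO implementation by only swapping the import; the module path below is an assumption based on the layout of the MLIR test tree (mlir/test/Integration/Dialect/SparseTensor/taco).

# Use the MLIR-PyTACO implementation instead of the TACO Python bindings
# (path is illustrative; it assumes running from the taco test directory).
from tools import mlir_pytaco_api as pt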
The same computation can be expressed in MLIR as follows.
%A = sparse_tensor.new %srcA : !Filename to tensor<?x?xf64, #CSR>
%B = sparse_tensor.new %srcB : !Filename to tensor<?x?xf64, #CSR>
// allocation of the empty sparse output tensor %C0 omitted for brevity
%C = linalg.matmul ins(%A, %B : tensor<?x?xf64, #CSR>,
                                tensor<?x?xf64, #CSR>)
                  outs(%C0 : tensor<?x?xf64, #CSR>) -> tensor<?x?xf64, #CSR>
sparse_tensor.out %C, %destC : tensor<?x?xf64, #CSR>, !Filename
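Here, the #CSR attribute and the !Filename type alias are assumed to be defined elsewhere in the module, roughly along the following lines (a minimal sketch; the exact attribute spelling and pointer type vary between MLIR versions).

#CSR = #sparse_tensor.encoding<{
  dimLevelType = [ "dense", "compressed" ]
}>
!Filename = type !llvm.ptr<i8>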
Now we can compare the result computed by the TACO compiler with the result computed by the MLIR sparse compiler.
For example, given the input “sparse” matrices
    [ 1 2 3 ]        [ 10 11 12 ]
A = [ 4 5 6 ]    B = [ 13 14 15 ]
    [ 7 8 9 ]        [ 16 17 18 ]
both PyTACO and MLIR generate the following output file “C.tns” (except that MLIR uses the extended FROSTT format).
; extended FROSTT format
2 9
3 3
1 1 100
1 2 107
1 3 114
2 1 201
2 2 216
2 3 231
3 1 318
3 2 342
3 3 366
Happy testing!