Note that this really is a continuation of two previous threads, Sparse Representation and MLIR Support for Sparse Tensors, but now concretely focusing on adding a sparse tensor type as a first-class citizen to core MLIR (viz. a SparseTensorType residing in IR/BuiltinTypes). The considerations below are meant to restart the discussion on a proper design.
The proposal is based on the conviction that sparsity should merely be a property of a tensor, not a tedious implementation detail. Starting with a program or kernel that purely operates on tensors as mathematical abstractions, annotating some of the tensors as “sparse” by means of this new type (including hints for a suitable storage scheme) will enable a sparse compiler to lower the higher-level abstraction into actual sparse code that takes advantage of sparsity to reduce computational and storage requirements. This approach was pioneered in the source-to-source sparse compiler MT1 for linear algebra and generalized to tensors in the Tensor Algebra Compiler (@fredrikbk). The COMET project (@Gkestor) used this idea for a DSL compiler.
The feasibility within MLIR, starting from a Linalg operation, was recently demonstrated with a first prototype, albeit without proper type support yet. This work was presented at ODM (slides and recording). Adding the sparse tensor type would enable this prototype to gradually develop into a full end-to-end solution, where tensors are annotated at the source level and progressively lowered into efficient code by MLIR before handing it over to LLVM.
Meta-data
The sparse tensor type should at least carry the following information.
(1) Similar to a regular tensor, the type should define shape, rank, dimension sizes (including dynamic sizes), and element type. The unranked variant is probably not needed.
Examples
tensor<32x128xf64>, tensor<?x?xf32>, tensor<4x?x8xbf16>
(2) A per-dimension annotation of sparse or dense, which already covers the specification of many sparse storage schemes. Blocked variants can be expressed with synthetic dimensions, as sketched below. More advanced storage schemes would become possible by extending the annotations in the future.
Example
for a 3-dim tensor, 2^3 = 8 combinations, from dense/dense/dense to sparse/sparse/sparse
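To sketch the blocked case (a hypothetical illustration, not a committed format): a 32x16 matrix stored in 4x4 blocks could be expressed as a synthetic 8x4x4x4 tensor annotated dense/sparse/dense/dense, where the outer two dimensions enumerate the blocks (with the block-column dimension sparse) and the inner two dimensions store each block densely, yielding a BCSR-like scheme.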
(3) Order of storage on the dimensions. Unlike dense tensors, where a row-major or column-major order is implicitly understood, sparse storage schemes should make the preferred order of access explicit. An affine map would suffice for this.
Example
for a 2-dim tensor,
(i,j)->(i,j) would define row-wise storage, and
(i,j)->(j,i) would define column-wise storage
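Likewise, for a 3-dim tensor (extending the same example hypothetically), a permutation such as
(i,j,k)->(k,i,j)
would place dimension k outermost in storage; any permutation of the dimensions is expressible this way.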
(4) Preferred bit-width for the pointers and indices in the underlying sparse storage scheme. Unlike dense tensor storage, which consists of primary storage only (viz. the numerical values), sparse storage schemes may contain some overhead storage to reconstruct the actual tensor. Although 64-bit pointers and indices could always be used in such overhead storage, further storage savings can be obtained with narrower bit-widths (for pointers, wide enough to address the number of nonzeros at each level; for indices, wide enough to reconstruct the index in each dimension). Note that although a per-dimension approach would be possible here as well (viz. 32-bit in the first dimension, 64-bit in the second), it probably suffices to define one bit-width for all pointers and another for all indices.
Example: 32-bit pointers, 64-bit indices
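To make the overhead storage concrete, consider a hypothetical CSR-like layout (dense first dimension, sparse second) for the 4x4 matrix whose only nonzeros are a[0][0]=1, a[0][3]=2, and a[2][1]=3; the array names below are illustrative only.
values: [1, 2, 3] (primary storage, the nonzeros)
indices: [0, 3, 1] (column index of each nonzero)
pointers: [0, 2, 2, 3, 3] (per-row offsets into indices and values)
Choosing 32-bit instead of 64-bit integers for pointers and indices would halve this overhead storage.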
Implementation
One possibility would be to use a sparse_tensor “is-a” tensor relation, so that any operation on tensors would work for sparse tensors too. However, since all passes would then have to learn how to handle, and possibly propagate, the meta-data, a better choice may be to introduce sparse tensors as a genuinely new type, with only limited reuse of some of the tensor/shape types.
Syntax
A possible syntax would define a rank-2 32x16 matrix with f32 elements, with the dimension of size 16 sparse, column-wise access, and i32 for pointers and index for indices, as follows.
sparse_tensor<32x16xf32, DxS, (i,j)->(j,i), i32, index>
Proper defaults (i.e. identity permutation, index types for overhead storage) would allow shorter forms in most cases.
sparse_tensor<32x16xf32, DxS>
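Under those defaults, the short form above would presumably expand to the full form
sparse_tensor<32x16xf32, DxS, (i,j)->(i,j), index, index>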
Perhaps we could even drop the sparse_ prefix, and let the sparsity be apparent from the presence of an S after the shape. The sparse/dense annotation could even be incorporated into the shape itself, as originally proposed by John Jolly, with dense as the default, to get an even more concise syntax.
tensor<32x16Sxf32>
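To give a feel for the type in actual IR, a hypothetical kernel signature (function and argument names made up for illustration) using the sparse_tensor form could then be written as
func @matvec(%A: sparse_tensor<32x16xf32, DxS>, %x: tensor<16xf32>) -> tensor<32xf32>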
Looking forward to a constructive discussion! If we can converge on a design, implementation will soon follow, which, in turn, will enable replacing some of the ad-hoc glue currently present in the MLIR sparse compiler prototype. Hopefully the new support will be useful to the MLIR community at large.