Thanks Aart.
This matches the fact that Linalg is easily extensible to more advanced use cases via extra levels of indirection, and it is a key direction to investigate.
I like the fact that we can turn the lights on with incremental steps, without immediately having to go deep into implementing the LLVM data structure abstraction (in particular, interactions with the vector dialect will be key to longer-term success). Instead we can take advantage of the interplay with the external library you started developing for reading Matrix Market files.
That being said, even for the purpose of just turning the lights on, a few things need to happen. Here is a potential list that gets us to an end-to-end executable state:
- extend LinalgStructuredOpInterface to add support for this attribute; we also need a bunch of patterns to fail to apply when this attribute is present: I don’t think we can ever just ignore it (see the first sketch after this list).
- if we start in the tensor world, we have the opportunity to control the whole end-to-end story + allocation ourselves. This is interesting because we can fake it for a while with special linalg.sparse_alloc/sparse_dealloc ops, for which a new pattern can convert to std.call and just offload to external C/C++ to do something useful (see the C entry-point sketch after this list).
- these ops should declare the proper interfaces and will play nicely with Linalg buffer allocation, which we will have to extend a tiny bit.
- we can connect the dots using rank-erased tensors and buffers for now. This will likely require a few minor extensions to some Linalg pieces to allow us to say: “special semantics, keep out” to the rest of the infra.
- underneath the rank-erased abstraction we can implement whatever format we want directly in C++ to start, or just link in some existing sparse library.
- in a first prototype, we’ll likely need special linalg.sparse_load/linalg.sparse_store ops that take a rank-erased buffer + some extra information and call into external C/C++ to get the data and e.g. return some zero padding value if the sparse structure has no entry.
- -convert-linalg-to-loops should be extended to generate special control-flow and these new ops. This should be relatively straightforward for generic. The special control-flow I have in mind is meant to avoid needing new scf.for/scf.while variants just to turn on the lights; it pseudo-resembles:
for %i, %j ...
  %val = linalg.sparse_load(%A, %i, %j, %pad_val)
  %cond = cmp "!=", %val, %pad_val
  if (%cond) {
    ...
    store
  }
- having the loop bounds properly derived in point 7 may require a bit of genuflexion and a special linalg.sparse_dim that again goes to the library and gets the proper value. Pretty quickly, this may need to take outer loop induction variables.
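To make the “fail to apply” point from the first bullet concrete, here is a minimal sketch, assuming the annotation is carried as an attribute named "sparse" on the op; the attribute name and the pattern itself are placeholders, not an existing API, and header paths vary across MLIR versions:

```cpp
#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// An existing dense-only Linalg pattern, extended to refuse to fire when the
// (hypothetical) "sparse" annotation is present: "special semantics, keep out".
struct DenseOnlyPattern : public OpRewritePattern<linalg::GenericOp> {
  using OpRewritePattern<linalg::GenericOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(linalg::GenericOp op,
                                PatternRewriter &rewriter) const override {
    if (op->getAttr("sparse"))
      return failure(); // annotated op: leave it alone
    // ... the existing dense rewrite would go here ...
    return failure();   // placeholder so this sketch stays a no-op
  }
};
```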
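And here is a sketch of the external C entry points that the hypothetical linalg.sparse_alloc / sparse_dealloc / sparse_load / sparse_store / sparse_dim ops could offload to through std.call. Every name and signature below is made up for illustration; the real interface is whatever the external library ends up exposing:

```cpp
#include <cstdint>

// Opaque handle to a sparse tensor that lives entirely on the library side,
// hidden behind the rank-erased abstraction.
extern "C" {
// Create/destroy a sparse tensor, e.g. filled from a Matrix Market file.
void *sparse_alloc(const char *filename);
void sparse_dealloc(void *tensor);

// Extent of dimension `d`, used to derive loop bounds (linalg.sparse_dim).
int64_t sparse_dim(void *tensor, int64_t d);

// Element at (i, j), or `pad` if the sparse structure has no entry there.
double sparse_load(void *tensor, int64_t i, int64_t j, double pad);
void sparse_store(void *tensor, int64_t i, int64_t j, double value);
}
```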
Now, the list above is but one possibility; other paths may also make sense, so please feel free to adapt and farm out the work. Still, I believe that being able to interop with external libraries, hide things from MLIR, and then gradually expose more things to MLIR in a bottom-up fashion has many advantages. It has certainly proven beneficial in the past.
Once the lights are on we can start pulling in the sparse + TACO tricks and a proper MLIR sparse type representation, add new scf.for/scf.while variants as needed to factor out control-flow overhead, and connect to vectors. We can incrementally retire the things that are subsumed without breaking library interop.
I won’t start making a laundry list of the things involved there, which you are likely to post soon anyway.
Re. the attribute representation: I’ll just pretend you wrote:
sparse_map = [
  ["D", "S"], // A
  ["D", "D"], // B
  ["D", "D"]  // C
]
It’s just a matter of using ArrayAttr::getAsValueRange to write our interfaces, rather than iterating over a StringRef and parsing it manually. Then, the day we want something else, we can easily change the type (see the sketch below).
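A minimal sketch of what that could look like, assuming each operand’s annotation is an ArrayAttr of "D"/"S" StringAttrs; the helper name is made up and header paths vary across MLIR versions:

```cpp
#include "mlir/IR/BuiltinAttributes.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"

using namespace mlir;

// Turn one operand's ["D", "S", ...] annotation into per-dimension flags.
// getAsValueRange<StringAttr> hands us StringRefs directly, so no manual
// string parsing is needed.
static llvm::SmallVector<bool, 4> sparseDims(ArrayAttr perOperand) {
  llvm::SmallVector<bool, 4> flags;
  for (llvm::StringRef kind : perOperand.getAsValueRange<StringAttr>())
    flags.push_back(kind == "S");
  return flags;
}
```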
I’m really excited about these developments and looking forward to the transformation steps.
Thanks for pushing this angle!