Hi all,
This is a high-level enquiry about linalg/MLIR feature availability. We are trying to generate BLAS level-3 functions like strmm (triangular matrix multiply).
The few high level questions I have are:
 Has something like this been already tried?
 Do we have support in linalg/MLIR for “triangular” loops (i.e., loops that do not iterate over the whole matrix)? By support I mean code generation + optimizations.
 Do we have support in linalg/MLIR for “triangular” packed storage (i.e., instead of storing the whole matrix, we store only half of it and use an affine map to access the elements)?
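To make the packed-storage question concrete, here is a small Python sketch of the layout I mean (the helper names are mine, not anything existing in MLIR): only the n*(n+1)/2 elements on and below the diagonal are stored, and the affine map offset(i, j) = i*(i+1)/2 + j recovers an element from the flat buffer.

```python
def packed_offset(i: int, j: int) -> int:
    """Row-major packed index for a lower-triangular element (requires j <= i)."""
    assert j <= i, "elements above the diagonal are implicitly zero"
    return i * (i + 1) // 2 + j

def pack_lower(a):
    """Flatten the lower triangle of a square matrix into packed storage."""
    n = len(a)
    return [a[i][j] for i in range(n) for j in range(i + 1)]

# Round-trip check: every lower-triangular element is reachable via the map.
n = 4
a = [[10 * i + j if j <= i else 0 for j in range(n)] for i in range(n)]
packed = pack_lower(a)  # length n*(n+1)/2 == 10
assert len(packed) == n * (n + 1) // 2
assert all(packed[packed_offset(i, j)] == a[i][j]
           for i in range(n) for j in range(i + 1))
```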
Thanks a lot in advance,
Giuseppe
There is nothing concrete yet; at the moment we have quite restricted dense hyperrectangular types and much more general sparse types. Triangular and ragged shapes are part of our wishlist, but no concrete effort has started yet.
Contributions are most welcome !
Hi Nicolas,
I was hoping for this reply. Yes, we would like to contribute in this direction. Could you help me build a high-level view of what needs to be done?
For instance, from the storage point of view, do we need to introduce properties on tensors? Or maybe create a new non-rectangular tensor type? How would this map onto memrefs?
From the computation point of view, should we create a new linalg operation, e.g., linalg.structured (as opposed to linalg.generic)?
Thanks for any direction/help,
Giuseppe
Hi @nicolasvasilache , all,
following up on this, I am in the very early stages and trying to draft a small plan, starting with trmm.
From here it looks like we can reuse many things in linalg: they basically subdivide the triangular matrices into small rectangular blocks and take care of the iteration end points.
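To illustrate what I mean by reusing rectangular pieces, here is a hypothetical Python sketch of that blocking (the names are mine, not MLIR code): tiling a lower-triangular n x n iteration space into B x B blocks leaves full rectangles strictly below the block diagonal, so only the diagonal blocks need triangular loop bounds.

```python
def triangular_blocks(n: int, B: int):
    """Yield (i0, i1, j0, j1, kind) block descriptors covering {(i, j): j <= i}."""
    for bi in range(0, n, B):
        for bj in range(0, bi + B, B):
            i1, j1 = min(bi + B, n), min(bj + B, n)
            # Off-diagonal blocks are plain rectangles; only diagonal blocks
            # need triangular bounds.
            kind = "tri" if bi == bj else "rect"
            yield bi, i1, bj, j1, kind

# Check that the blocks cover the lower triangle exactly once.
n, B = 7, 3
covered = set()
for i0, i1, j0, j1, kind in triangular_blocks(n, B):
    for i in range(i0, i1):
        cols = range(j0, min(j1, i + 1)) if kind == "tri" else range(j0, j1)
        covered.update((i, j) for j in cols)
assert covered == {(i, j) for i in range(n) for j in range(i + 1)}
```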
A few warmup questions:

I am trying to add a linalg.triangular operation (as opposed to linalg.generic). Is this the right way? Or should I add an attribute to linalg.generic to indicate that the operands are triangular matrices?

My first goal would be to have convert-linalg-to-loops emit the right loop nest in the triangular case. I tentatively added a createLoopRangesTriangular method to the LinalgStructuredInterface to hack my way in. Is that OK? Or should I add a new Triangular interface which maybe inherits from LinalgStructuredInterface (or do you have any other ideas)?
Thanks,
Giuseppe
I am out until May 3rd so please expect low to zero bandwidth from me.
The step forward here IMO would be to add support for a new triangular shaped type.
Attributes could be experimented with to get the first pieces connected but shouldn’t land by themselves as they are quite unlikely to connect all pieces properly.
New ops for this specific purpose are a nonstarter IMO but may be useful for quick early prototyping.
In this context, I played with attributes similar to what is available in the sparse dialect with its sparse tensor encoding attribute. For example, the pseudo IR below multiplies a lower-triangular matrix with a general one.
%0 = my_dialect.alloc [10, 10] : !my_dialect.matrix<#matrix.property<["lowerTri"]>, [10, 10], f32>
%1 = my_dialect.fill(%0, %constantX) : !my_dialect.matrix<#matrix.property<["lowerTri"]>, [10, 10], f32>, f32 -> !my_dialect.matrix<#matrix.property<["lowerTri"]>, [10, 10], f32>
%2 = my_dialect.alloc [10, 10] : !my_dialect.matrix<#matrix.property<["general"]>, [10, 10], f32>
%3 = my_dialect.fill(%2, %constantY) : !my_dialect.matrix<#matrix.property<["general"]>, [10, 10], f32>, f32 -> !my_dialect.matrix<#matrix.property<["general"]>, [10, 10], f32>
%4 = my_dialect.mul %1, %3 : !my_dialect.matrix<#matrix.property<["lowerTri"]>, [10, 10], f32>, !my_dialect.matrix<#matrix.property<["general"]>, [10, 10], f32> -> !my_dialect.matrix<#matrix.property<["general"]>, [10, 10], f32>
Now you can lower mul directly to a triangular loop nest, or convert it to a linalg.generic and customize the body: the iteration domain stays rectangular, but the computation in the body is guarded by a conditional. I think this is the easiest way to get started, but using attributes makes it difficult to connect with linalg properly. Having a new triangular shape type would be nicer.