[RFC] Linalg OpDSL constant list attribute definition?

@stellaraccident @ftynse

When trying to remove the transpose_(a|b) variants from matmul and batch_matmul, I hit a snag on the indexing. I really don’t want to create a whole new DSL just for this case; I think we can still get OpDSL there and remove those ops (by replacing them with identical representations from their parent ops) before we do that.

Problem description

So, here’s the rub: TensorDef creates dims as named params, and IndexAttrDef creates indexes as named params, but neither represents an actual value even when constant (kinda like SSA values), because what the DSL function is doing is building the op definition, so everything inside it needs to be described as a computation.

This is an example of what I’m trying.

def matmul(...
           map_a=IndexAttrDef(S.A0, S.A1, default=[0, 1]),
           ...):
  ...
  # tuple indices must be integers or slices, not SymbolDef
  dims_a = (D.m, D.k)
  l_a = (dims_a[S.A0], dims_a[S.A1])
  ...

I can’t use something like int(S.A0) because the symbol only carries the name of the dimension (i.e. the literal "Symbol(A0)"); I need to create an expression that extracts a tuple index from a symbol definition.
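To make the failure mode concrete, here is a minimal stand-in (FakeSymbolDef is an illustrative name, not the real OpDSL class) showing why a symbol can’t index a tuple: it carries a name, not an integer, and Python tuple indexing requires an object that implements __index__:

```python
# Illustrative stand-in for OpDSL's SymbolDef; the real class lives in
# mlir.dialects.linalg.opdsl.lang and is not reproduced here.
class FakeSymbolDef:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Symbol({self.name})"


A0 = FakeSymbolDef("A0")
dims_a = ("D.m", "D.k")

try:
    dims_a[A0]  # the symbol has a name, not a value
except TypeError as e:
    print(e)  # tuple indices must be integers or slices, not FakeSymbolDef
```

Giving SymbolDef an __index__ method wouldn’t help here either: the attribute’s concrete value is only known at op-construction time, not while the DSL body is being traced.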

Solution idea: Constant list attribute

I’m not well versed in OpDSL (or Python, much), but the actual code isn’t that extensive. This should be something like ScalarDef but as a list, and the generator should be able to interpret it as (generate code for) picking a different SymbolDef out of the tuple and updating the equation accordingly.

I’m trying to avoid changing the code that transforms symbol names into affine maps, so the attribute would just resolve into symbols (in whatever order), which would be identical to the current implementation of the transposed versions today.
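To sketch the intent (purely illustrative; nothing like resolve_indexing exists in OpDSL today): the generator would use the constant attribute value to reorder the dim symbols before the symbols-to-affine-map code ever runs, so that code stays untouched:

```python
# Purely illustrative: reorder dim symbols by a constant index-list
# attribute before building the affine map, leaving the existing
# symbol-to-affine-map machinery unchanged.
def resolve_indexing(dims, index_attr):
    return tuple(dims[i] for i in index_attr)


dims_a = ("D.m", "D.k")
print(resolve_indexing(dims_a, [0, 1]))  # plain A: ('D.m', 'D.k')
print(resolve_indexing(dims_a, [1, 0]))  # transposed A: ('D.k', 'D.m')
```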

Alternative 1: Permute function type

I tried implementing a TransformFn.permute that can be used as:

C[D.m, D.n] += cast(U, permute(A[D.m, D.k], perm_a)) * cast(U, permute(B[D.k, D.n], perm_b))

But that leads to having to implement the function itself, which I don’t know how. Perhaps this would be simpler, but I’m a bit out of my depth here.

Alternative 2: More named ops

We already have matmul_transpose_(a|b) and batch_matmul_transpose_(a|b), and I could add more variations for batch_reduce_matmul_transpose_(a|b), but there are more than two cases here; I don’t want to do this.

Any better ideas?

Alternative 3: write it in ODS and C++, using the generated code as a starting point and simplifying.

I don’t have the mental capacity to put into trying to make opdsl more than it is right now. If someone else does manage it, I’d be happy to review. But even suggesting an approach requires pulling this back into active cache and is more than I can do for the next month or two.


If I implement matmul and friends in ODS/C++, I wouldn’t be able to keep this and the OpDSL representations in parallel. Wouldn’t this fall into the category of replacing without a backup?

Don’t get me wrong, I’m ok with it, just thought this was what you were warning against on the other thread.

Is opdsl providing anything, to anyone, other than being a rather opinionated way to generate ods and backing c++ op implementations?

I am aware (because I wrote it) that there is some theoretical support for it to also be used from Python for on-the-fly IR generation, but I am not aware of any public projects that have relied on that – although I am aware of several that tried and found it not a good fit/went another way.

This is a legitimate question, not a challenge: I’ve been watching for years and think this use case never materialized. But if it did, we should discuss.

As a counterpoint, we’ve been developing linalg named ops of various kinds for years using just ODS and C++, and that has worked out fine. We’ve further used those from Python with torch op wrappers that, if they were ever to be auto-generated, would need a lot more out of the language than what opdsl provides.

I think we should assume that there are no impactful (or possibly any) live uses of the non-ODS-generation features of opdsl, and if there are, folks will need to speak up. There are already plenty of ops not modeled by it, so it seems unlikely that this is a strict requirement.

Underlying all of that is the principle: we need to get the representation and c++ API for these ops right, and if it is prohibitive to do that with opdsl, then it shouldn’t be used.

Agree.

I think this is the decision we should make: remove the _transpose_(a|b) variants or keep generating them. If we do remove them, we should agree on the way of encoding that in C++, then ODS and eventually making OpDSL emit that if desired.

On the python side, we can do a bunch of syntactic fun if desired. E.g.,

def permute(*args, permutation):
  return tuple(args[p] for p in permutation)

# ...

C[D.m, D.n] += cast(U, A[*permute(D.m, D.k, permutation=[1, 0] if transpose_a else [0, 1])]) \
             * cast(U, B[*permute(D.k, D.n, permutation=[1, 0] if transpose_b else [0, 1])])

but this is based on the premise of having multiple named ops.
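As a runnable sanity check of that helper: the body has to index args directly by each permutation entry (tuple(args[p] for p in permutation)); writing args[permutation[i]] for i in permutation instead applies the permutation twice, which for [1, 0] is the identity:

```python
def permute(*args, permutation):
    # Pick each argument by its permutation entry; note that
    # args[permutation[i]] for i in permutation would apply the
    # permutation twice (and [1, 0] twice is the identity).
    return tuple(args[p] for p in permutation)


transpose_a = True
print(permute("D.m", "D.k", permutation=[1, 0] if transpose_a else [0, 1]))
# ('D.k', 'D.m')
```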

This matches my understanding when trying to do that earlier. I guess ODS is the only way, then.

The way I see this working is to re-implement matmul and friends in table-gen and then Indiana Jones replace the current one, and if all goes well, the tests should all pass. Modulo tests being comprehensive, of course.

I’d be much more comfortable if downstream folks could beta-test the branch/commit/PR before we decide to merge.

We could do a permutation of that where we create the new one as matmul_new or something (then rename) but yeah.

We can downstream test it in torch-mlir and iree, np.

I wouldn’t want to have that upstream, tbh. Plus, not sure how many places we’ll need to hack to replace matmul with matmul_new, it could be non-trivial.

Maybe just an upstream branch that we build against instead of main. Rebase frequently; once good, PR & merge.

To be clear, I thought the sequence should be:

  • New branch with the exact same ops and syntax/semantics, but in ODS.
  • Using op alias instead of whole new code for each
  • Code using it should be identical
  • Once good we PR & merge
  • After upstream merge, we work downstream to replace with op + attribute
  • Remove aliases

One caveat with substitution is the unknown amount of code, potentially downstream, that assumes MatmulOp is specifically the untransposed matmul without any further checks. Having a new op with a new name forces a systematic inspection as relevant places just fail to compile.