[RFC] Inconsistency between dynamic and static attributes (i64 v. index)

Hi,

I’ve run into an inconsistency that is fairly prevalent across the whole MLIR code base, and I’m looking for advice on how to fix it.

In a nutshell, several operations use index for their dynamic operands but i64 for their static attributes, so a single conceptual (mixed) operand ends up with more than one type!

I have a work-in-progress patch to fix that, see NF use index instead of i64 where it makes sense · llvm/llvm-project@07dd1f7 · GitHub (it only fixes the tensor and memref dialects for now), but I have two main problems:

  • I only discovered the places I had missed at runtime, and a blind i64-to-index conversion is not the right approach. So essentially we need to audit our code to decide what is a legitimate i64 and what should be an index, and I don’t think I can do this alone or go with “typeless” attributes (more on that below)
  • I don’t see a good way to stage the changes: it seems we have to fix everything in one go to be able to change even one dialect (because of the dependencies in lowerings and so on.)

How do you think we should approach this inconsistency?

What is the problem?

Consider memref.reinterpret_cast:

def MemRef_ReinterpretCastOp
  : MemRef_OpWithOffsetSizesAndStrides<"reinterpret_cast", [
[...]
  let arguments = (ins Arg<AnyRankedOrUnrankedMemRef, "", []>:$source,
[...]
                       Variadic<Index>:$sizes, // <-- dynamic sizes
[...]
                       I64ArrayAttr:$static_sizes, // <-- static sizes

If you write something like memref.reinterpret_cast ... to ..., [%size0, 8, %size2] and then print the types of getMixedSizes(), you’ll get index, i64, index.
No surprise here, it’s consistent with the spec.

Now, let’s say we can prove that %size0 is a compile time constant of say 2.
What should we use for the type of 2?

Chances are we’ll want to do a RAUW and we’ll go with the type of %size0, i.e., index.

At this point, we may have in the IR several instances of 2 with different types but that represent the same thing (e.g., if another static size was initialized directly from the textual IR). Hence, unless the optimizers unbox the attributes, they would miss that they are actually equal.
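To make this concrete, here is a sketch of the resulting IR (the constants and attribute values are illustrative, not the output of an actual fold):

```mlir
// After proving %size0 == 2 and doing RAUW with a constant of the same
// type, we materialize an index-typed constant:
%c2 = arith.constant 2 : index

// Meanwhile, a size written directly in the textual IR lives in the
// I64ArrayAttr, i.e. as an i64-typed attribute element:
//   static_sizes = [2 : i64, 8 : i64, ...]
// Two "2"s that mean the same thing, but with different types.
```

Unless an optimizer unboxes the attribute and compares raw values, it will not see that `2 : index` and `2 : i64` are equal here.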

The unboxing problem is something we know about in MLIR, nothing new here. What IMO is problematic in this instance is that we have an inconsistency of types within what is conceptually just one operand.

Typeless attributes

For some attributes we actually don’t care about their type, or more precisely the type is fixed by the operation’s semantics, so we could save ourselves the need to have it around. I.e., no need to store [{index, 3}, {index, 2}, {index, 1}] for a static size of [3, 2, 1]; the constants alone are enough.

@mehdi_amini actually started this clean-up by introducing DenseXXArrayAttr for exactly this kind of use case.

We could consider doing this clean-up directly as I believe this is the right long term approach. It’ll probably be more disruptive (or less incremental) though as it’ll likely require more changes in the C++ files.
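As an illustration of what the ODS side of the migration could look like (a sketch only, using reinterpret_cast’s static sizes as the example):

```tablegen
// Before: each element is a full attribute carrying its own type,
// e.g. [3 : i64, 2 : i64, 1 : i64].
I64ArrayAttr:$static_sizes

// After: a flat array of raw 64-bit integers with no per-element type,
// leaving the interpretation to the operation's semantics.
DenseI64ArrayAttr:$static_sizes
```

The C++ accessors change accordingly (from iterating IntegerAttrs to working on a plain integer range), which is where most of the churn would come from.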

What do people think?

Cheers,
-Quentin

CC: @nicolasvasilache, @ftynse


Thanks for raising the issue!

Linking @mehdi_amini’s post here for reference: [RFC] Introduce a new Dense Array attribute.

Very strong +1 to this cleanup in general throughout the codebase.
Having different types for the operand and the attribute is a bug.

I am not 100% clear whether something special needs to happen to make IndexAttr work on non-64b-pointer targets, maybe others have a better sense here.

There have also been some issues with DenseXXXAttr:

  1. Some things don’t work together and create hard to diagnose bugs MLIR-ODS DenseXXXArrayAttr + ArrayMinCount seem like a footgun
  2. DenseXXXArrayAttr lacks features that XXXArrayAttr has, which makes it quite painful (e.g. see the discussion about Confined in ⚙ D130348 [mlir][tensor] Add gather/scatter op definitions to the tensor dialect.). The current suggestion is to use derived attributes, but I have not yet spent enough effort to get one example to work.

The hoops one has to jump through here are a bit worrying; @mehdi_amini could we improve usability?

Lastly @qcolombet , do you see a way to slice out starter tasks from this?
This looks like a good opportunity to also onboard new folks after we take care of the key early cases.


Yes, if we accept fixing the type inconsistency later.

The steps could be:

  • Convert the non-dense attributes to dense form one by one (assuming DenseXXXArrayAttr is the right abstraction; otherwise we’ll have to decide what that abstraction is first.)
  • Switch the i64 type to index type where it makes sense.

The second bullet needs to be done in one go though. (Or more precisely, in one go for each set of dialects that “work” together.)

I remember some of the original discussions on whether to have an index attribute at all, because index has unknown bitwidth while we need to select one to store the attribute. At some point, we just started assuming index is a 64-bit integer without actually handling the problem.

I’m +1 on cleaning this up a bit and using DenseXXXArrayAttr whenever possible. ArrayAttr is a misnomer and should have been TupleAttr given that it accepts attributes with different types.
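For instance (a hypothetical test.op, just to illustrate that ArrayAttr places no constraint on its element types):

```mlir
// An ArrayAttr happily mixes element types, which is why it behaves
// more like a tuple than a homogeneous array:
"test.op"() {sizes = [3 : i64, 2 : index, 1 : i32]} : () -> ()
```

A DenseI64ArrayAttr, by contrast, stores only the raw integers, so heterogeneous element types are impossible by construction.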

From a longer-term perspective, I’d argue for typeless integer attributes. For things like sizes and offsets, the type comes from the semantics of the operation, and the one in the attribute only gets in the way. For all memref-related operations, we likely want index. For LLVM pointer operations, we want i64 or i32 depending on what LLVM’s spec says, but not index. Just wrap an APInt and let the operation interpret it. This is beyond the scope of what is being discussed here, though.

Sounds good, let’s start with adopting DenseXXXArrayAttr then.

Hi All, just to let you know that I am working on adopting the DenseXXXArrayAttr in the tensor dialect. Let me know if you are already working on that or want to handle the transition. Otherwise, I hope to have something working in a few days. Let’s not duplicate work.


Sounds good.
I haven’t started anything yet w.r.t. dense arrays.
Feel free to move forward with the tensor dialect.