Better layering for `InferShapedTypeOpInterface`

The interface is meant to allow for resolving the dimension of the result of an operation in terms of its operands. For example consider,

%0 = <some op that implements InferShapedTypeOpInterface>
%1 = memref/tensor.dim %0, ...

It is beneficial to resolve %1 in terms of operands of the operation that creates %0. This has two benefits

  1. If the operation that creates %0 has no other uses, the operation becomes dead and can be DCE-ed
  2. If you can resolve value of %1 in terms of operands of the first operation, and if those themselves are created by operations that implement the InferShapedTypeOpInterface, you could further resolve those. Using a fixed point iteration, you can them resolve the shapes of the program (assuming every op implements this interface)

Currently the InferShapedTypeOpInterface implements three methods

  1. InferReturnTypeComponents : This method is used to get information of the return values when they are shaped type. This is meant to get the return type during operation creation.
  2. reifyResultTypeShapes : This method is used to resolve dim of result in terms of dim of operands, and is also meant to target unranked types. In particular it returns (by reference) a list of Values of type tensor<? x index> (one for each result of the operation). The rank of the tensor is same as the rank of the result, and the i-th element of the tensor is the shape of the result at i-th dimenion.
  3. reifyResultTypeShapesPerResultDim : This method is used to resolve dim of result in terms of dim of operands but for ranked types. In particular it returns (by reference) a list of SmallVector<Value> (one for each result). The i-th element of the SmallVector is the shape of the result of the i-th dimension. This interface avoids creating (and later canonicalizing away) the operations to create a tensor<? x index> and then extracing the i-th element by just directly forwarding the SSA value.

These seems like it could use some redesign/refactoring. Particularly

  1. The InferReturnTypeComponents of this interface seems to be doing the same as InferTypeComponents. I dont have visiblity into what this method is for, but can this be folded into the InferTypeComponents interface?

  2. In any case the reifyResultTypeShapes and reifyResultTypeShapesPerResultDim are similar but solving slightly different uses. It is currently expected that an operation implements only one of these methods. The first one takes precedence over the second when the dim op is canonicalized using the ResolveShapedTypeResultDimsPass (here)

At this point I am more interested in the second one. Ideally these should be a separate interface since this is what is needed to resolve shapes using fixed point iterations (in the dynamic shape cases). In any case, there needs to be more synergy between the two methods. My proposal is

  1. The reifyResultTypeShapesPerResultDim has a default implementation that uses reifyResultTypeShapes to get the dimensions for each dim by using tensor.extract. The issue here is that this would add a explicit dependence of the interface on tensor dialect which has been a show stopper. Any suggestions here?
  2. The dim canonicalization (linked above) can be flipped to use just the reifyResultTypeShapesPerResultDim interface. With the default implementation above, it would give the same behavior.

I am happy to implement these things when all the stake holders agree. @jpienaar @herhut @nicolasvasilache @stellaraccident (and maybe even @benvanik and @River707 )

These are 3 separate interfaces in my mind.

  • reifyReturnTypeShapesPerResultDim is a dynamic transfer function for reifying IR that computes concrete shapes. By restricting itself to ranked types, it is able to take a very unopinionated representation of shapes (individual SSA index values per dimension) and so is suitable for extremely widespread use, especially across the codegen stack that is naturally already limited to ranked types.

  • reifyReturnTypeShapes implements a dynamic transfer function for computing concrete shapes as well. The use of tensor<?xindex> is basically an implementation detail of the shape dialect though, and limits the applicability to systems that buy into its premises. The use of builtin types to makes it seem like a more “universal” representation of shapes than it really is. This part of the interface should be moved to be specific to the shape dialect and kept separate from the rest. We might want an adaptor to implement it in terms of reifyReturnTypeShapesPerResultDim.

  • inferReturnTypeComponents is a generic static transfer function (i.e. for calculating static types, such as tensor<?x4x?xf32>). It has a signature suitable for use in builders for inferring result types (via the InferTensorType trait). It also has the element type and an opaque “attribute” mixed in, and I think it was meant for a more generic shape+elementtype+layout/etc. inference process, but that doesn’t seem to have materialized. It should be its own interface, and probably stripping it down appropriately to its current application and having it better integrated with InferTypeOpInterface would be nice.

1 Like

It need not, this just what folks have done who implemented this, but it isn’t a requirement. If a more exact rank is known it can be used. And indeed tensor isn’t even required, that is just convenience as it has been used elsewhere and has lowerings. But reifying just to shape.shape is as possible - now that may be a different interface too, but could be useful here in avoiding adding a dependency vs having this conversion as part of lowering.

I don’t see a InferTypeComponents. If you mean InferTypeOpInterface then no, they are different. Return type components may succeed while only returning partial results, while infer type may not. E.g., one can return only shapes or only element types without it failing, while these would not enable producing a type and so would have to fail in type inference.

This is where the above, returning a shape.shape, could be used instead and then there is shape.get_extent and both of these can be lowered away. The lowering from get_extent is pretty simple and one can decide which format one wants to use.

Care to clarify sneakiness here? Index type is meant for indexing and used for bounds from the start, this just adopts that convention. I’m not sure how a tensor of index is opinionated while an array of index is not.

Indeed, as adding attribute was discussed 2 years ago at ODM but only materialized this year :slight_smile: And most shape inference ended up happening on tensors with very little on already buffered format.

The issue is that this direction is less general, the way Mahesh has it set up results in the interface that requires fixed rank being able to use the interface that supports unranked. So the less general interface can use the more general one.

Is intended to describe a compile time static (as you say) shape function which doesn’t depend on the container format. Intention was for operations where both the memref shape and tensor shape are the same but only the container differs (e.g., MHLO and LMHLO). As mentioned it allows for smaller queries than the type interface. And indeed I want to expand it to not require materialized values or types so that it can be used in more general settings.

But yes these need to be revisited. And preliminary discussions around this are happening, but only very preliminary at the moment (esp I want to contrast with what ONNX folks have done but haven’t gotten to it yet unfortunately)

Sorry, the use of “sneaky” there was uncalled for. I edited the reply to remove it.

This phrasing also struck me as not great and I see it has been edited out now.

I think this statement is the crux of the problem – and hints at how we’ve coupled multiple concerns. The lower levels of the stack want/need tightly scoped facilities that are not overly general and open to interpretation. Most of the ops at that level are affine-based transfer functions and exposing that structure directly so it can be operated on is what naturally falls out. We’re mixing high and low level concepts here, which is creating the design friction, imo.

I’d love to see further investment and iteration on the existing shape work find its footing in the frontend world, where ime, more type polymorphism is usually appropriate. I don’t think it’s there yet and we should probably be cautious about mixing all of these levels into one thing.

I believe that any IR construct (“array” or “tensor” or whatever) that represents a dynamically-sized collection of index values that represents a shape is going to be opinionated in some way. Thankfully we can dodge that in the ranked case.

I’ll define “opinionated” as “not everybody agrees that they want shapes represented this way”. And in IREE I know for a fact that we worked really hard to remove the use of tensor<xindex>. And in PyTorch we don’t use tensor<xindex> at all (we have our own list type which is natively used for that). I’m actually not aware of a system that wants tensor<xindex> as a fundamental shape representation, except systems that for whatever reason historically only supported the “tensor” type as the only possible top-level type in their type systems (i.e. couldn’t represent a dedicated “shape” or “list” concept with any type other than a tensor), which includes MLIR core’s “runtime” story which handles this via bufferization, but I suspect if it had a more flexible runtime system would represent it differently.

I don’t think that makes sense from a layering perspective. That would be like implementing libjpeg in terms of a more general library that can do jpeg and png compression – it doesn’t make sense to have libjpeg in the first place in that case because users could just directly use the more general library. The whole point of having libjpeg in the first place is a tightly scoped component that can be assembled into the more general thing, or used in isolation without bringing in the more general thing. That’s why I suggested an adaptor for converting reifyReturnTypeShapesPerResultDim into reifyReturnTypeShapes, rather than the other way around.

The lowering is not trivial in the presence of control flow. This is a problem when that lowering is required for correctness (such as IREE), as we cannot robustly guarantee that all the !shape.shape/tensor<?xindex> disappear. When we are coming directly from a ranked representation already, introducing the !shape.shape/tensor<?xindex> values in the first place just artificially obscures the underlying ranked SSA-index-value-per-dimension representation that we want to ultimately arrive at, and that we could already start with if we directly used reifyReturnTypeShapesPerResultDim.

Ok, well that makes sense (I did mean InferTypeOpInterface). I am happy to leave it aside. I think in terms of layering InferReturnTypeComponents can then be made a separate interface and the reify* methods a separate interface? One is for construction and other is for shape resolution using fixed point iteration (I am fine leaving it as is as well)

I think this maybe unrelated. Using tensor<? x index> or shape.* are equivalent IMO. The current plan I put above of layering the two refiy* methods will not use either if it doesn’t have to. If there is an op that implements the reifyResultTypeShapePerResultDim then that will get higher preference compared to the general reifyResultTypeShape method. As a side discussions, this goes back to what Stella was saying above. Maybe higher up the stack using shape.* is useful, but lower down in codegen stack when you are probably only dealing with ranked shapes, using shape.* ops is not great. It interferes with optimizations. Indeed in IREE once we get into Linalg on tensors, the ops themselves carry all the shape information. There is no need for additional side-channel shape information AFAICS. That makes compilation for dynamic shapes work without any additional infrastructure. You need some shape information at the boundaries, but thats it. You still need the interface to resolve the shape information though (potentially using just reifyResultTypeShapePerResultDim). The layering above does that. You pay only for the generality you need. So if you are not dealing with unranked codegen, you dont pay the cost for supporting unranked codegen.

Not clear if you are agreeing with the layering in the original post or not

+1 . I cant think of a way to do efficient codegen in the unranked case (except for simple elementwise ops cases). For something sufficiently complex, you need ranked types. So support for unranked cases will multi-version at some point into ranked versions + fallback.

Maybe it isnt clear, thats what I am suggesting as the “default implementation”. If an operation overrides this method it will not use any of the shape.* or tensor<? x index> (unless the operation implementation itself does that which would be highly undesirable). Or is what you are suggesting a way to get around the issue of having a dependence on tensor dialect?

I don’t think it makes sense for a call to reifyReturnTypeShapesPerResultDim to ever call into reifyReturnTypeShapes. I view reifyReturnTypeShapesPerResultDim as a low-level primitive and reifyReturnTypeShapes a higher level thing. They are just two fundamentally disjoint interfaces at different abstraction levels. Neither should know about the other intrinsically, though we can have an adaptor that implements the higher-level interface in terms of the lower-level interface. (think of the analogy of reifyReturnTypeShapesPerResultDim==libjpeg and reifyReturnTypeShapes is a general image compression library supporting multiple formats)

Its more of a fallback. I agree it would be imperative that reifyReturnTypeShapesPerResultDim never calls into reifyReturnTypeShapes. You could achieve that by having all ops that codegen is interested in (at least what we know off in IREE, and others) override this method. But if an op does not implement this, and implements reifyReturnTypeShapes instead, then you fall back to that.
(I am not sure how to place your analogy though, it make sense on its own, but I think it is not the same situation here).

It seems really fragile to have a fallback that can break the invariants you are trying to maintain. Also, as Jacques mentioned, reifyReturnTypeShapes can return a !shape.shape type. So the fallback would need to have a dependency on the shape dialect for the shape.get_extent op, which seems undesirable.

It’s a moot point anyway. I don’t think there’s any op that would even hit this “fallback”. So let’s just not have the fallback, and see if a real need comes up and discuss it on its merits at that point in time. I think the key point is that reifyReturnTypeShapes needs to live in the shape dialect since it is intrinsically linked to that dialect’s specific representation of unranked shapes and types.

Ok. I am fine with this. If this is agreed upon, then there isn’t much to do apart from make the dim canonicalization use only reifyReturnTypeShapesPerResultDim instead of both. (also I want a better name for reifyReturnTypeShapesPerResultDim, my hands hurt typing all of that now)

I was mostly objecting to reifyReturnTypeShapesPerResultDim itself falling back to reifyReturnTypeShapes. I think it’s fine if the patterns in ResolveShapedTypeResultDimsPass attempt to use both interfaces (though probably a separation even there is worthwhile at some point IMO). It’s just that code that elects to only use ReifyReturnTypeShapesPerResultDimOpInterface (or whatever we rename it to) should not need to depend on the shape dialect.

Ok, then there is really less things to do. You are suggesting flipping the order in which dim canonicalization goes about this. That makes sense, I will flip it.

But then we are talking about two different interfaces. I am happy to at least start with moving reifyReturnTypeShapesPerDim and reifyReturnTypeShapes into their own interfaces.

So currently I see two options

  1. Split up InferShapedTypeOpInterface into three different interfaces as Sean mentioned in this comment, or
  2. Implement a default implementation of reifyReturnTypeShapesPerDim that calls into reifyReturnTypeShapes.

Either way the current setup of having an op implement only one of the methods of the refiy* is strange and I want to fix it. So tally so far

Option 1 : 1 (Sean)
Option 2 : 0
(Option 1) xor (Option 2) : 1 (me)
Do nothing: 0

The design of reifyReturnTypeShapes carries a fair amount of history and the use of tensor<?xindex> to some degree is the consequence of tensor being available at the time. I would not say that it is any more opinionated than using scalar indices of index type or some other special type one can come up with. In the end, it is only important that one type is consistently used to model shapes. I understand that tensor might be inconvenient if the system uses types as the primary distinction between different kinds of computations. Then again, index is the type designated for index computations.

Regarding what to do with the two interfaces: I agree that reifyReturnTypeShapes is better suitable at a higher level of the stack, where it is convenient to have shapes as single values. At the level of code generation, reifyReturnTypeShapesPerDim comes in handy. So we need both. Splitting the interface into 3 comes at the risk of implementations diverging but I still believe it is the best option. For some operations that are more at the codegen level (like linalg) implementing the ranked interface might be good enough, as linalg is never used in a higher-level modelling. Other dialects, like mhlo might care less about the ranked variant.

If we split them, we could have traits to mix in a default implementation of one interface using the other where this seems useful. For e.g. linalg, implementing the unranked interface requires constructing a tensor anyway, so the end-result would be the same.

For the question which interface the dim rewrite should use: In that particular case, if the queried dimension is a constant, using the ranked interface is the more natural choice. If the dimension is not constant, one would need to use the unranked interface anyway. So it is not all that black-and-white.

For a canonicalization, no extra shape IR should be created, so using the dynamic interface is only warranted if the op already has an operand that represents the shape vector (like a mhlo.dynamic_reshape). For a rewrite pass, one should be able to choose.

Ok, to begin with, any objections to splitting the reifyReturnTypeShapesPerDim into a separate interface called InferRankedShapeTypeInferenceOp ? I would assume that all codegen dialects then need to use only this interface?

Will wait for more opinions for a couple of days before going down this path.

Sounds good. Something like ReifyRankedShapeTypeShapeOpInterface (because it reifies the shape of a ranked shape type) would be more descriptive of what it does IMHO.

But let’s not block this on bike shedding a name. Feel free to pick a different one.

[sorry for delayed response, took some time to get back on this given other in flight requests]

SG and yeah one is for building, two for reify, one for fixed point seems about where we are.

This was more referring to the dependency on tensor, but yes as Sean mentioned for your needs this is orthogonal.

Indeed and I think also goes back to TCP vs TCF discussion.

Agreed, Stephan and I were talking about this yesterday too.

Naming, both seems fine, ReifyRankedShapeTypeShapeOpInterface is more concise and unless we decide that we want a ranked reify with both scalars and tensors, I don’t foresee it causing confusion.

1 Like

Implemented in