[RFC] Support scalable dimensions in ShapedTypes

Motivation / The problems with VectorType

Currently, ShapedTypes can have dynamic dimensions or fixed/static dimensions. This leads to an awkward situation for VectorType: it is a ShapedType, but an important part of its shape, namely whether each dimension is scalable, is stored in a separate array of booleans that ShapedType knows nothing about.

This makes VectorType somewhat tricky to work with. Whenever you manipulate a shape, you have to remember to make the same change to the scalable dims. It is easy to accidentally lose your scalable dims, or end up with the wrong dims marked scalable. You also have to keep in mind that the value of a dim has a different meaning depending on the corresponding scalable flag: vectorType.getDim(0) == 1 is only a unit dim if !vectorType.getScalableDims()[0].

Proposal

This RFC proposes to remove the scalableDims array of bools. Instead, scalability will be represented at the ShapedType level by using negative values for the dimension sizes.

|                                       | Current                             | Proposal                            |
|---------------------------------------|-------------------------------------|-------------------------------------|
| kDynamic                              | std::numeric_limits<int64_t>::min() | std::numeric_limits<int64_t>::min() |
| Negative values (other than kDynamic) | Invalid                             | Scalable dimensions                 |
| Positive values (including zero)      | Fixed/static dimensions             | Fixed/static dimensions             |

Doing this while still treating dimensions as int64_t could prove error-prone, so a new class mlir::ShapeDim would be added to represent dimensions. This class just needs to wrap a single int64_t, so there is no change to the sizeof() a ShapedType.

The mlir::ShapeDim class has methods to construct dims, to query whether a dim is scalable, fixed, or dynamic, and to safely get the size (or minimum size) of a dimension. See the implementation below. This is similar to llvm::ElementCount, which is used to represent fixed or scalable vector quantities within LLVM. The main differences are the addition of the dynamic state, and that it is possible to round-trip ShapeDim → int64_t → ShapeDim without losing your scalable dimensions, which helps keep the initial change relatively small.

There is a lot of code that uses int64_t for dimensions, so currently mlir::ShapeDim can be implicitly converted to/from int64_t. It is hoped that code can transition to using ShapeDim and eventually the implicit conversions can be removed. This path is similar to llvm::ElementCount where implicit conversions are allowed unless compiled with STRICT_FIXED_SIZE_VECTORS, at which point implicit conversions become an error.

How does this affect existing ShapedTypes?

  • No change in size (a dim is still just backed by an int64_t)
  • If you don’t need scalable dims, things mostly stay the same, but there are
    some extra assertions

Take some code that is never expected to work with scalable sizes:

// Note: This still works for the time being.
int64_t outputH = resultTy.getDimSize(0);
if (outputH == ShapedType::kDynamic) {
  // ...
} else {
  for (int64_t d = 0; d < outputH; d++) {
    // ...
  }
}

Would become:

ShapeDim outputH = resultTy.getDim(0);
if (outputH.isDynamic()) {
  // ...
} else {
  // Will assert `outputH` has a fixed size.
  for (int64_t d = 0; d < outputH.fixedSize(); d++) {
    // ...
  }
}

Proof-of-concept implementation

Here: [mlir][ProofOfConcept] Support scalable dimensions in ShapedTypes by MacDue · Pull Request #65341 · llvm/llvm-project · GitHub


Conclusion

Thanks for taking a look! We’d appreciate any feedback, or being told we’ve missed something obvious :). Note that this is being done as part of a larger effort to enable scalable vectorization (such as [RFC] Scalable Vectorisation in Linalg). This particular change is an attempt to alleviate some pain points we’ve noticed while doing that work.

mlir::ShapeDim class

struct ShapeDim {
  /// Deprecated. Use ShapeDim::fixed(), ShapeDim::scalable(), or
  /// ShapeDim::kDynamic() instead.
  constexpr ShapeDim(int64_t size) : size(size) {}

  /// Deprecated. Use an explicit conversion instead.
  constexpr operator int64_t() const { return size; }

  /// Construct a scalable dimension.
  constexpr static ShapeDim scalable(int64_t size) {
    assert(size > 0);
    return ShapeDim{-size};
  }

  /// Construct a fixed dimension.
  constexpr static ShapeDim fixed(int64_t size) {
    assert(size >= 0);
    return ShapeDim{size};
  }

  /// Construct a dynamic dimension.
  constexpr static ShapeDim kDynamic() { return ShapeDim{}; }

  /// Returns whether this is a dynamic dimension.
  constexpr bool isDynamic() const { return size == kDynamic(); }

  /// Returns whether this is a scalable dimension.
  constexpr bool isScalable() const { return size < 0 && !isDynamic(); }

  /// Returns whether this is a fixed dimension.
  constexpr bool isFixed() const { return size >= 0; }

  /// Asserts a dimension is fixed and returns its size.
  constexpr int64_t fixedSize() const {
    assert(isFixed());
    return size;
  }

  /// Asserts a dimension is scalable and returns its size.
  constexpr int64_t scalableSize() const {
    assert(isScalable());
    return -size;
  }

  /// Returns the minimum (runtime) size for this dimension.
  constexpr int64_t minSize() const {
    if (isScalable())
      return scalableSize();
    if (isFixed())
      return fixedSize();
    return 0;
  }

private:
  constexpr explicit ShapeDim()
      : ShapeDim(std::numeric_limits<int64_t>::min()) {}

  int64_t size;
};

CC @javiersetoain, @nicolasvasilache, @ThomasRaoux, @MaheshRavishankar, @aartbik, @zhanghb97

It would be useful to have the type system differentiate between scalable dims and static ones, in the same fashion as we do for RankedTensorType and unranked ones (or MemRef).

That would make the APIs more clear and safe in general, while avoiding redundant checks for scalable dimensions everywhere.
A type that does not support scalable dimensions (like Tensor for example) should be able to return only static dimensions.

 // Will assert `outputH` has a fixed size.
  for (int64_t d = 0; d < outputH.fixedSize(); d++) {
    // ...
  }

I would rather assert if fixedSize is called without checking for isDynamic here (like llvm::Error is set up): we should catch API misuses even when the runtime value wouldn't trigger the error.


I’m also not convinced by this part of the plan: without an explicit roadmap with aggressive conversion and deprecation of the implicit conversions, it’s not clear to me that this will pan out.


Conceptually, “scalable sizes” fit better with VectorType. If you move them to ShapedType, it gives the impression that “scalable sizes” is a property that any shaped type can/may have, but VectorType is actually the only such type.

This makes the VectorType somewhat tricky to work with. Whenever manipulating a shape, you have to remember to make the same change to the scalable dims. It is easy to accidentally lose your scalable dims, or end up with the wrong dims marked scalable.

How can this happen? Vector types are created through a VectorType constructor (VectorType::get(...)). ShapedType is just an interface, and there is no ShapedType::get(...). Maybe the real problem is that we use ShapedType in places where VectorType should be used? (E.g., helper functions of vector ops returning a ShapedType when it actually could be a VectorType.)


I agree with the sentiment here. This looks more like API misuse (ShapeType instead of VectorType) than a need to carry that information above vectors.


Conceptually, “scalable sizes” fit better with VectorType. If you move them to ShapedType, it gives the impression that “scalable sizes” is a property that any shaped type can/may have, but VectorType is actually the only such type.

Though not done here, scalable sizes are pretty much a dynamic size with some more constraints. They could still be interpreted by other ShapedTypes; for example, the shape of vector<2x[4]xf32>, applied unchanged to a memref, could be treated as memref<2x?xf32>.


How can this happen? Vector types are created through a VectorType constructor (VectorType::get(...)). ShapedType is just an interface, and there is no ShapedType::get(...). Maybe the real problem is that we use ShapedType in places where VectorType should be used? (E.g., helper functions of vector ops returning a ShapedType when it actually could be a VectorType.)

The issue I’m trying to get at is the size/element count of a dimension is currently split across two places which makes things harder to work with than they otherwise could be.

To take one example, trimming the leading 1 dims of a vector type.
Currently, it requires maintaining two lists and carefully checking scalability:

static VectorType trimLeadingOneDims(VectorType oldType) {
  ArrayRef<int64_t> oldShape = oldType.getShape();
  ArrayRef<int64_t> newShape = oldShape;

  ArrayRef<bool> oldScalableDims = oldType.getScalableDims();
  ArrayRef<bool> newScalableDims = oldScalableDims;

  while (!newShape.empty() && newShape.front() == 1 &&
         !newScalableDims.front()) {
    newShape = newShape.drop_front(1);
    newScalableDims = newScalableDims.drop_front(1);
  }

  // Make sure we have at least 1 dimension per vector type requirements.
  if (newShape.empty()) {
    newShape = oldShape.take_back();
    newScalableDims = oldType.getScalableDims().take_back();
  }
  return VectorType::get(newShape, oldType.getElementType(), newScalableDims);
}

If that information were part of the shape itself, the same transformation would be a lot simpler:

static VectorType trimLeadingOneDims(VectorType oldType) {
  ArrayRef<ShapeDim> oldShape = oldType.getShape();
  ArrayRef<ShapeDim> newShape =
      oldShape.drop_while([](ShapeDim dim) { return dim == 1; });

  // Make sure we have at least 1 dimension per vector type requirements.
  if (newShape.empty())
    newShape = oldShape.take_back();

  return VectorType::get(newShape, oldType.getElementType());
}

Note that the above code is almost identical to how trimLeadingOneDims() was before it was fixed up for scalable vectors.

Thank you so much for working on this! :pray:t2:

The changes proposed here would allow us to remove a great deal of footguns in the context of scalable vectors. This would be really good for the overall health of the project.

However, this change would mean 2 things:

  1. Great improvement from the VectorType point of view (I don’t see any downsides, great!),
  2. Other types deriving from ShapedType would also be allowed to have “scalable” sizes - Is this desirable?

Looks like folks are quite unsure about 2 (quite understandable - it would be a huge shift).

One possible alternative would be to disconnect VectorType from ShapedType. Given that ShapedType defines kDynamic and VectorType does not use it, that would also make sense to me. What do people think?

-Andrzej


For those of us who have not been using VectorType in this way, could you define “scalable dim”?

From context, I take it to mean that it is “dynamic” in the ShapedType sense but with a static maximum size?

If that’s the case, we’ve wanted that for tensor basically forever, and if we could harmonize the type system, that would be nice.

I’m -1 on leaking a “stored as negative” approach to the public API and think this could be represented better.

For TensorType, I expect this would be a new BoundedTensorType or something, and that there would be new ShapedType methods like getDimUpperBound(i) that could vary when isDimDynamic(i) (or something like that).

(Edit: I expect the same reasoning applies to MemRefType but I have not thought through the implications as that type is more complicated)

Not exactly. The size is a fixed runtime multiple of a static size. For instance, a compile-time vector<[4]xi32> can be, at runtime, equivalent to a vector<1x4xi32>, or vector<2x4xi32>, or vector<4x4xi32>, and so on. The hardware implementation of the vector unit is what defines the multiplicity, and ISAs may define its constraints, like a maximum.

From context, I take it to mean that it is “dynamic” in the ShapedType sense but with a static maximum size?

For a ShapedType it’s the opposite, it has a static minimum size, which at runtime will be scaled by a constant (that’s unknown at compile time), typically called vscale.

I’m -1 on leaking a “stored as negative” approach to the public API and think this could be represented better.

Me too, it’s not something I’d like to keep around (or ever use directly). It’s just a means to allow incremental change. I want to switch to using APIs that return a mlir::ShapeDim, with the underlying storage abstracted away.

In my current proof of concept PR, the only direct access to the negative values is by indexing into .getShape()[idx]. The .getDimSize(idx) method has been updated to return the minimum size of a dimension (or kDynamic). But the preferred way to access a dimension is via a new .getDim(idx) method, which returns a mlir::ShapeDim, which you can then query.

Ah, thanks. That sounds pretty specific to the VectorType and may just be part of its API surface.

ShapedType can represent less information than any specific concrete type. There is no rule that says it has to model every detail so long as it represents a subset of the information.


One possible alternative would be to disconnect VectorType from ShapedType. Given that ShapedType defines kDynamic and VectorType does not use it, that would also make sense to me. What do people think?

Makes sense to me, I think it only makes sense for something to implement the ShapedType interface if shapes of that type are representable, and that does not seem to be the case for VectorType:

|                | Supports kDynamic? | All shapes representable as ShapedTypes? | Supports scalable dimensions? |
|----------------|--------------------|------------------------------------------|-------------------------------|
| MemRef         | Yes                | Yes                                      | No                            |
| RankedTensor   | Yes                | Yes                                      | No                            |
| VectorType     | No                 | No (no scalable dims)                    | Yes                           |
| UnrankedMemRef | N/A                | N/A (no shape)                           | N/A                           |
| UnrankedTensor | N/A                | N/A (no shape)                           | N/A                           |

This makes sense to me.

Just a note of caution, ShapedType is used quite a lot (and I’m not sure what would be the consequences of removing it from VectorType). I tried something similar a while ago (splitting ShapedType into UnrankedShapedType and RankedShapedType; [RFC] ShapedType Type Interface Cleanup) and quickly ran into issues because so much code depends on the current API. (That change still has not landed.) It may not be as difficult for VectorType though…


I’d like to argue for having scalable dimensions in memref and tensor, and I’ve had that “stored as negative” idea bouncing around our kernel generator’s todo list internally.

Specifically, the reason you want scalable tensor dimensions is being able to express things like “this dimension is of an unknown size that’s divisible by 4” or “this dimension’s a multiple of 32” which allows you to, for example, determine how best to vectorize loads/stores or to determine if you need to actually emit code to pad tiles out to an internal block size.

Notice that a scalable dimension in a vector is a stronger constraint than “it’s an unknown multiple N”, it means “it’s a hardware-defined multiple of N”.

An important implication of this difference, for instance, is that the scale factor (vscale) is constant, and the same for all vectors throughout the lifetime of the program. Also, the runtime size of the vector can be computed at runtime, always. It’s not a static value, but it’s a fixed, well-defined value. I don’t think that’s what you want for your memrefs.

What you propose would probably fit better as a memref attribute, not unlike alignment, offsets, etc.


@nicolasvasilache and I have been intermittently discussing this for years. This is a useful thing to have, but more as a generic “property” of the memref/tensor type, potentially as part of the layout attribute. Divisibility is not the only piece of information that we may want to encode, we may also want constant bounds (runtime size unknown, but guaranteed to be greater/less than X) and relations between dimensions (runtime size unknown, but guaranteed to be square, aka NxN rather than ?x?; or even Nx(2*N-3) kind of shapes that appear in dilated convolutions).


I agree that a more general constraints attribute could be useful, though that might be an independent discussion from cleaning up the ShapedType wart?

And I don’t want to see this restricted to memref - tensors should be able to carry those constraints as well.

Introducing such dimension constraints, in ShapedType or elsewhere, would indeed justify a separate RFC. However, propagating the idea of scalability and abusing it to represent a subset of what seems to be actually desired (i.e. constraints) sounds like the opposite of cleaning up ShapedType to me.


Please, don’t close the dynamic vector dimension door just yet :slight_smile:. Dynamic vector dimensions have been discussed in the past and we are currently working on a proposal for RISC-V dynamic vectors where vector<?xtype> is one of the representation options :slight_smile:

(Sorry, I haven’t had enough time to dig into all the details of this discussion yet)