How to define a custom integer type in MLIR?

I want to encode DictAttr inside an integer type,
but the builtin integer type does not support it.

Requirements:

  1. Type::isInteger(unsigned width) const should return true for a matching width
  2. Type storage should contain integer width, and an Attribute object

Can anyone give an example?

Defining it in TableGen is preferred.
If that is not possible, defining it in C++ only is also OK.

Thanks

I’m not sure what this means: do you want a type that describes an integer using a width and a DictAttr? You can easily define such a type, but it’s fully disconnected from the builtin IntegerType.

This isn’t possible; look at the implementation:

bool Type::isInteger(unsigned width) const {
  if (auto intTy = llvm::dyn_cast<IntegerType>(*this))
    return intTy.getWidth() == width;
  return false;
}

This only works for the builtin IntegerType.
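To make the consequence concrete, here is a minimal standalone C++ model of that check. It is only a sketch: `ToyKind`, `ToyType`, and `demoIsInteger` are hypothetical names, not MLIR classes, and the kind tag stands in for the `dyn_cast` to the one concrete builtin class.

```cpp
#include <cassert>

// Toy stand-ins only; these are NOT MLIR classes.
enum class ToyKind { BuiltinInteger, CustomInteger };

struct ToyType {
  ToyKind kind;
  unsigned width;

  // Same shape as the check above: succeed only for the one builtin kind.
  bool isInteger(unsigned w) const {
    if (kind == ToyKind::BuiltinInteger)
      return width == w;
    return false;
  }
};

// Small demonstration of the three interesting cases.
inline void demoIsInteger() {
  ToyType builtin{ToyKind::BuiltinInteger, 32};
  ToyType custom{ToyKind::CustomInteger, 32};
  assert(builtin.isInteger(32));   // builtin kind, matching width
  assert(!builtin.isInteger(16));  // builtin kind, different width
  assert(!custom.isInteger(32));   // custom kind never matches, whatever it stores
}
```

In this model, no amount of extra storage on the custom kind (a width, an attribute) changes the answer, because the check never looks past the kind.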

Is it possible to derive from IntegerType ?

I am also looking for the answer to the same question. I thought to ask on this existing thread instead of creating a new one:

  • I created an integer type in my own dialect ‘foo’ (as described here)

  • The question is how to tell the framework that ‘foo::IntegerType’ is really an integer type like ‘mlir::IntegerType’

  • Currently, the attribute parser complains when I try to move an integer constant to a variable of my integer type (i.e. foo::IntegerType) using a constant operation I added in the foo dialect. So, for the following mlir input:

    %c1 = foo.constant 5 : !foo.myint<3>

I get the following error:
error: integer literal not valid for specified type

Can someone please guide on how I can add my own integer-type such that the framework understands it like the built-in integer type?

Thanks!

You need to define your own foo::IntegerAttr to store an integer with your foo::IntegerType.

In addition to what @mehdi_amini said above, the general premise is as follows. If you want everything to treat a type as an IntegerType, the type must be the IntegerType. There are generalizations possible via interfaces, but none implemented so far. Out of curiosity, what kind of integer type do you have that is not covered by built-in types?

Not the one asking, but from a front-end POV I’ve always wanted signed and unsigned index types or removing si* and ui* (forcing users to use actual metadata to propagate signedness information).

Yeah, this might be another case for the “we should have a ScalarTypeInterface” folder

This part can be enforced by a module-level verifier.

What methods would it have? The question is what we are generalizing over…

We’re generalizing over IntegerType and FloatType .

The signature I’ve always had running around, though I’ve seen equivalent ones, is

ScalarTypeInterface {
  /// If this type has a consistent static bitwidth, return it. This allows the `.getScalarBitwidth()` method to work.
  std::optional<int64_t> getStaticBitwidth();
  /// If the bitwidth of this type is data layout dependent, look up that width 
  /// in the data layout and return it if it is found. Otherwise, fall back to
  /// getStaticBitwidth().
  std::optional<int64_t> getBitwidth(DataLayout& dl) {
    return getStaticBitwidth();
  }
}

This would include things like pointers, as well as any newtypes of integer (ex. “I’ve got a weird float, and I don’t want to get it through APFloat - I just need to be able to pass it around to my custom intrinsics and load/store it”).
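As a rough illustration of how the two methods would interact, here is a standalone C++ sketch that models the interface as an abstract class. This is not an existing MLIR interface: `ToyDataLayout`, `ToyWeirdFloat8`, `ToyPointer`, and the `"ptr"` lookup key are all hypothetical stand-ins chosen for the example.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a data layout: maps a type name to a bitwidth.
struct ToyDataLayout {
  std::map<std::string, std::int64_t> widths;
};

// The proposed interface, modelled as a plain abstract class.
struct ScalarTypeInterface {
  virtual ~ScalarTypeInterface() = default;

  // Width that is known without any data layout, if there is one.
  virtual std::optional<std::int64_t> getStaticBitwidth() const = 0;

  // Default: ignore the data layout and fall back to the static width.
  virtual std::optional<std::int64_t> getBitwidth(const ToyDataLayout &dl) const {
    (void)dl;
    return getStaticBitwidth();
  }
};

// An opaque 8-bit scalar: the width is fixed, nothing else is known.
struct ToyWeirdFloat8 : ScalarTypeInterface {
  std::optional<std::int64_t> getStaticBitwidth() const override { return 8; }
};

// A pointer-like type: no static width, so it consults the data layout.
struct ToyPointer : ScalarTypeInterface {
  std::optional<std::int64_t> getStaticBitwidth() const override {
    return std::nullopt;
  }
  std::optional<std::int64_t> getBitwidth(const ToyDataLayout &dl) const override {
    auto it = dl.widths.find("ptr");
    if (it != dl.widths.end())
      return it->second;
    return getStaticBitwidth();  // not found: fall back as described above
  }
};
```

With a layout that maps `"ptr"` to 64, `ToyPointer` reports 64 bits through `getBitwidth` while still reporting no static width; `ToyWeirdFloat8` reports 8 either way.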

It’s not clear to me why you can’t just cast/bitcast to i64 before using the load/store if these are not working on your type.

In general this kind of generalization has pretty limited uses, because while interfaces are useful to introspect information about an entity, the transformations are limited (how do you fold a binary arithmetic operation over a couple of custom integers?).

From where I’m standing, there’s often a desire for stronger type information in IRs and the ability to conjure up a newtype/wrapper struct/… around what might be a bag of bits post-lowering (or maybe it isn’t - maybe you’ve got a target that knows about !my_machine.experimental_8_bit_thing ).

And while you can’t really do arithmetic on these things … that’s partly also the point?

(To give an example, def char : TypeWrapper<$dialect, "char", I8> (or, perhaps I32) is a very sensible thin wrapper around a byte that represents a character, not an integer, and so shouldn’t be treated as an integer semantically. And that sort of type should work with memref.load, arith.bitcast, etc. … so that all the standard tooling knows what’s going on even if it doesn’t know what a char is, exactly.)
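For readers who think in C++ terms, the newtype idea looks roughly like the sketch below. It is an analogy only, not MLIR code: `ToyChar8`, `bitcastToByte`, and `bitcastFromByte` are hypothetical names. The wrapper has a known one-byte storage, deliberately exposes no arithmetic, and can only reach integer land through an explicit bit-preserving conversion.

```cpp
#include <cstdint>
#include <cstring>

// A thin wrapper around one byte. It defines no arithmetic operators,
// so it cannot be added or multiplied like an integer by accident.
struct ToyChar8 {
  std::uint8_t bits;
};

// Generic tooling only needs to know the storage size.
static_assert(sizeof(ToyChar8) == sizeof(std::uint8_t),
              "wrapper must have the same storage as its underlying byte");

// Explicit, bit-preserving conversion out of the wrapper.
inline std::uint8_t bitcastToByte(ToyChar8 c) {
  std::uint8_t out;
  std::memcpy(&out, &c, sizeof(out));
  return out;
}

// Explicit, bit-preserving conversion into the wrapper.
inline ToyChar8 bitcastFromByte(std::uint8_t b) {
  ToyChar8 c;
  std::memcpy(&c, &b, sizeof(c));
  return c;
}
```

Load/store style code can move a `ToyChar8` around knowing only its size, while anything that wants integer semantics has to opt in via the explicit conversion.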

Bitcasts can also make program transformations more awkward - perhaps you want different handling for a tensor<… x i8> and a tensor<… x [my byte-length type]> at the function boundary.

I meant more like, if we have signed and unsigned integer types in MLIR then we should likely have signed and unsigned index types. With the alternative being only having signless versions of ints and index.

I personally would favor removal of si* and ui*, as most upstream operations only work on signless.

si*/ui* are here for convenience for downstream which may use “frontend-like semantics”, does it really hurt to keep them here?

does it really hurt to keep them here?

What I’m arguing for is consistency. We can keep those types, but then I think we should have sindex and uindex.

Also, two quick data points: both FIR and CIR have their own int types with signedness information:
llvm-project/flang/include/flang/Optimizer/Dialect/FIRTypes.td at main · llvm/llvm-project · GitHub
clangir/clang/include/clang/CIR/Dialect/IR/CIRTypes.td at main · llvm/clangir · GitHub

Maybe @bcardosolopes and @jeanPerier can comment on the decisions by those front-ends to have those types.

I don’t quite see why, considering the history and the reason these types exist: superficial consistency is not valuable here without clear use cases.

That’s why I said I favor removal.

However, upon further checking they are being used by SPIR-V. I’d argue that SPIR-V should have its own int type, as they are the main user. But I won’t, as I’m not going to spend time on that.

But you haven’t provided any argument in this direction: I asked “does it really hurt to keep them here?” and your answer was “What I’m arguing for is consistency”.
Now that I’m saying this point is artificial and there is no “consistency” to have here, you’re back to removal without answering the question: why?

Now that I’m saying this point is artificial and there is no “consistency” to have here, you’re back to removal without answering the question: why?

True.

To me, the consistency argument applies to the front-end argument.

si*/ui* are here for convenience for downstream which may use “frontend-like semantics”

If si* and ui* are there for downstream front-ends, then sindex and uindex should be there under the same logic, as they would be beneficial for downstreams.

Now, I recognize that would be adding dead churn to the codebase and is not something we want, which is why I favor removal.

To explicitly answer:

does it really hurt to keep them here?

No. But I think we shouldn’t have something we don’t plan to support ops on. Maybe in a previous era, when extending the type system was more difficult, the decision made sense; now it feels mostly historical.

That is speculation on your side: we implemented it and deployed it for specific frontends where the need for signed/unsigned index hasn’t shown up. If someone demonstrates a strong need for these, that would be a different story, but in the meantime it just seems “out-of-thin-air” to me here.

Why? We decided differently when these were added, and you’re again not really making a strong case for this position.