Removing "kinds" from Attributes and Types

Hi all,

Over the next week the concept of “kinds” will be removed from attributes and types. This means that when constructing an attribute or type, you will no longer provide an “unsigned kind” value, the uniquer will instead rely on the TypeID of the derived Attribute or Type class. This greatly simplifies a lot of the underlying infrastructure surrounding Attribute/Type uniquing, but also removes the need for an external dialects to touch upstream MLIR when defining Attributes/Types. This means that after the refactoring has landed, a dialect author will no longer need to touch anything upstream.

How to avoid being broken

A lot of the non-breaking commits to support this have already landed in upstream, but a few will be landing towards the end of next week that completely remove the old behavior. This week should allow for external users to update their usages of Attributes/Types such that the breakage is contained and easily manageable. To limit the scope of breakages, the following steps should be taken:

  • Remove all MyType::kindof methods

The new casting functionality has already been added and is implicitly available, so there is no need to define these methods anymore.

  • Remove all usages of Type::getKind/Attribute::getKind

As part of the refactoring these methods will be removed. All usages should either be switched to using isa/dyn_cast directly, or use the llvm::TypeSwitch class. Note that if you are implementing the classof method for the base class of your dialect type hierarchy, you can use isa<MyDialect>(getDialect()) to check if the type belongs to MyDialect. Depending on the situation, these can result in more compact code than with a traditional switch. For those unfamiliar with TypeSwitch, this class provides functional switch using llvm style dynamic casting (sort of similarly to StringSwitch). An example is shown below:

// The following switch:
  switch (type.getKind()) {
  case ShapeTypes::Component:
    os << "component";
    return;
  case ShapeTypes::Element:
    os << "element";
    return;
  case ShapeTypes::Shape:
    os << "shape";
    return;
  case ShapeTypes::Size:
    os << "size";
    return;
  case ShapeTypes::ValueShape:
    os << "value_shape";
    return;
  case ShapeTypes::Witness:
    os << "witness";
    return;
  default:
    llvm_unreachable("unexpected 'shape' type kind");
  }

// would be transformed into:

  TypeSwitch<Type>(type)
      .Case<ComponentType>([&](Type) { os << "component"; })
      .Case<ElementType>([&](Type) { os << "element"; })
      .Case<ShapeType>([&](Type) { os << "shape"; })
      .Case<SizeType>([&](Type) { os << "size"; })
      .Case<ValueShapeType>([&](Type) { os << "value_shape"; })
      .Case<WitnessType>([&](Type) { os << "witness"; })
      .Default([](Type) { llvm_unreachable("unexpected 'shape' type kind"); });

If the above steps are taken, the amount of code that actually needs to change when the breaking commit is submitted should be minimal.

Thanks
– River

2 Likes

Awesome, this will be a great improvement, thanks River!

Have you ever looked at what machine code is generated for type switch? Is it basically reasonable?

Yeah, when I first added it I looked at the generated IR for a couple different scenarios. TypeSwitch just wraps around the LLVM dyn_cast mechanism(+ the class based version that MLIR uses), so the compiled code depends on how the class hierarchy implements classof. In the worst case it generates the same code as you would get from a nested set of if/elses(what most people write by hand if not using TypeSwitch). If the classof uses an integer kind, a majority of the cases I looked at compiled down to what you would get if you were using a traditional C++ switch(i.e. either the same as if/else or a jumptables if the kinds were compact enough).

Makes sense, thanks for checking!