[RFC] Remove Types

Implementation-wise, the only difference between types and attributes are that attributes all have a type. But most attribute types are NoneType. Removing types from attributes can save a lot of memory, but then types and attributes become conceptually and implementationally identical: a bunch of compile-time data. (Attributes that need types can add them as parameters, which can be generically queried with a TypedAttrInterface).

By making types and attributes identical, we can deduplicate and simplify a substantial amount of the MLIR core infrastructure (Type/AttributeSupport, Type/Attribute interfaces, AttrOrTypeDef, Python bindings, parsers and printers, etc.), the end goal being using Type = Attribute.

Conceptually, types become “attributes on SSA values.”

6 Likes

+1 for me.
I feel that a lot of code could be simplified by merging both, especially when trying to build systems on top of MLIR. For instance, in IRDL, most code is duplicated to handle both types and attributes.

However, what would the plan for parsing/printing? I feel that this would be a big breaking change, unless we allow to parse/print attributes both with ! and #?

1 Like

Step 1 would be to remove types from attributes, which itself would deduplicate a lot of code. But types and attributes would remain separate.

When (if) types get merged into attributes, types and attributes can be kept syntactically seperate, but the only issue would be overlapping mnemonics (!foo.bar and #foo.bar). But since mnemonics aren’t an inherent part of types and attributes (they’re just parser/printer artifacts), I don’t think it’s too much of a burden to rename one or the other if a collision occurs, which I anticipate there aren’t many.

I would look at the effect of this on the API and usability of the attributes we have today with types. The memory savings side of things isn’t as compelling given that we could optimize for that today without changing anything user facing.

Initial feelings from me are a very strong -1. What problem is this solving? If we still reason about them being completely separate concepts, what benefit is there to merging them (in contrast to the vast complexities this introduces from an API point-of-view)? I’m also not really sure I understand the claim of “simplify a substantial amount of the MLIR core infrastructure”, I don’t recall there being such a significant amount of infrastructure burden that unifying is appealing.

I feel like there is potentially a lot of context/motivation missing here. Can you elaborate more?

– River

2 Likes

This is a quite interesting proposal. Philosophically, I would rather remove attributes, not types. Types and type systems are well-established concepts in programming languages, attributes are more of a compiler artifact. Attributes are like type information about the operation itself, rather than about the values its produces (we have a tendency to refer to the type of the operation result as a type of the operation, because of LLVM IR in which instruction is-a value so the type is for both). Turning it this way raises an interesting question: most attributes are discardable from operations, if we say types are attributes, does it make them discardable, too? Are we getting a untyped IR then?

This is a significant expressivity extension though! Now, an SSA value has only a type. This sounds like it could have any attribute, how are those interpreted? In operations, the semantics of the operation defines the meaning of attributes, we don’t (yet) have such a semantic hook on types.

Overall, I share River’s concerns. This feels driven by some implementation internals, that may slightly simplify the core (and some “core enough” projects) at an expense of some conceptual complexity for the users, and potentially unintended expressivity extensions. I don’t expect there to be much code deduplication actually, templates and other type-based polymorphism are a thing. I think we should weigh the benefits and the costs extremely carefully before deciding one way or another.

+1 on this part specifically though. Back in the early days, we had a crazy mix of typed and untyped attributes, and I seem to remember a discussion where we concluded that we didn’t want types on attributes at all. This was never implemented and instead we’ve got a solution with most types being null. We didn’t have interfaces, or AttrDef to simplify the definition, so putting types as parameters of attributes that needed them required some amount of boring work. Now we are at a better place to achieve this.

2 Likes

I agree with @River707 and @ftynse.

While it may seen redundant to have types and attributes on the structural side of things, they don’t always have the same semantics. If we wanted to implement the type system using attributes, we’d have to give it some properties that, in the end, it would be “just like a type system”, but harder to implement because we’d be sharing those properties with other attributes.

Trying to create a generic “type system builder” within attributes may be a lot more than we want to do with MLIR. They’re already hard enough to do with bespoke implementations.

3 Likes

I see value in unifying a lot of the infrastructure behind Type and Attribute, include removing the Type that Attribute has: it seems like a wart to me which comes only from using these attribute as constant value: it is somehow convenient to have the same “type” attached to an attribute as the SSA value in this case, but even then I’m not fully convinced. Saying that every attribute must have a type expressible on SSA value does not seem obviously justified to me.

Having said that, considering that Type and Attribute are identical may be a step too far.

The way I see it is that we could consider that Type is a subset of Attribute: that is there are a lot of Attributes which don’t make sense as Type, and “an SSA value has a Type” instead of “an SSA has an attribute” seems like a valuable property to preserve to me.

3 Likes

We have noticed a similar redundancy in developing a formal semantics for MLIR (GitHub - opencompl/lean-mlir: embedding MLIR in LEAN), and this unification would help declaring semantics slightly more uniform; Conceptually, for us, these are both compile time information, and a Type is a special kind of Attribute that provides compile time typing information, as @mehdi_amini indicates.

3 Likes

:slight_smile:

Attribute::getType has ~6 uses between constFoldBinaryOp, vector::BroadCastOp::fold, and the ODS constraints for arith.constant, which could all be easily replaced by something else or an interface. There are a bunch of uses by the iterator for ElementsAttr, but these are required to be ShapedType anyways, so ElementsAttr should really define a ShapedType field instead.

Fundamentally, all attributes having a type feels like an archaic mechanism that’s only really used by a handful of built-in attributes. I think any solution that provide the memory savings (by making the Type a trailing object + a PointerIntPair<ImplType, bool>?) while keeping Attribute::getType is needlessly complex given that removing it would allow unifying a lot of the underlying infra.

Just like we have “intrinsic” operation attributes, types are “intrinsic” value attributes :wink:

(By the way, when are we splitting the attributes API to have “intrinsic” and “discardable” attributes?)

I mean, how are semantics in MLIR defined anyways? That aside, I can imagine one such “value attribute” being location in memory, e.g. %a {p = "BankA"} : i32, %b {p = "BankB"} : i32 = foo.bar() {p = "GPU"}. I will admit that this idea is more food-for-thought.

Yeah, maybe.

Would it make sense to make Type a subclass as attribute? TypeAttr can go away but it would still exist in a sense.

2 Likes

I also think that a full unification would be going too far. It is a good things that mlir is extensible, but we also benefit from having some structure on this to prevent it from truly being a bag of json.

That said, many of the points upthread resonate. I agree we should remove types from attributes: in addition to the memory cost, it is more state that has to be verified and kept in sync. An op like your.constantint needs to verify that the apint width matches the expected width of the ops return type, and that the attributes type is somehow compatible as well. This second thing is just redundant.

I would also love to see more unification of the infra as well. I think the idea above of making type inherit from attribute is the way to go. This move is how we unified away LocationAttr previously and it would make (type isa attr) but not ( attr isa type). This matches our current behavior in the asmparser etc, and eliminates TypeAttr as a thing.

I think it important to retain the mlir syntax distinction between types and attribute as well, this is useful structure for the generic dialects etc.

-Chris

3 Likes

It absolutely is!

My point is that operations have a mechanism to list their intrinsic attributes and specify their semantics. Sure, the specification just English, but it exists. Values have no such place. Even if we initially restrict the intrinsic attributes on values to be type attributes, the logical next step is to relax that and have arbitrary attributes and potentially different kinds of values.

This can also be achieved by

%a, %b = foo.bar() { p = "GPU", __result_args = [{p = "BankA"}, {p = "BankB"}]}

with some syntactic and API sugar. Doing so for block arguments is a bit more involved, especially for the non-entry block, but feasible on a parent operation. It suffices to extend and generalize the mechanism we have for attribute and result attributes of function definitions.

This sounds promising.

1 Like