[RFC] Don't use strings for the "ugly"/"generic" form of Attributes/Types

We currently have two textual forms for attributes/types: an "ugly" or generic form that wraps all of the contents within a string, !dialect<"type contents">, and a “pretty” form for when there is a leading identifier and the contents take an acceptable form (i.e. basically just checking balanced punctuation), !foo.type or !foo.type<contents>. This RFC is mostly focused on the first form, i.e. the ugly/generic form.

I’d really love to remove the string wrapper and enforce that all attribute/type formats have an acceptable form (basically just balanced punctuation). This would essentially mean that attributes/types that use incompatible formats will need to manually wrap the incompatible sections within a string or change to something else; the framework would no longer do it automatically. For reference, no upstream dialects rely on the string wrapper at all, and supporting it makes the MLIR parser really disgusting and unnecessarily complex. To support the nested string formats, we currently have complex nested lexer/parser/SourceMgr logic, which in the case of diagnostics tries to magically restitch the location to the one in the original source (which often ends up just being wrong). This magic nested parser logic also makes it more difficult/impossible to compose attribute/type parsers with operation parsers. I’ve sent out D118505, "[mlir:Parser] Don't use strings for the "ugly" form of Attribute/Type syntax", to start cleaning this up. This really feels like debt that we’ve just carried forward from the first implementation of attributes/types, and I’d like to get rid of it.
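
For illustration only, the "acceptable form" constraint can be sketched as a small balanced-punctuation check. This is not the MLIR parser's code: the name has_balanced_punctuation is hypothetical, and the sketch assumes that the contents of a string literal are skipped, which is what lets a dialect wrap an incompatible section in a string by hand.

```python
# Hypothetical helper, not MLIR's actual parser logic: checks that the body
# following the dialect/type name uses balanced punctuation.
PAIRS = {")": "(", "]": "[", "}": "{", ">": "<"}
OPENERS = set(PAIRS.values())


def has_balanced_punctuation(body: str) -> bool:
    stack = []
    i = 0
    while i < len(body):
        c = body[i]
        if c == '"':
            # Assumption: string literals are opaque, so skip to the closing quote.
            i += 1
            while i < len(body) and body[i] != '"':
                # Step over an escaped character such as \".
                i += 2 if body[i] == "\\" else 1
            if i >= len(body):
                return False  # unterminated string literal
        elif c in OPENERS:
            stack.append(c)
        elif c in PAIRS:
            # A closer must match the most recently opened punctuation.
            if not stack or stack.pop() != PAIRS[c]:
                return False
        i += 1
    # Everything that was opened must also have been closed.
    return not stack
```

Under these assumptions, a body such as `<contents>` passes, an unquoted body with stray closers is rejected, and the same text passes once the offending section is wrapped in a string.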

Is anyone really strongly attached to the current thing?

– River


I don’t fully understand the change, so apologies if this is a silly question. Would this still allow a generic representation of all attributes that can be parsed even when the dialect is unknown?

I believe those would continue to work and silently become an OpaqueAttribute/Type.

IIUC, this would remove a compatibility feature from the very early days, when we imposed no norms on the printed forms of attributes/types. Arbitrary example: !myd.mytype<">)]} unbalanced punctuation is cool">.

All of the upstream attributes/types were fixed long ago to follow the syntax constraints. I haven’t seen one of these forms in the wild in a long time (outside of bugs when developing).


And I believe this only requires the outermost punctuation to be balanced, so range<[0, 5)> is fine, i.e. a low barrier on matching.


Right now the inner punctuation must be balanced as well (I think), though we could tweak this if necessary (the most important balancing is the outer <>).
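
To make the two rules in this exchange concrete, here is a rough sketch comparing an outer-only `<>` check with a full nesting check on the range<[0, 5)> example. Both function names are hypothetical and neither is taken from the MLIR parser; string literals and other lexical details are deliberately ignored.

```python
# Hypothetical helpers for comparing the two balancing rules.
PAIRS = {")": "(", "]": "[", "}": "{", ">": "<"}


def outer_angles_balanced(body: str) -> bool:
    """Only require that '<' and '>' are balanced; ignore other punctuation."""
    depth = 0
    for c in body:
        if c == "<":
            depth += 1
        elif c == ">":
            depth -= 1
            if depth < 0:
                return False  # a '>' with no matching '<'
    return depth == 0


def fully_nested(body: str) -> bool:
    """Require every opener to be closed by its own matching closer."""
    stack = []
    for c in body:
        if c in PAIRS.values():
            stack.append(c)
        elif c in PAIRS:
            if not stack or stack.pop() != PAIRS[c]:
                return False
    return not stack


half_open = "<[0, 5)>"  # body of range<[0, 5)>
print(outer_angles_balanced(half_open), fully_nested(half_open))
```

With the outer-only rule the half-open interval is accepted, while the full nesting rule rejects it because `[` is closed by `)`.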

+1, I agree that this is esoteric and probably completely unused. It makes sense to push the burden into a dialect that wants to do weird things, rather than making the core more complex.