Eliding dialect prefix for types

I think this has come up in CIRCT and the LLVM dialect, but I'm not sure we ever found a broadly applicable, ergonomic solution.

In Torch-MLIR I have ops like:

%4:2 = torch.prim.TupleUnpack %3 : !torch.tuple<!torch.list<!torch.int>, !torch.list<!torch.int>> -> !torch.list<!torch.int>, !torch.list<!torch.int>
or
%3 = torch.aten.conv2d %arg0, %arg1, %arg2, %0, %1, %2, %int1 : !torch.vtensor, !torch.vtensor, !torch.vtensor, !torch.list<!torch.int>, !torch.list<!torch.int>, !torch.list<!torch.int>, !torch.int -> !torch.vtensor

And would like it to print as

%4:2 = torch.prim.TupleUnpack %3 : tuple<list<int>, list<int>> -> list<int>, list<int>
%3 = torch.aten.conv2d %arg0, %arg1, %arg2, %0, %1, %2, %int1 : vtensor, vtensor, vtensor, list<int>, list<int>, list<int>, int -> vtensor

That is, omitting the !torch. prefix for the types. We have a closed type system in the torch dialect, so the shorter form would be much more ergonomic. I wonder if we could teach OpAsmOpInterface that all types parsed within a region have a default dialect prefix that can be omitted? Any other thoughts on a solution to this?

One added request: we have types like !torch.vtensor<[5,3],f32>, where f32 is a builtin type used as the element type. If !torch becomes the default prefix, it would also help to have a way for the ValueTensorType parser to indicate that the dtype should be parsed with the builtin dialect as the default.

Thanks for bringing this up again!
I’m currently working on a few changes to how we handle attribute and type syntax. While it isn’t my main motivation, I’d like to be able to “deprivilege” the builtin types, and so, just like I did with operations, we should be able to get closer to what you want.
There are some things that aren’t trivial, alias definitions among others, and it is easy to end up with ambiguous syntax, so we need to tread carefully here…

+1 on having some kind of solution here (but I’ve also arrived at the “tread carefully” guidance). Here’s an example of a default dialect that has a lot of type prefixing litter and ambiguity: https://github.com/google/iree/blob/main/llvm-external-projects/iree-dialects/test/iree_pydm/canonicalize/boxing.mlir

How attached are you to dropping the !? You mention that you want to achieve:

%4:2 = torch.prim.TupleUnpack %3 : tuple<list<int>, ...

but is it enough to keep the !?:

%4:2 = torch.prim.TupleUnpack %3 : !tuple<list<int>,

The reason this matters is that you’re likely to introduce a ton of ambiguities into the parser. It would be much more compositional to keep the !. Even with the !, we may want to introduce a whitespace rule that resolves !identifier<... as a single type but !identifier <... as two separate tokens. Types can occur in a lot of different places in the grammar (including in custom asm strings), so this sort of thing will pop up.

-Chris

+1 here, I don’t think we should drop the ! from types (at least not for anything printed/parsed by the core assembly syntax).

I don’t think that works in general (not all types have a <> after them, e.g. !pdl.type). There are generally two main ambiguities when dropping the leading dialect name:

  • Conflicts with other dialects (could happen today with operation names too!!! but less frequently than attributes/types I would say)
    !foo.tensor<> → !tensor<>

  • Conflicts with aliases
    !pdl.type → !type

Having white space could solve the first, but not the second (though I think we could have a special syntax for aliases). That being said, I’m -1 on having whitespace be a delimiter for things like this, just feels very easy to get confused and get wrong.

– River

The ! sigil is fine to keep as far as I am concerned.

Right, I’m concerned about the other side of this: single-word types that wrongly get treated as having a <...> body attached to them. For example, parsing:

myop !sometype <%0, %1, %2>

The way that parseExtendedSymbol works is it just looks for a Token::less after the exclaim_identifier. I’m suggesting that we require that there be no whitespace between the less and the typename to be parsed that way.

-Chris
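
To make the proposed whitespace rule concrete, here is a small Python sketch. It is a toy model, not MLIR's lexer or the real parseExtendedSymbol: the function name split_type_prefix and the require_adjacent_less flag are made up for illustration, and the bracket matching is deliberately naive.

```python
import re

# Toy model only: this is NOT MLIR's lexer. It mimics the rule proposed in the
# thread: '<' belongs to the type only when it directly follows the identifier.
_IDENT = re.compile(r"![A-Za-z_][\w.$]*")


def split_type_prefix(text, require_adjacent_less=True):
    """Split text starting with '!identifier' into (type_text, rest).

    With require_adjacent_less=False the function skips whitespace before
    looking for '<', which models the behaviour described in the thread.
    """
    m = _IDENT.match(text)
    if m is None:
        raise ValueError("expected '!identifier'")
    end = m.end()
    probe = end
    if not require_adjacent_less:
        # Whitespace-insensitive lookahead: any following '<' is claimed.
        while probe < len(text) and text[probe].isspace():
            probe += 1
    if probe < len(text) and text[probe] == "<":
        # Naive angle-bracket matching; does not handle '->' inside the body.
        depth = 0
        for i in range(probe, len(text)):
            if text[i] == "<":
                depth += 1
            elif text[i] == ">":
                depth -= 1
                if depth == 0:
                    return text[: i + 1], text[i + 1 :]
        raise ValueError("unbalanced '<'")
    # No body: the type is just the identifier.
    return text[:end], text[end:]
```

Under this sketch, `!sometype <%0, %1, %2>` splits into the type `!sometype` plus a separate operand list when adjacency is required, whereas the whitespace-insensitive mode swallows the whole bracketed list as the type body. `!tuple<list<int>, list<int>>` is consumed as a single type in both modes.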

Ooh, that’s a good point. Thanks for elaborating. In that situation I think it would make sense to treat the whitespace as significant.

– River

Answering the original question, it is perfectly feasible with the current parser API, I did it twice for the LLVM dialect.

Case 1: closed type system (historical LLVM dialect state). This is trivial, since the type system is closed, call parseMyDialectType(Parser&) instead of calling Parser::parseType to get nested types. The parser for the outermost type parses the sigil and the dialect name automatically and then dispatches to parseMyDialectType.

Case 2: open type system that omits the dialect prefix for types from the same dialect as the outer type (current LLVM dialect state). When an element type is needed inside your dialect type parser, call the “try parse” variant instead of the plain parse, and on failure fall back to parsing a keyword that corresponds to one of the known dialect types. This lets us have !llvm.ptr<ptr<!custom.type>>. !llvm<!custom.container> interprets the inner “ptr” as !custom.ptr rather than !llvm.ptr if the custom dialect also uses this scheme, which is arguably the desired state.
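
The Case 2 scheme can be sketched roughly as follows. This is a toy Python model of the control flow only, not the MLIR C++ parser API: the KNOWN table, the tokenize and parse_type helpers, and the "custom" dialect's type names are all hypothetical and exist only for illustration.

```python
import re

# Toy model, NOT the MLIR parser API. KNOWN is a hypothetical table of the
# bare keywords each dialect recognises; the entries are illustrative only.
KNOWN = {"llvm": {"ptr", "i32"}, "custom": {"ptr", "type", "container"}}
_TOKEN = re.compile(r"\s*(![A-Za-z_]\w*\.[A-Za-z_]\w*|[A-Za-z_]\w*|[<>,])")


def tokenize(text):
    """Split a type string into '!dialect.name', bare keywords, and '<' '>' ','."""
    text = text.rstrip()
    pos, out = 0, []
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out


def parse_type(tokens, default_dialect=None):
    """Parse one type and return a nested tuple (dialect, name, [params])."""
    tok = tokens.pop(0)
    if tok.startswith("!"):
        # Fully qualified type: the "try parse" step succeeds.
        dialect, name = tok[1:].split(".")
    elif tok in KNOWN.get(default_dialect, ()):
        # Fallback: a bare keyword is resolved against the enclosing type's dialect.
        dialect, name = default_dialect, tok
    else:
        raise ValueError(f"unknown type {tok!r}")
    params = []
    if tokens and tokens[0] == "<":
        tokens.pop(0)
        while True:
            # Nested types inherit the dialect of the type that encloses them.
            params.append(parse_type(tokens, default_dialect=dialect))
            sep = tokens.pop(0)
            if sep == ">":
                break
            if sep != ",":
                raise ValueError(f"expected ',' or '>', got {sep!r}")
    return (dialect, name, params)
```

With these assumptions, `parse_type(tokenize("!llvm.ptr<ptr<!custom.type>>"))` resolves the inner bare `ptr` to the llvm dialect, while in the illustrative input `!llvm.ptr<!custom.container<ptr>>` the bare `ptr` resolves to the custom dialect because its immediate parent is a custom type. Case 1 is the same idea with the fully qualified branch removed, so every nested type is looked up in the one closed dialect.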

It seems that the parser already does this, actually (I noticed it while testing some ambiguities we have that aren’t handled well by the declarative assembly: https://bugs.llvm.org/show_bug.cgi?id=52484 ).

Maybe we shouldn’t allow bare !identifier/#identifier and should instead always require the explicit delimiters, i.e. !identifier<> and #identifier<>?

FYI, I tried following @ftynse’s “Case 1”, which works for me. Unfortunately, TorchDialect::printType/parseType are non-static member functions, so I had to manually define them and call into a free function helper, which is a bit awkward. Otherwise, it worked pretty well.

My Torch-MLIR PR

I’m still eyeing removing even more !torch.'s, but that seems more involved. I want it to be consistent so that we aren’t always guessing whether omitting !torch. is allowed or not, which seems to require a custom torch.func op at least, since I don’t think it is possible to get that sugar with func.func, which we are still using.

Is there an up-to-date summary of the current best recommendations for “I am creating a standalone IR that I want to be as pretty as possible”? IIRC there is a way to omit dialect prefixes for ops, but I don’t know the status for types. @mehdi_amini did you ever reach your goal for the tfg dialect having no dialect prefixes?