Rationale for not having tuple type operations in the main dialects?

The rationale is here: MLIR Rationale - MLIR


Any dialect that would benefit from using it is free to do so and should also define the ops that match how it uses the type. Which dialect do you feel should be extended?

I’m personally not sure that, if we had it to do over again, we would have made it a builtin type at all: it was aimed at bridging to non-MLIR representations that used tuples to represent multi-result operations, and was created at a time when dialect-specific types were still pretty limited. Recent history has shown, in the cases I’m familiar with, that defining exactly the type you want in a special-purpose dialect is the way to go.


Yeah, I consider having this as a builtin type with no operations to be an anti-pattern. There’s really not much you can do with a tuple if you define the semantics the way we do. It isn’t explicitly written down (which is a bug), but I think we all know it implicitly: it is an immutable ordered collection with a compile-time-fixed number of heterogeneously-typed elements. You can extract an element at a fixed compile-time index, and you can make one from a fixed arity of appropriately typed elements. If we provide the type, we should provide those ops, because semantically everything a user is allowed to do with the type must be describable in terms of those ops.
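For concreteness, here’s roughly what the type alone gives you through the C++ API, with no IR-level ops to go with it (a minimal sketch; tupleTypeDemo is a made-up helper):

```cpp
#include <cassert>

#include "llvm/ADT/SmallVector.h"
#include "mlir/IR/BuiltinTypes.h"
#include "mlir/IR/MLIRContext.h"

void tupleTypeDemo(mlir::MLIRContext &ctx) {
  mlir::Type i32 = mlir::IntegerType::get(&ctx, 32);
  mlir::Type f64 = mlir::Float64Type::get(&ctx);

  // The arity and element types are fixed when the type is built.
  llvm::SmallVector<mlir::Type> elems = {i32, f64};
  auto tuple = mlir::TupleType::get(&ctx, elems);

  assert(tuple.size() == 2);       // compile-time-fixed number of elements
  assert(tuple.getType(0) == i32); // element type at a fixed index
  assert(tuple.getType(1) == f64);

  // ...but there are no builtin ops to make a tuple value or extract an
  // element from one; every dialect has to supply its own.
}
```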

I recently added the !torch.tuple<T1, T2> type, and the type itself is literally 35 lines of code to implement. Having mlir::TupleType is actually a net negative for me, because even with the appropriate using declaration, I still need to write Torch::TupleType to avoid ambiguity.
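For a sense of scale, a dialect-private tuple type is roughly the following (a hedged sketch against the C++ TypeBase/TypeStorage API, not the actual torch-mlir code; my_dialect and all member names here are made up):

```cpp
#include "mlir/IR/TypeSupport.h"
#include "mlir/IR/Types.h"

namespace my_dialect {
namespace detail {
// Uniquing storage: the key is simply the list of element types.
struct TupleTypeStorage : public mlir::TypeStorage {
  using KeyTy = llvm::ArrayRef<mlir::Type>;

  TupleTypeStorage(llvm::ArrayRef<mlir::Type> elementTypes)
      : elementTypes(elementTypes) {}

  bool operator==(const KeyTy &key) const { return key == elementTypes; }

  static llvm::hash_code hashKey(const KeyTy &key) {
    return llvm::hash_combine_range(key.begin(), key.end());
  }

  static TupleTypeStorage *construct(mlir::TypeStorageAllocator &allocator,
                                     const KeyTy &key) {
    return new (allocator.allocate<TupleTypeStorage>())
        TupleTypeStorage(allocator.copyInto(key));
  }

  llvm::ArrayRef<mlir::Type> elementTypes;
};
} // namespace detail

class TupleType : public mlir::Type::TypeBase<TupleType, mlir::Type,
                                              detail::TupleTypeStorage> {
public:
  using Base::Base;
  // Recent MLIR requires a type name for registration.
  static constexpr llvm::StringLiteral name = "my_dialect.tuple";

  static TupleType get(mlir::MLIRContext *ctx,
                       llvm::ArrayRef<mlir::Type> elementTypes) {
    return Base::get(ctx, elementTypes);
  }
  llvm::ArrayRef<mlir::Type> getElementTypes() const {
    return getImpl()->elementTypes;
  }
};
} // namespace my_dialect

// Plus addTypes<TupleType>() in the dialect's initializer and a few lines
// of parsing/printing in the dialect's parseType/printType hooks.
```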

As per the rationale (which I think I wrote), the major benefit that can be gained by having a TupleType is removing the need to define this type per dialect that needs to use it, i.e. 1 definition vs N definitions. Dialects in most cases shouldn’t use it though, as multiple results are generally superior. Aside from that, the benefit of Tuple is that it is a nice “building block” type like ArrayAttr is for attributes.

That being said, we could just remove it. The main in-tree user I can see is the Vector dialect, but I’m not sure those usages should be kept (they seem to stem from when MLIR was in TensorFlow). The main downside IMO is that we lose the building-block type, but we already have ArrayAttr for that on the attribute side.

– River


I think it is useful to have this, but the C++ namespace issue seems real. What about renaming it to mlir::BuiltinTupleType? Dialects that want to use it could then do namespace mything { using TupleType = mlir::BuiltinTupleType; } or whatever.

Most dialects are in their own namespace, builtin historically being an exception. We could also use the opportunity to make it explicit; this would require quite a few updates (well, a using mlir::builtin::TensorType or two could reduce that).

The standard dialect isn’t namespaced either, and I recently had to name an op TFFuncOp to avoid conflicting with FuncOp (from the std dialect). The same issue likely exists with other ops.
We could uniformly namespace dialects; we should just do it consistently!

Yes, the word “module” in the hardware design world means something very different from “mlir::ModuleOp” too. I wonder what a good first-principles name for “mlir::ModuleOp” really is. It’s more like a “default top level” or something.

-Chris

I think we’ll run into many of these; each world has its own convention. Module being a unit of compilation is pretty standard. I don’t foresee a good, concise name that isn’t overloaded somewhere, so I think some convention would be used anyway.

Yes, I ran into this too for TensorType (!torch.tensor).

Let me challenge this in a friendly way :slight_smile: on a couple of fronts:

  • Module isn’t a standard name for what MLIR is using it for. It means something specific in LLVM (which is where it was brought from, and generalized significantly in MLIR). In ML languages it means something completely different.
  • You’re right that it’s impossible to avoid all overloadings, but that doesn’t mean it should be named something wrong. :slight_smile:
  • I don’t think a “concise” name is required here.

I’m still not sure what the best answer would be, but I think TopLevelOp would be better than ModuleOp: it communicates better, it is clearly a generic name (not bringing connotations from LLVM that don’t apply, etc.), and it is less likely to conflict.

The problem I see is that it isn’t required to be the top level. You could go with DefaultTopLevelOp or BuiltinTopLevelOp if that were a concern.

WDYT?

-Chris


If you were inside the torch namespace then you could just use TensorType, correct?

True … but it is very common :slight_smile: So making it verbose (IsolatedFromAboveSingleRegionedOpWhichHasNoOperandsAndThatIsCommonlyTopLevelButCanBeNested) does have a usability cost. Perhaps that would push folks to work on a TopLevelLikeOp interface instead :man_shrugging:

The issue with naming here is that Module is not constrained to be a top level: it can be nested, even inside a non-Module op. So even “top level” is a convention. What is core, I believe, is that it cannot access the external world (well, unless we change SymbolTable conventions :slight_smile: ): no values passed in, no references to symbols outside it. It is effectively an isolated unit of representation (nothing in, nothing out; having an entry point etc. is user-level modeling). But I like three-letter acronyms, so naming isn’t my thing :slight_smile: (IRU :stuck_out_tongue: )

In various cases where we deal with dialects that bring their own top-level ops, I end up using this kind of BuiltinTopLevelOp terminology to differentiate. So if we’re voting on a rename, +1 on that. In general, I don’t think MLIR should have a ModuleOp. It is just a default top-level op for cases where it is not valuable to bring your own (which is a lot of cases).

+1 to DefaultTopLevelOp as what we really mean here.
That said, we’re not starting from scratch: is renaming ModuleOp right now really worth it? I guess it is purely mechanical, so “low cost”?

No. It is a compiler error because of the ambiguity between mlir::Foo and my_ns::Foo (all MLIR compilers seem to be nested in namespace mlir or using it).
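A minimal illustration of that failure mode in plain C++ (my_ns stands in for any dialect namespace; nothing here is MLIR-specific):

```cpp
namespace mlir { class TupleType {}; }
namespace my_ns { class TupleType {}; }

// The common pattern in .cpp files: pull in both namespaces.
using namespace mlir;
using namespace my_ns;

void f() {
  // TupleType t;       // error: reference to 'TupleType' is ambiguous
  my_ns::TupleType t;   // workaround: qualify the dialect's type
  mlir::TupleType u;    // or the builtin one
  (void)t; (void)u;
}
```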

Well, of course, but that’s not what is being proposed. We can all agree that we don’t want unnecessarily verbose names. In the Swift world, naming and API design is a well-trod topic of discussion and has strong guidance, much of which translates to other large-scale APIs like MLIR core. The general approach is to optimize for “clarity”, not “conciseness”. These two goals both argue against unnecessary verbosity, but differ when making something more concise comes at a cost of clarity. It also has carveouts for “terms of art”, etc.

In this case, ModuleOp isn’t actually pervasive in the MLIR APIs (unlike Type, Operation, etc). It occurs in certain high-level APIs (e.g. the parser), and those APIs are being generalized anyway. My argument is basically that “ModuleOp is the wrong name”, “the right name is a bit longer, but not ridiculous”, and “the longer name seems fine in practice”.

This seems very low cost and easy to phase in non-disruptively. I’m happy to take this on, if folks agree that DefaultTopLevelOp feels right.

-Chris


That doc is great! :slight_smile:

This seems to anchor the name to only one aspect of ModuleOp; in particular, it names it after “how it is used” and captures that only partially. Can we brainstorm a bit more about the intrinsic properties and features a Module brings?
It is an unordered container (in theory, even though I suspect people rely on the order), with a symbol table, isolated from above. Anything else?
It could be builtin::PackageOp, builtin::BundleOp, builtin::GroupingOp, builtin::SymbolTableOp, builtin::ContainerOp.

Other angles?

I’m increasingly of the opinion that we should just define our own module operations in any dialect that has a reason to. It’s something like 10 lines of code, and it has the benefit that you get to call it anything you want and add any semantics. It’s common to want some sort of “version” / “special semantics marker” or similar, which today gets stored in “discardable attributes”, but the information stored there is not actually discardable, so that is walking on thin ice anyway. The same is true for verifiers (it’s very hacky to do the verification in Dialect::verifyRegionArgAttribute keyed off a discardable attribute).
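To make “10 lines” concrete, here is a hypothetical sketch against the C++ Op API (mydialect.module and MyModuleOp are made up; trait names follow current upstream, and a real op would also need registration via addOperations plus a parser/printer, or would more likely be written in ODS):

```cpp
#include "mlir/IR/OpDefinition.h"

// Roughly the same trait set as builtin.module, but under a name and with
// semantics the dialect controls.
class MyModuleOp
    : public mlir::Op<MyModuleOp,
                      mlir::OpTrait::ZeroOperands,
                      mlir::OpTrait::ZeroResults,
                      mlir::OpTrait::OneRegion,
                      mlir::OpTrait::SingleBlock,
                      mlir::OpTrait::NoTerminator,
                      mlir::OpTrait::IsIsolatedFromAbove,
                      mlir::OpTrait::SymbolTable> {
public:
  using Op::Op;
  static llvm::StringRef getOperationName() { return "mydialect.module"; }
  // Required by some MLIR versions for hand-written ops.
  static llvm::ArrayRef<llvm::StringRef> getAttributeNames() { return {}; }

  // Dialect-specific semantics live here instead of in discardable
  // attributes: e.g. a required "version" attribute checked by a verifier.
};
```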

The same can be said for FuncOp too, btw, but it has a much more complicated structure, so it’s not as easy to replicate in your own dialect. ModuleOp has about as much structure to it as ReturnOp, and there’s a reason we recommend people define their own ReturnOp-like thing (scf.yield, etc.) for their own use cases instead of piggy-backing on the std one, even though std.return and scf.yield are structurally identical.


Given how I’ve seen it used, the only real property is that it holds a single block with no terminator. There’s no real contract. It’s just a list of ops. So I would call it SingleBlockOp or something vague like DefaultCompilationUnitOp.

E.g. I’ve seen uses that even have arbitrary “executable code” at the top level (we do that in npcomp to import an “object graph”, for example).

Yeah, I totally agree that the semantics are the important thing to start with. See also this post, which touches on some of the issues with our current direction. I’m curious what others think about this.