De-privileging ModuleOp in translation APIs

I’m going to contend that there are a few sharp edges for this feature around various tools/APIs, and that if we improve those incrementally, a bunch of the issues go away (or the solutions become more obvious). It’s hard to see past the rough edges at present (at least for those of us who didn’t have a hand in designing it and correspondingly have a less complete mental model).

I believe this is (or was) a limitation – at least as of a year or more ago, when I had to write a gross workaround (and at the time River said it was a limitation). I haven’t checked recently. It’d be nice to confirm/fix this – I’ve seen evidence of folks stumbling on it. It might just need a canonical example of the right way to do such a replacement.

I think we can/should do better than this, and the default opt tooling may benefit from some support for custom top-level ops – we have in-tree tests that are awkward to write, or that don’t actually test the paths used in production code, due to weaknesses here. Not the highest priority – but I suspect the MSVC failures on my patch stem from under-tested top-level module code (I haven’t triaged them completely), and it is pretty hard to exercise that correctly right now.


I actually wonder if you didn’t hit a limitation of the pass API: you can’t replace the op that a pass is operating on, and so intrinsically you can’t change the top-level operation, since there can’t be a pass operating above it.
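Roughly, the constraint looks like this (a minimal sketch against the C++ pass API; RewriteTopLevelPass and its contents are made up for illustration):

#include "mlir/IR/BuiltinOps.h"
#include "mlir/Pass/Pass.h"

// Sketch only: a pass anchored on the builtin ModuleOp. It may freely mutate
// the module's body, but it cannot erase or replace the module itself, because
// the pass manager holds on to the op it scheduled the pass on and there is no
// enclosing op for a "parent" pass to run on.
struct RewriteTopLevelPass
    : public mlir::PassWrapper<RewriteTopLevelPass,
                               mlir::OperationPass<mlir::ModuleOp>> {
  void runOnOperation() override {
    mlir::ModuleOp module = getOperation();
    // Fine: create/erase/rewrite ops inside module.getBody().
    // Not fine: erasing `module` or swapping it for a different top-level
    // op -- that invalidates the op this pass is running on.
    (void)module;
  }
};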

Sure, but things being hard doesn’t necessarily give me an action item: some things are intrinsically hard and others are artificial. I’ve been trying to understand what’s artificial here so I can extract action items and derive possible improvements for MLIR, but I’m not there yet in this thread.

I’m a bit concerned about how folks are talking about this whole “the client of the parser API can specify the top-level op to use” API. It seems quite useful to me (as does the “fill in this block with any ops you parse” API) for cases where you want to work with MLIR fragments, and some other use cases.
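Concretely, the two entry points being discussed look roughly like this (a sketch assuming a recent MLIR tree; the local names are made up and the exact overloads may differ):

#include "mlir/IR/BuiltinOps.h"
#include "mlir/IR/MLIRContext.h"
#include "mlir/Parser/Parser.h"

void parseExamples(mlir::MLIRContext &context) {
  context.allowUnregisteredDialects();
  mlir::ParserConfig config(&context);

  // "Client specifies the top-level op": the caller names the container op
  // that the parsed ops end up in (here the builtin module).
  mlir::OwningOpRef<mlir::ModuleOp> module =
      mlir::parseSourceString<mlir::ModuleOp>(
          "\"a.op\"() : () -> ()\n\"b.op\"() : () -> ()", config);

  // "Fill in this block with any ops you parse": the parsed ops are appended
  // to a block the caller already owns, with no implicit wrapper op at all.
  mlir::Block block;
  (void)mlir::parseSourceString("\"a.op\"() : () -> ()", &block, config);
  (void)module;
}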

However, I’m concerned that people will start using this and expecting some IR to always have a specific top-level op. This makes the .mlir file context-sensitive. What does:

a.op() : () -> ()
b.op() : () -> ()

mean? Is this two ops in a mlir::ModuleOp, or is this something else? If it is something else, what does that mean for virtually every tool (mlir-opt/translate/reduce/… etc.) that uses the parser? Are they now context-sensitive, and do they get command-line options to control what sort of op to parse into? This seems like it will turn into a mess that unravels many of the benefits of having a well-defined and self-contained IR.

I’d personally much rather have a simpler thing: outside of the fancy use cases, we always have a consistent top-level operation (e.g. mlir::TopLevelOp), and that op can only appear at the top level (checked by its verifier hook). The asm parser/printer would then special-case it: elide it when printing if it has no attributes, and synthesize it when it isn’t present. I think this will keep a lot of consistency.
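For illustration, the verifier hook could be as simple as this check (TopLevelOp itself is hypothetical; this is just the shape of the check, written as a free function):

#include "mlir/IR/Operation.h"

// Sketch of the check such a verifier hook would perform: the op is valid
// only if it has no parent, i.e. it is the root of the IR.
static mlir::LogicalResult verifyIsTopLevel(mlir::Operation *op) {
  if (op->getParentOp())
    return op->emitOpError("may only be used as the top-level operation");
  return mlir::success();
}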

I think the flip side of that is that we really should have an llvm.module etc.

-Chris

I think that is a reasonable POV. It’s a bit hard to see right now because the existing ModuleOp is doing double duty as both a top-level structural component and the semantic root for a lot of dialects that don’t bring their own (i.e. there should be an llvm.module, as you point out). I don’t think there is a problem with MLIR having a ModuleOp for folks that want something and don’t care – but it should probably be used explicitly, whereas the (new?) TopLevelOp would be implied and non-optional.

Fixing this will not be a small change.

+1 Chris. I think that really gets to the heart of it: what is the top-level production of the grammar of a .mlir file? And I think we want it to be top-level ::= operation* (with some random attribute/type alias stuff thrown in). And the semantics of parsing an MLIR file is that the 0+ operations that are parsed at the top level go into an mlir::TopLevelOp.

And then separate from the parsing grammar issue, yes, trying to get the ecosystem to use custom “module” types for their semantically-charged compilation units is a good general direction too.

@clattner it seems to me that what you’re describing here is rooted in an assumption: that we have to make it valid/allowed to have multiple top-level operations in a .mlir file. However, this is just a choice that we can balance against everything else; there is nothing fundamental here.

We could just reject this as invalid textual IR:

a.op() : () -> ()
b.op() : () -> ()

The printer never produces this anyway; we’re accepting this form only for the sake of simplifying test-case writing (I am unaware of any real use for it otherwise), but is it really worth it?
Instead, we can require the user to write:

my.module {
  a.op() : () -> ()
  b.op() : () -> ()
}

Which makes it totally unambiguous, and it always parses and round-trips the same way. This also matches the IR produced by MLIR.

We can also add a TopLevelOp that is always the root of the IR, but that seems orthogonal: we’d likely want to have a module (or a custom dialect container) inside the TopLevelOp anyway, and I’d be wary of allowing the TopLevelOp to hold anything more than a single top-level module-like/container op.


Yes, I agree with you Mehdi – that’s another good way to solve the problem, and it would be a much more conceptually clean model.

The only downside I see is that effectively every file would have a redundant x.y { ... } wrapper around it, which would force an extra level of (pointless) indentation. However, we can solve that by explicitly nuking the indent at the top level.