I’d like to discuss one small proposal - moving ReturnOp from Standard Dialect to BuiltIn Dialect.
The ReturnOp looks quite couples with FuncOp from BuiltIn Dialect. The FuncOp requires terminator at the end and ReturnOp looks quite natural in this case. Now, it introduces a dependency on Standard Dialect. If the ReturnOp is moved to BuiltIn Dialect, some code might get rid of Standard Dialect dependency.
Thanks for the proposal! Though it is a very large -1 from me. The builtin dialect is not something I see having more ops added to it. If anything we are actively trying to move towards a future where there are no operations in it at all (a few kinks to work out here, but these should hopefully be addressed soonish). Adding operations to builtin to reduce the dependency overhead of the standard dialect feels like a band-aid for the real problem, which is the fact that the standard dialect is so grossly oversaturated with operations(often at conflicting levels of abstraction) that it isn’t really that tenable of a dependency for many users. The real solution is to break up standard into more manageable dialects that actually focus on a specific abstraction/purpose/etc. such that there is much greater reusability of the operations currently in there. This is something that has been discussed in a broader sense for a long time, but a few of us have been trying to formulate a real process for this recently. The results of that should be posted on discourse soon, but I’m not sure of an exact time(I’m technically OOO right now).
The std.return operation is declared with trait HasParent<"FuncOp">, what, from my point of view, makes it a companion terminator for builin.func operation. The fact that they lives in different dialects sometimes complicates the things (legality patterns, for example).
Why not to put it to the builtin dialect right near to FuncOp? They are used together in 99.99% cases, IMHO.
This makes sense to me. If no more ops are to be added to the builtin dialect, then FuncOp should come back to the std dialect, which would look more awkward. If the goal is for the builtin dialect to go away, I’m not sure then why FuncOp was moved to builtin in the first place. It is odder that an op’s terminator is in a dialect that is layered outside of that op’s dialect: it would be okay to have an std op’s terminator in the builtin dialect (if not in std itself), but not the other way around!
I still feel the right thing here is for the ReturnOp to go into builtin for the same reason FuncOp did: both FuncOp and ReturnOp can move out together when they move.
FuncOp was never in the std dialect, it has always been in the builtin. The goal isn’t for the builtin dialect to go away, the goal is to only have things in there that are truly core to the IR (for some definition). IMO FuncOp does not satisfy that requirement, so it should be moved out (not necessarily to the decaying corpse of standard).
ReturnOp is not the only terminator that can be used with a FuncOp. The reason why we put FuncOp in the builtin dialect when it became an operation, is because we expected Func as a concept to be generally applicable for different types of users, e.g. currently PDLInterp (but that may change at some point), and SPIRV/GPU for a time, and several downstream. The current situation is that users are more inclined(and encouraged) to define their own FuncOp, so the amount of reuse across different abstractions has changed (lower than before).
The reasons for having FuncOp in builtin don’t make sense anymore, so moving ReturnOp just furthers adds to the problem (while also burdening all of the users that don’t use that op along the way).
My still strongly held position is that the bar for putting anything in builtin is extremely high, and it has to make sense for the IR as a whole. ReturnOp doesn’t fit that bar at all, FuncOp kind of did for a time but doesn’t anymore. Builtin is not like Standard in the way of: “well, what’s one more op? we need to split anyway”, builtin is already lean and shouldn’t add operations that we intend to eventually move out.
+1 to this (moving FuncOp to std) if the return op shouldn’t be added to builtin. I feel it is still better than the current situation.
I would disagree with this part. I’d says users are and should be more inclined and encouraged to reuse existing abstractions in MLIR upstream dialects unless a need arises to deviate from them or to define their own ops. func, return, constant are for example pervasively used in various lowering paths and at various layers of abstraction within those lowering pipelines without a need for anything new/different. (For eg. I often don’t need another FuncOp until I hit the LLVM or the GPU dialect.) It’s the same argument one would apply to other dialect ops like those of scf and memref. There isn’t really a need to simply create clones of ops just because one wants something that belongs to a specific dialect. My stand at this current juncture is still to argue for ops like module, func, return, br, cond_br, and constant to either all exist in builtin or in std.
That split is still in progress… I agree that it makes sense now to have func and return in the same dialect. As for duplicating, or not, the operations, I would consider it as a part of a more general tradeoff in library layering. We did duplicate dim in memref and tensor dialects, because it seems to be a better alternative to having one of the dialects depend on another or to having an additional dialect that only contains dim.
IMO, func, return, br and cond_br can all fit into a “control flow” dialect, either a new one, or SCF with “standard control flow” as backronym.
I’d support this. Having func and return together makes sense.
I am not pro moving func into std, even temporarily to avoid unnecessary churn (the benefit isn’t clear, in particular when compared to the churn)
I don’t think that SCF and CFG stuff make sense lumped together. There are really three different things:
func and return
cfg stuff (br/cond_br)
structured control flow stuff loops.
FWIW, on the circuit side of things we dealt with #1 by making a “hw” (for hardware) dialect that contains hw.module/hw.output and other things that correspond to functions/return. Maybe we should just christen a “sw” dialect with the basics of imperative concepts and put func/return in there. You can then put br/cond_br in a cfg dialect
Why not to leave builtin.func as it is today? We have builtin.module as a generic operation for IR organization, at least as I can see it. The builtin.func, IMHO, also makes sense in the same term (second level of IR organization). Some pipelines might use them, when no extra detailing is needed (in contrast to hw.func or gpu.func). And builtin.return will be just a companion for builtin.func, since it requires some terminator.
This works for me too. The only issue is the burden of defining new dialects with a couple of ops each as I don’t see many more operations that would fit in either “sw” (maybe call/indirect_call should live with functions), or “cfg” (switch?, do we need a landingpad equivalent?).
From what I’ve seen, far more IR instantiations use module than func. Modules sort of make sense in domains where the concept of functions is not well defined. Furthermore, modules can be nested and addressed through names, so there is really little interest in having functions as the only blessed second (but not third, fourth, …!) level of IR organization. IMO, this would also open the door for feature-creeping the builtin dialect further. Should we also move std.call/indirect_call? Add globals? LLVM-style module metadata ops?
Let’s go the other way: why not move module out of builtin? We’ve done enough work to make it less privileged so there may be no real need for it to be builtin anymore.
The only reason is that it is the default parsed operation. We discussed this in another thread, I believe that the only facility we get out of this is the ability to not specify the top-level operation in the textual file: when parsing multiple functions we automatically wrap them into a module. The alternative would be to not do that and require explicit wrapping.
For context, CIRCT doesn’t use any of the func/return/etc stuff or any of the std ops. We do use builtin.module as the top level, but could define a hw.container or something. That would make more more sense to us actually because mlir::ModuleOp and hw::HWModuleOp is annoying.
The HW dialect therefore contains these things (and nothing more, I just checked):
hw.output (analog to return), hw.module (analog to func) + variants for external and procedurally generated modules, and hw.instance (analog of call)
Types: hw.array, hw.unpackedarray, hw.inout, hw.struct, hw.union, and hw.typealias (allowing sugar types like clang). We use the standard integer type.
Operations for aggregates: hw.array_create/concat/slice/get. hw.struct_create/extract/inject, hw.union_create/extract.
Misc Ops: hw.constant and hw.bitcast.
It does not contain any control flow (equivalent to if/while/branch/etc), or logic (equivalent to add/sub/mul) or memory (equiv of alloca/load/store etc) operations. Those are modeled by other dialects.
This level of abstraction feels right to me, and is working well. We have other dialects that compose on top of this and provide their own stuff and multiple levels of abstraction, but standardizing this stuff has been very useful.
We also have some interesting ways of handling metadata and out of bound info, @darthscsi would be the best one to talk about how “bind” works, and we had a great discussion with @jdd about higher level dialects lowering to hw.instance even though they have out of band data they need to generate (TCL in his case, JSON in SiFive’s case) that needs to be correlated to generated Verilog.
I’m not sure if the two of them are interested in putting together slides, but I think the broader MLIR community would be interested in it!
The circt side is definitely hitting a lot of non-local operations from both the FIRRTL front-end and from the verilog backend. There’s a few classes of behaviors we’ve run into and need to support, and likely more we haven’t hit.
Instance paths as first-class entities. (instances being analogous to call operations structurally). Here a path through the instance hierarchy (call graph) is explicitly used as arguments to operations. bind, cross-module references, force statements, and others in verilog operate this way. We haven’t settled on how to represent this cleanly. There are a couple ideas floating around. For the most part, these paths need to be nameable, but the the non-endpoints aren’t constrained, as long as you can look up what the path corresponds to at code emission time.
Out-of-line emission. Here, you logically have some code, so far instances and variable reads and writes, which you can represent at the point of effect (assuming it’s not external), which is then generated in some other part of the output. In software, this would look like being able to add a function call to the body of a different function, without affecting the output of that other function. Yes it’s weird, but fairly important for verification. The operations mentioned in (1) have this kind of form also (usually verilog allows either “all instances of a named thing”, which we support, or “named path specifying a subset of the named thing”, which we don’t). Here, we are attaching symbols to operations, placing the operations at the point of effect (so data-flow is correct) with a flag to not emit them, then placing a referencing operation which points to the symbol. So a system verilog bind statement, which is a remote instantiation, exists as an actual instance op at the point of effect with an flag to not emit it, but also as an op which points to that instance via symbols at the point of emission. This also means we have symbols being defined and accessed in non-standard ways, which is the subject of another post I need to do. (Side effect: instances (call-like), registers (static-like), wires (alloca-like), and others wind up optionally defining symbols inside modules (function-like)).
I think that discussion has led to the parser API being redone to be template-parameterized with the class of the wrapper op to produce. There are some constraints on what it can be: single-region/block, implicit-terminator. Actually, the versions producing ModuleOp are commented as deprecated llvm-project/Parser.h at main · llvm/llvm-project · GitHub, but I don’t think somebody finalized the cleanup. I haven’t thought how to phrase this in LangRef though.
It looks like this should be possible to implement right now.
This totally makes sense to me in CIRCT context.
A hypothetical sw dialect would, at least in the first approximation, have even less ops and no types. I am not saying this is necessarily problematic, just unusual compared to the scope of most other in-tree dialects.
We don’t need the work to be finished to discuss, even in ODMs, although we tend to have more and more formal conference-style presentations. My personal preference is for much more open-ended and incomplete material that leaves more room for discussion. Just pasting parts of the proposal in larger font could do it.
We had to come up with a model for module-level metadata in the LLVM dialect, including cross/self-referencing capabilities, so it would be nice to see if there are alternatives or generalizations.