Like atan, we would like to add atan2 to the standard dialect. Being able to express this in std would allow us to lower from lmhlo and materialize our approximation in the std dialect. We do not plan to add a standard lowering for atan2, but rather want to allow everyone to apply their own approximating transformation or to lower to intrinsics where available.
We will likely want to add more trigonometric operations in the near future. The philosophy would be the same for all of these.
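For concreteness, a minimal sketch of what this could look like in IR, assuming the new op follows the same custom syntax as the existing atan in std (the exact spelling is up to the actual patch):

```mlir
// Hypothetical spelling; mirrors the existing unary atan op in std.
%theta = atan2 %y, %x : f32
```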
I'm quite concerned about the op explosion in the standard dialect: this is already causing us issues for mobile deployment of MLIR, where we want only a few ops from standard but end up bringing in ~5MB for it.
Is there a broad category of "maths" / "arithmetic" ops which could be split into their own dialect?
@clattner, I am not sure if I understand the question correctly.
An approximation for atan is, e.g., here, and on that basis atan2 can be derived like here. These are just approximations, though, and there are surely many different ways of realising trigonometric functions.
The aim here is not to give a standard definition or lowering, but rather a representation in the standard dialect that can be transformed as needed. If desired, we could add a set of default approximations in an std-to-std pass.
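As a rough illustration of what such a default expansion could do, here is a hedged sketch that derives atan2 from atan with select-based quadrant correction. The function name is made up, the x == 0 edge cases are omitted, and a real pass would implement this as a rewrite pattern rather than hand-written IR:

```mlir
// Sketch: atan2(y, x) from atan(y/x), valid for x != 0.
func @atan2_approx(%y: f32, %x: f32) -> f32 {
  %zero = constant 0.0 : f32
  %pi = constant 3.141592653589793 : f32
  %minus_pi = constant -3.141592653589793 : f32
  // atan(y/x) is the correct result only in the right half-plane (x > 0).
  %ratio = divf %y, %x : f32
  %base = atan %ratio : f32
  // For x < 0, shift the result by +pi (y >= 0) or -pi (y < 0).
  %x_neg = cmpf "olt", %x, %zero : f32
  %y_neg = cmpf "olt", %y, %zero : f32
  %offset = select %y_neg, %minus_pi, %pi : f32
  %shifted = addf %base, %offset : f32
  %result = select %x_neg, %shifted, %base : f32
  return %result : f32
}
```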
Is it possible to split the Standard dialect ops into different libraries while still having them in the standard dialect? E.g. libStdMathOps, libStdCoreOps, libStdMemOps, …, and similarly the td/header files as well. This way you could always elide the dialect prefix for these ops, and one also gets the reduction in build times and library size. I understand no other dialect splits it this way, but the std dialect also has the most ops among those in tree.
I thought those that aren't used don't get linked in. Are the MLIR libraries linked dynamically and thus copied over?
I don't understand either how this would work, or what the advantage is over splitting it into different dialects.
The unit of granularity in MLIR is very centered around dialects right now. You get all of a dialect as a unit: its ops, the canonicalization patterns, the interfaces, the printers/parsers, etc.
But you can depend on a dialect's Ops without depending on its Analyses/Transforms/Passes. I didn't understand what you meant by "You get all of a dialect". E.g., you can depend on MLIRAffineOps without depending on MLIRAffineTransforms, likewise LmhloDialect vs. LmhloPasses. Similarly, users may want to register the dialect and be aware of the ops without needing anything from the transform passes.
I'm not sure either whether this could work, but I see several advantages. You can elide the dialect prefix, FWIW, and you avoid getting into cyclic dependences while implementing canonicalizations/rewrites (all of these ops work on the same type system / standard types and thus might use each other), which in turn avoids being forced to move things from *Ops into *Analyses/*Transforms. E.g.: what if you needed add/mul for some canonicalization of an AllocOp, or for another op to compute some sizes? Would you move add/mul into the math dialect? That would create a cyclic dependence, because the math dialect would have to depend on what's left in std (or whatever one refactored it into). If you keep add/mul where they are, then you'd have logical partitioning issues.
I don't consider MLIRAffineTransforms to be "part of the Dialect".
I'm referring to the Dialect as in the C++ class and all of its dependencies: this is already too big.
+1 for splitting the std dialect into more cohesive units.
Until then, std would still be the only reasonable place to add trigonometric ops. These ops are super simple, so I can't imagine they impact the binary size noticeably.
Longer term, we could also think about supporting dialect extensions, so that we could still have a single dialect (like standard, gpu, SPIRV) but with sets of operations that can be optionally enabled. This would enable sharing a dialect prefix.
For now, +1 to adding these to standard, as the ops are fairly small and the lowering of them (which I suspect is the majority of size added) is optionally linked (as they lower to approximations or library functions).
+1 to splitting the std dialect. I don't think it's any more special than other in-tree dialects. Math and memory ops are good candidates to be separate dialects, potentially with multiple levels of math, but it's hard to draw the line. addi sounds like a really simple "core" math operation and atan2 sounds much more specific, but not as much as std::cyl_bessel_jf from C++. Is shift_right_unsigned "core math" enough to remain in "std"?
At the same time, discussing the splitting here partially derails the discussion about adding atan2. If there are other trigonometric ops in std, I don't see a reason not to add this one while we keep discussing the potential splitting.
Personally, I am not really fond of the prefix-elision special case, especially if it is used as a motivation to keep otherwise disparate ops together. I recall @clattner's argument that nobody wants to read "std.load" instead of "load" when the dialect splitting was discussed more than a year ago. Today, with "affine.load", "llvm.load" and likely a handful of other out-of-tree dialects with custom "loads", I'd actually prefer to see an explicit std. prefix. This sounds like a matter of the dialect mix one is most often exposed to. If the majority of ops are in std, skipping the prefix sounds like an improvement. If there is a deep mix of linalg, scf, affine, std and gpu, not having a prefix on some operations actually breaks the reading flow. If textual IR verbosity is an issue, we can find a mechanism to address that in pretty-printing for all dialects.
-side note-
Maybe a solution is to have an "int" dialect and a "float" dialect (neither is a reserved name), so we get reasonably readable int.add and float.add, and similar multi-level naming schemes like int.unsigned.div and int.unsigned.shift.right. Memory operations, which are actually memref operations, could just as well use a mem prefix if we can't make them use memref. What remains in std after these are predominantly control-flow-related operations: atomics, branches, calls, DMA, and utilities like constant. Constants and calls sound "core enough" to me to remain in "std", into which we could also pull ModuleOp and FuncOp, which are currently placed in a half-hidden "builtin" dialect.
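Purely to illustrate the naming scheme (none of these dialects exist today; every spelling below is hypothetical):

```mlir
%sum  = int.add %a, %b : i32
%quot = int.unsigned.div %a, %b : i32
%prod = float.add %x, %y : f32
%v    = mem.load %buf[%i] : memref<16xf32>
```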
-end of side note-
I have strong concerns in general with the trend of adding things and making it someone else's problem to clean up later. In particular, this is a long-standing problem, and usually preventing it from getting worse is the only forcing function that makes this kind of cleanup happen.
So IMO: the way to balance not blocking forward progress against not accumulating technical debt is a strong commitment from the folks who need to take on the tech debt to enable progress to also drive the long-term cleanup.
FWIW, I'm not necessarily a big fan of splitting out the libraries. I see some advantages but also some disadvantages: it makes most pieces of IR more verbose, and it increases the dependencies that need to be managed, for which there is already a large amount of not-quite-boilerplate necessary to get things to work (between the build system(s), dialect registration, dependent dialects, namespaces, etc.).
@mehdi_amini If a truly minimal set of operations is necessary, perhaps you need a specialized deployment dialect that contains only those operations that you want for deployment?
I am a big -1 on such splitting based on types like int / float. We already have the type printed right on the op (both in the generic and custom forms), and moreover as a suffix in the op name as well. If it's ... = addi ... : i32, and it's not an int, what else would it be?! The int./float. prefix only adds redundancy here. I think a more serious downside is that you would be separating ops that should logically be together (due to their similar imperative actions) into separate dialects.
I would be fascinated to see the numbers about this (such as binary size before/after your patch). I don't have any intuition for how much binary size an op takes!
I did a pass through std, and here is my rough breakdown, with some vague recommendations off the top of my head (please critique!). I think if we can get consensus on the "math" dialect at least, then there would be no objections to adding atan2 there?
I think the more important question, if we split, is the axis: what are the taxonomy and the rules, rather than the specific instances? E.g., log2 is no more "math" than divf, so if we have a math category, then excluding some math ops but not others seems odd (now trig, algebraic, transcendental, or some such categories would be different). Why have some math in std while other math is in math? Why not have math.int and math.float? (Not saying we should; that seems bad IMHO, but it is a taxonomy which is easy to follow.)
I think the proposal of no standard dialect at all makes sense, as "standard" creates an expectation (someone also asked "is the builtin dialect more standard than standard?", which I think highlights the confusion). That doesn't mean it has to be finely split (boilerplate is definitely a valid concern), but dialects should have defining criteria for what is and isn't in them, as much as practically possible. Now, I can't even categorize my own emails to this extent (labels for the win, but most don't end up under one label), but we could get something good enough.
Why ambiguous? If you have addi, then without this you can't increment by 1. So what would we save if we moved it elsewhere? (It seems more "std" than divf.) Or are you proposing constanti, constantf, constantt, and thus splitting on the type?
So, the shape dialect, as these are related to shape queries.
I think this also shows the problem: what are we trying to achieve with the breakdown of std? More logically consistent components? (What if we could break std into "sub" dialects but then be able to load the std dialect and get all of them, to avoid boilerplate?) Removal of a seemingly privileged dialect? (Let's rename it to Stuff.) Or a reduction in size? (In which case, are we attempting to create some dependency diagram of use and make divisions based on frequent co-occurrence? Or just generally expose a way to load parts of a dialect?)
I think with a goal set and a rationale for the why, the what could become easier.