[Standard] Add `atan2` to the standard dialect

I agree with the above. This is in line with my comment that we should keep discussing the splitting, as in now, not later. I understand that we got into this situation by postponing the required discussion for too long, but the situation where we ask the person proposing a relatively small change to lead a (relatively) contentious redesign of a heavily-relied-upon piece of the project is Kafkaesque. Luckily, in this case we seem to have enough interested parties to reach a good compromise. But in general, it could be helpful to keep track of some debt metric and periodically reconsider debt-heavy pieces, so that we pay the debt back intentionally rather than expect repayment as a side effect of a seemingly random change.

The discussion on the split seems active enough now (btw, can we split part of the posts into a new thread?), and maybe we can share the work somehow?

Vectors, tensors, or other containers thereof (via type interfaces)?

Exactly the same amount as the i and f suffixes. It does add some verbosity though.

I think any separation can be argued for or against in general. I have a target with only integer operations supported, for example, and it does not make sense at all to have floating-point operations in the mix.

I think I like this separation provided each dialect has guidelines on what should be included. For arithmetic, it is probably reasonable to say we have enough in std and anything new should be discussed, but for separate dialects we would need the op inclusion principles we usually require for new dialects. For example, should math include non-floating-point operations (if not, maybe call it mathf)? Should it include all of the C and C++ "special math" functions?

Naming nit: we cannot use memref and complex as dialect namespaces because they are keywords.
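To illustrate the clash (a hedged sketch; the function name is made up, and the exact parse behavior depends on the MLIR version), `memref` and `complex` already begin builtin type literals in the textual IR:

```mlir
// `memref` and `complex` appear as type keywords, e.g. in a signature:
func @example(%m: memref<4xf32>, %z: complex<f32>) {
  // A dialect namespace spelled `memref.` would be ambiguous with the
  // start of a `memref<...>` type wherever both ops and types can appear.
  return
}
```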

Operations that intentionally work across types, such as select and constant, are probably the best candidates for being in "std".
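For reference, a sketch of what that polymorphism looks like in the (pre-split) standard dialect; the surrounding values (%cond, %a, %b, %va, %vb) are assumed to be defined elsewhere:

```mlir
// `constant` and `select` accept integers, floats, and vector/tensor
// containers of them, under the same op name throughout.
%ci = constant 42 : i32
%cf = constant dense<1.0> : tensor<4xf32>
%si = select %cond, %a, %b : i32
%sv = select %cond, %va, %vb : vector<4xf32>
```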

+1. At the same time, we can use the existing ops to test the axis.

I'd argue that different stakeholders might have different reasons :slight_smile: Let's maybe collect these. Personally, I feel std is more of a "we-couldn't-find-where-else-to-put-this-op" dialect than a "standard" one, and it is missing the overall inclusion principle/objective most other dialects have. Having huge dialects is also a problem for modularity: on one hand, one doesn't know where to look for the things they need; on the other hand, they pay the cost of the whole dialect to use one or two things. The latter two items can be addressed separately.

I'd instead argue that they are in std because they all work on standard types, which include int, float, and vectors and tensors of those (even if one doesn't want to use the term standard, these are all really core types). This also means trying to split it all completely would be fraught with the risk of creating cyclic dependencies, duplicating similar logic/helper methods, separating logically connected things into different dialects, and producing a partitioning that is inconsistent both logically and naming-wise. One could potentially pull out the ops that exclusively work on the tensor class (due to a much higher level of abstraction), like @_sean_silva lists, but that's a very thin slice. I don't think any of the concerns/questions in my third post, or those described in more general terms by @jpienaar and @stephenneuendorffer, have been addressed at all.

I think @_sean_silva's partitioning looked great to start with, keeping std still big enough, but by the end of it (esp. the last section) it pretty much provided a reason why we shouldn't split.

Separately, coming back to the tan inverse patch, I don't think that patch has to wait. It can't be the responsibility of one author to come up with the right partitioning of std or to drive such an effort. We could form a working group to continue discussing this, but I don't think new op contributions should stop until there is consensus on a proposal. Only after that would it be fair to place the burden on those contributing more ops. At this point, it's not even clear what significant problem the partitioning is trying to solve, nor how that outweighs its downsides.

Another reason why I'd like to untangle all this complexity. These types are part of the builtin dialect (llvm-project/mlir/lib/IR/MLIRContext.cpp at main · llvm/llvm-project · GitHub), not the standard dialect. This conflation between builtin "standard" types and the standard dialect, due to unfortunate naming, makes it hard to grasp.

I agree that operating on builtin (aka standard) types is a necessary condition for inclusion in the standard dialect, but not a sufficient one. Otherwise, Affine (works on index and memref), Vector (works on vector and integer/float primitives), Linalg (works on memref and tensor), SCF (works on integer primitives) and many others would be merged into Standard. FWIW, HLO works on builtin tensors, should all these Ops move to Standard too?

I don't think there is a high risk of running into cyclic dependencies with op definitions. We have very few ops that conceptually depend on other ops, e.g. loops on their terminators. Transformations will have to be more explicit about their op dependencies, which aligns with the recent changes in dialect registration and loading. And I agree that we should take into account the overall cost of having more smaller libraries or fewer larger ones. There are benefits to both models; e.g., I don't think anybody today can commit to maintaining the standard dialect, but @frgossen and friends would probably be fine with maintaining the "math" dialect, while I wouldn't mind sharing the responsibility for the memref dialect with someone.

I would be interested to see a precise definition of the logical connection between the ops currently in standard. Agreeing on such a definition would solve at least half of the issues we are discussing here :wink:

+1 to all of the above.

Do you have a binary size breakdown? Besides the overall software spidey-sense of not having monolithic components, that seems like the only other true requirement raised in the dialect splitting discussion so far.

I think @bondhugula brings up a really good point: the set of types that are in "std"/"builtin" plays a big role in the coupling we have in std.

It has already been hinted at in discussions like [RFC] Tensors with unknown element types - #21 by tatianashp that the built-in tensor type is just one opinion in the space of possible tensor types. I think memref is somewhat similar in that regard.

If you remove all operations related to tensor and memref from std (and remove std.assert), then what you have is something like "basic_isa": roughly speaking, "MLIR's cleaned-up view of computation at the LLVM IR level (including vectors), without taking an opinion on how memory is represented".
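As a rough sketch of what such a "basic_isa" slice could contain (the grouping and function are hypothetical; the ops shown are existing std ops that touch only scalars and vectors):

```mlir
// Pure compute on scalars and vectors; nothing references memref or tensor.
func @axpy(%a: f32, %x: vector<4xf32>, %y: vector<4xf32>) -> vector<4xf32> {
  %s = splat %a : vector<4xf32>        // broadcast the scalar to a vector
  %p = mulf %s, %x : vector<4xf32>     // elementwise multiply
  %r = addf %p, %y : vector<4xf32>     // elementwise add
  return %r : vector<4xf32>
}
```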

libStandardOps.so grows from 6.810 MB to 6.875 MB. This is just the op definition and documentation, with no folding or other special implementation.

I understand this discussion to mean it is okay to land atan2 in std for now. I will happily help move this and other ops to a dedicated math dialect once we decide what that dialect should look like.

Is that from a debug build? It's strange that just an op's definition can add 65 KB to the library! My libMLIRStandardOps.so.12git is just 2.1 MB (release build with asserts).

It was a fast build. With -c opt, this one op increases the binary size of libStandardOps.so by 22 KB (2.038272 MB without, 2.060024 MB with atan2). It's still a lot more than I expected.

Agreed, perhaps bloaty (https://github.com/google/bloaty) can help in diffing. Reducing the binary size is a larger effort and agreed with Alex that we probably should split these into different threads.