[RFC] Split the `int` and `float` dialects from `std`

Part of splitting the std dialect.

int Dialect Scope and Goal

The int dialect is intended to contain basic integer operations (e.g. register-to-register ops on widely available hardware). The int dialect is to have builtin as its only dependency. The primary smoke test of adding ops into the int dialect is that they cannot introduce any dependencies and that they must operate only on integer types.

Ops excluded from the int dialect would be:

  • sitofp and uitofp because they introduce floating point types

Ops included in the dialect include:

  • Arithmetic ops, e.g. addi, divi_unsigned, and floordivi_signed
  • Bitwise and shift ops, e.g. andi and shift_left
  • index_cast because it converts between integer types
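
To make the smoke test concrete, here is a rough sketch in Python. It is not MLIR API: the type strings, the regular expression, and the function names are all made up for illustration, and it ignores vectors and tensors of integers entirely.

```python
import re

# Hypothetical stand-in for MLIR type syntax: signless integers such as
# i1 or i32, plus index (the RFC treats index_cast as integer-to-integer).
_INTEGER_TYPE = re.compile(r"^(i[1-9][0-9]*|index)$")


def is_integer_type(type_name):
    """Return True for type strings that this sketch treats as integers."""
    return _INTEGER_TYPE.match(type_name) is not None


def passes_int_smoke_test(operand_types, result_types, dependencies=()):
    """Check the two conditions from the RFC: integer-only types and
    no dialect dependency other than builtin."""
    all_types = [*operand_types, *result_types]
    only_integers = all(is_integer_type(t) for t in all_types)
    only_builtin = all(d == "builtin" for d in dependencies)
    return only_integers and only_builtin


print(passes_int_smoke_test(["i32", "i32"], ["i32"]))  # addi: True
print(passes_int_smoke_test(["index"], ["i64"]))       # index_cast: True
print(passes_int_smoke_test(["i32"], ["f32"]))         # sitofp: False
```

Under these assumptions sitofp fails only because of its result type, which is the reason given above for excluding it.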

Ops to be Moved from std

std.addi -> int.add
std.subi -> int.sub
std.muli -> int.mul
std.divi_unsigned -> int.div.unsigned
std.divi_signed -> int.div.signed
std.ceildivi_signed -> int.ceil_div.signed
std.floordivi_signed -> int.floor_div.signed
std.remi_unsigned -> int.rem.unsigned
std.remi_signed -> int.rem.signed
std.and -> int.and
std.or -> int.or
std.xor -> int.xor
std.cmpi -> int.cmp
std.shift_left -> int.shift_left
std.shift_right_signed -> int.shift_right.signed
std.shift_right_unsigned -> int.shift_right.unsigned
std.zexti -> int.zext
std.sexti -> int.sext
std.trunci -> int.trunc
std.index_cast -> int.index_cast
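
For a feel of what the rename means for existing IR, here is a small textual sketch. It only rewrites op names in a string and covers a handful of the entries above; the real migration would happen in C++ and TableGen, and the helper name rename_ops is invented for this example.

```python
import re

# A few of the renames proposed above (old std name -> proposed int name).
STD_TO_INT = {
    "std.addi": "int.add",
    "std.divi_unsigned": "int.div.unsigned",
    "std.shift_right_signed": "int.shift_right.signed",
    "std.index_cast": "int.index_cast",
}

# Longest names first, and require that the match is not part of a longer
# identifier on either side.
_PATTERN = re.compile(
    r"(?<![\w.])(?:"
    + "|".join(re.escape(n) for n in sorted(STD_TO_INT, key=len, reverse=True))
    + r")(?![\w.])"
)


def rename_ops(ir_text):
    """Replace old std op names with the proposed int names."""
    return _PATTERN.sub(lambda m: STD_TO_INT[m.group(0)], ir_text)


before = '%0 = "std.addi"(%a, %b) : (i32, i32) -> i32'
print(rename_ops(before))
# %0 = "int.add"(%a, %b) : (i32, i32) -> i32
```

Ops that are not in the table, such as std.addf, are left untouched.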

float Dialect Scope and Goal

The float dialect is intended to contain basic floating-point operations. The float dialect will have only builtin as a dependency. Ops in float will operate primarily on floating point types but may accept integer operands or return integer results. Transcendental functions or functions expressed in terms of approximate real arithmetic belong elsewhere.

Ops excluded from float would be:

  • Trigonometric functions and roots
  • expf

Ops included in float would be:

  • Cast operations, including sitofp and fptoui
  • Basic arithmetic operations: e.g. addf and mulf
  • ceilf and floorf

Ops to be Moved from std

std.addf -> float.add
std.subf -> float.sub
std.mulf -> float.mul
std.fmaf -> float.fma
std.divf -> float.div
std.remf -> float.rem
std.ceilf -> float.ceil
std.floorf -> float.floor
std.cmpf -> float.cmp
std.copysign -> float.copysign
std.fpext -> float.ext
std.fptrunc -> float.trunc
std.negf -> float.neg
std.absf -> float.abs
std.uitofp -> float.from_ui
std.sitofp -> float.from_si
std.fptosi -> float.to_si
std.fptoui -> float.to_ui
std.bitcast -> float.bitcast

Development Process

The dialects will be created one at a time, with the int dialect first, to test the waters for how complex the migration will be. The int dialect will be created in two patches to help the review process: one that focuses on the design of the dialect, and one that deals with the details of migrating the code base over. Both patches will be pushed atomically, so that the duplication of std ops introduced by the first patch never exists in the tree on its own.

The same process will be repeated for the float dialect.

In the original thread, splitting into int and float was listed as “one possible” option, and I think the parts that were fairly non-controversial have been done.

Can we revisit the rationale for this specific splitting?

My personal opinion: we may be one degree too far in the taxonomy of all things by splitting things like this. One sign of this is the sitofp and uitofp placement. Taking further moves to retire the std dialect makes sense to me, but I see the natural home of the ops listed here as being a combined math dialect, possibly with some sanitized naming as presented.

Since some of these names are moving away from historical abbreviations, perhaps it would be a good time to consider moving away from the abbreviations?

int.cmp → int.compare
int.zext → int.extend.unsigned
int.sext → int.extend.signed
float.cmp → float.compare
float.neg → float.negate
float.ext → float.extend
float.to/from_* → float.from_unsigned_int or float.from_uint or float.from_int.unsigned?

Full disclosure: abbreviations are my personal pet peeve… I can see the value of some of them like add/sub/mul, but some of the less used ones should be spelled out IMO.

There are a couple of things here:

  1. float dialect can conceivably depend on int (imagine a folder for an ilogb op, which returns an integer), but int should not depend on float for any conceivable reason (int only depends on builtin). Putting them in the same dialect creates a spurious bidirectional link.
  2. It honestly just reads better: int.add + float.add instead of arithmetic.addi and arithmetic.addf. There’s nothing preventing us from retaining float.sitofp as the name if that seems nicer than the to_/from_ terminology (I personally find the to/from terminology confusing).
  3. Especially for integers, there’s a very large number of canonicalization patterns (LLVM has multiple multi-thousand-line files of them), so don’t let the seemingly small number of ops deceive in terms of the total amount of functionality in the dialect in the fullness of time.
  4. Consider a bunch of ops like popcnt, clz, ctz, etc. that we don’t have yet: it seems more natural for popcnt and int.add to live in the same dialect than for int.add and float.add to.

Was there anything besides the int<->float conversion ops that set off your spidey senses? Personally, I don’t think that a small number of interconversion ops should contribute fundamentally to the layering we choose here (especially when likely future ops like ilogb/frexp/etc. obviously belong in float and they naturally operate on integer datatypes).

Well, the math dialect is somewhat unfortunately named (arguably everything we are doing is “math” by some definition – I think I suggested that name and I’m sorry). But the original idea for math was to hold things like atan2 and other transcendentals (things that you usually calculate with a taylor series or other algorithm whose fundamental datatype is an approximate real, rather than an integer like you would use to implement/emulate addf or nextafter). So we would want to keep at least those separate from the ops being considered here.

I have a slight concern about this proposal in that the names “int” and “float” are very generic dialect names: there are lots of ways to deal with integers and floats, it would be great if we could brand these somehow, even if it were “stdint.add” or something.

I’d actually recommend going the opposite way, and get rid of the weirdly verbose shift_left op in favor of shl. There is no clarity benefit to the longer names, and it leads to a ton of verbosity particularly in C++ code when you’re talking about dyn_cast<ExtendUnsignedOp>(.. instead of dyn_cast<ZExtOp>(... Has anyone ever been confused by this? What do longer names achieve?

CIRCT went through this with the benefit of being able to look at MLIR and LLVM and came up with a very nicely consistent set of names for both signed and unsigned variants, things like comb.add and comb.xor etc, as well as comb.shl, comb.shru/comb.shrs, comb.divu/comb.divs, etc. Putting the unsigned vs signed letter as a suffix makes the operators sort correctly, and avoids the mistakes LLVM made with calling signed shifts “arithmetic” shifts in some places (just call them signed for consistency).
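
A quick way to see the sorting point, sketched in Python. The suffix names are the comb ones mentioned above; the prefix names are LLVM IR's udiv/sdiv and lshr/ashr, which I added here for contrast.

```python
# Suffix style (comb, as described above): the signedness letter comes last.
suffix_style = ["shru", "divs", "shl", "divu", "shrs"]

# Prefix style (LLVM IR instruction names): the signedness letter comes first.
prefix_style = ["udiv", "ashr", "shl", "sdiv", "lshr"]

print(sorted(suffix_style))
# ['divs', 'divu', 'shl', 'shrs', 'shru']

print(sorted(prefix_style))
# ['ashr', 'lshr', 'sdiv', 'shl', 'udiv']
```

With the suffix, the signed and unsigned variants of each operation end up next to each other; with the prefix, they are scattered through the list.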

-Chris

Not to be inflammatory but my spidey sense is starting to feel a bit like deja-vu with Java packaging from the 90s: “yay! We have infinite flexibility in defining packages. Define all the packages and grab all of the good top level names before they’re gone!” :slight_smile: I think my instincts would feel better with the suggestion up-thread to not name them literally int and float – those are very generic and can’t be used as-is in C++ contexts (namespaces, etc).

If most everything is abbreviated to common, short abbreviations, I’m fine with that too, especially when those abbreviations are aligned with existing abbreviations (i.e. libc, or llvm), but having code with “float.from_ui” and “int.floor_div.signed” in the same design seems challenging to me.

I completely agree - consistency is really important.

I see math roughly as the equivalent of libm in C or import math in Python. It may contain things like cubic root and gamma function, but not something ubiquitous like integer addition. Perhaps we can find a better name for it.

Given that both int and float types are built-in, having cast operations in either of those dialects doesn’t actually create a dependency. What would create one is a canonicalization pattern that produces operations from another dialect, and it doesn’t look like sitofp could canonicalize to something. Folding is not an issue because it cannot produce new operations.

This is a good point. One of the reasons for getting rid of std was its “standard” status making it feel privileged over other dialects. The same reasoning may apply to int and float as “the only blessed way of working with integers”. stdint is a no-go for me for the same reason. Something like basic int, simple int, or common int would be better.

FWIW, I agree with @stellaraccident, @_sean_silva and @clattner on the points here. More specifically:

  • Either float depends on int or itof/ftoi conversions move to a more generic dialect.
  • Splitting int and float could lead to dialect dependency hell, or worse, implicit dependencies.
  • Having the int type being a builtin type but int operations as a dialect is very confusing, especially given that MLIR allows dialect types.
  • Verbose operation names don’t add clarity. Good language reference documentation does.

On the split, I’d probably go with a numeric dialect that encompasses all supported scalar numeric types and operations, including float and int but not necessarily index, because that’s about element access and doesn’t have a fixed representation.

Adding to the dependency discussion, if we create a lot of mini-dialects that depend on each other like crazy, we can still bundle them up with static initialisers like initializeBasicDialects, which would be the same as including compiler-rt for builtins and other stuff.

The few people that don’t want numeric types can do like the kernel does (similar to freestanding) and only include the dialects they need. Those people will pay a higher cost of dependency management but they really need it.

The plus side here is that we allow them to do so by splitting dialects and probably paying a smaller overall cost, leaving the high cost only to those that really need the complexity.

If we decide to go that route, I think it’s a reasonable cost balance.

FWIW, this is already the case for complex, memref, tensor and vector dialects. They all primarily operate on values of a builtin type. So I would argue that not following the same pattern is more confusing than following it. The builtin dialect, unlike standard, actually has special treatment: it is always loaded and need not be declared as a dialect dependency of other dialects.

Integer arithmetic operations apply to index, the only special operation is index_cast. We wouldn’t want to duplicate numeric.addi and index.add, and all the others.

I don’t think that’s a good reason for doing it, just an explanation.

A better counter-argument would be: if basic numeric (scalar/complex) types are not built-in, then operations that cross types would need to depend on each other, and we’d go back to having a single dialect.

But that’s what I interpreted @stellaraccident’s argument to be. With tightly coupled concepts such as numeric types, arithmetic and maths, dividing into domains gets really tricky.

Having a builtin dialect helps, but what you put into the builtin dialect versus stand-alone dialects goes back to the same discussion.

We’re assuming indices are integral types with integral arithmetic on monotonic domains, but this is a simplification.

Strictly speaking, int.add and index.add don’t need to have the same semantics. We get away with it by saying “oh, the type is an index, so I’ll just do this instead”, but we’re essentially crossing the boundary.

When we only had one standard dialect, addi didn’t have any special semantics. But now int.add and index.add can have completely different semantics, for example boundary checks, address spaces, loop induction constraints, scalable vector sizes, etc.

Worse still, moving addi to int means we’d need to always include the int dialect if we ever wanted to increment the address of a variable…

Not necessarily on each other. We are fine as long as dialect dependencies form a DAG. I agree with @_sean_silva that it is conceivable to have the float dialect depend on the int dialect. I have not yet made up my mind on whether it is desirable.
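
To illustrate the DAG condition, here is a minimal sketch using Python's standard graphlib (3.9+). The dependency tables are assumptions for the sake of the example: the first encodes "float may depend on int", the second the bidirectional link that a DAG rules out.

```python
from graphlib import CycleError, TopologicalSorter


def load_order(dependencies):
    """Return an order in which every dialect comes after its dependencies.

    `dependencies` maps a dialect name to the set of dialects it depends on.
    Raises CycleError if the dependencies do not form a DAG.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Assumed layering: int depends only on builtin, float on builtin and int.
split = {"builtin": set(), "int": {"builtin"}, "float": {"builtin", "int"}}
order = load_order(split)
print(order)  # ['builtin', 'int', 'float']

# A bidirectional link between int and float is not a DAG.
cyclic = {"int": {"float"}, "float": {"int"}}
try:
    load_order(cyclic)
    is_dag = True
except CycleError:
    is_dag = False
print(is_dag)  # False
```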

This is an instance of a larger layering problem we have come across in different contexts but never really addressed, by the way. The first instance is the conversion between dialects that needs to depend on both dialects, partially addressed by putting conversions in a separate library. Another instance is canonicalization patterns that apply to operations from different dialects, which are currently the major source of dependencies between dialects. Depending on another dialect because of using its types is another instance that we haven’t really hit in-tree until now (maybe somebody did out-of-tree).

It is tricky indeed, and there’s likely no ideal solution. I am trying to look at it from a cost-of-using-a-library perspective. If I just needed to use integer types (LLVM dialect does that, for example), I probably wouldn’t mind also loading addi and maybe even ceildiv; I would likely mind having to also load cyl_bessel_3 and the entire set of special functions that could technically apply to integers. Yet all of those could qualify as math.

One way to split this is to have types in a different dialect than operations. Then if somebody needs to only use the type, which seems to be more or less common, they can do so without extra baggage. If we discard the special treatment, this is what the builtin dialect currently gives us. It is unclear at which granularity we want to define the dialects though, having int_type and basic_int feels like too much overhead to me.

I agree with all of this. However, we don’t seem to have use cases for index operations with semantics that differ from those of integer types as of now. The moment we do, we can consider creating an index dialect for that purpose. Until then, this sounds premature.

This depends on the scope of the int dialect. If it only has “basic arithmetic” like additions, multiplications, divisions and shifts, it is all likely reusable for address computation. If it also contains “less basic” functions, that indeed looks too heavy a dependency.

+1 on numeric from me – at least as something I prefer over splitting. Thinking through the analogy with comb, there is still something “software/llvm-esque” implied here that the name doesn’t convey. But we also don’t restrict folks from defining their own.

On the index front, I’m not sure I buy my own reasoning but here’s a try: it doesn’t hurt to have the set of numeric ops also able to operate on index under the assumption that if you are using them, you are already bought in to the type landscape which presumes they actually have an integral representation (there is nothing stopping someone else from defining a different island). This really seems an extension of the point that this new numeric dialect is not opinion free – but we’re having trouble encoding what it is into its name, in the universe of alternatives.

Also +1 UT on interpreting math more as “formulas” or “numerical compositions”.

Agreed. Quoting @stellaraccident: +1 on interpreting math more as “formulas” or “numerical compositions”.

The problem is now I have types but can’t do much with them. What is the point of integers if I can’t add them, or vectors if I can’t insert elements into them? Do I add my own subset dialect operations on them? That’d be crazy.

Perhaps what I’m proposing isn’t to move up the types into separate dialects, but to split the standard dialect into its core components and leave only structural things like regions, basic blocks and branches in the builtin dialect.

If my dialect makes use of floats and ints, I include them. If it doesn’t make use of vectors or tensors, I don’t. I can even create my own list types, reusing the index/int/float types from a numeric dialect for structural construction, etc.

Fair enough.

I think we’re converging on numeric dialects being only basic operations. :slight_smile:

Now all we need is to define “basic” :slight_smile:

Yet this is exactly the approach taken for the tuple type (see MLIR Rationale). I can agree that integers are a bit more “core”. However, MLIR integer types can also be signed and unsigned, in addition to LLVM-esque signless, and we don’t have any “standard” operations on signed and unsigned integers either, not even casts.

Not having operations in the same dialect as the type doesn’t mean the operations don’t exist. They may be in a different dialect. I see that as part of MLIR’s “dialect mixing” philosophy. You’ve got integer types. If you want, you can also get the “basic integer” dialect to work on them according to its specification. Or you can get the LLVM dialect to work on them with the same semantics as LLVM IR. Or you can define your own operations if neither suits (e.g., C++ AST where you may want to have UB specifically for signed integer overflow, unless you are in C++20). Making all these options equally viable instead of “one blessed way to do integers” is what makes the idea of putting the type definition in a separate dialect appealing to me. I understand there are costs to it though.

Regions and blocks do not belong to dialects, they are fundamental unextendable IR concepts. (IIRC, Mehdi and I actually tried to define blocks as operations at some point.) Branch operations are not built-in and it is unclear if they should be.

We are splitting the standard dialect into components, we just need to figure out the correct granularity of those components. It looks like the options are to have “basic int” + “basic float” vs. a single “numeric” dialect and, maybe separately from this RFC, consider whether we want to keep the types in the builtin dialect.

Thanks for calling out the names. I was expecting that and I’m glad we quickly converged. I’m also personally in favour of shortening the names.

There seems to be a lot of controversy about splitting into two dialects, int and float.

Would it not then be beneficial for the dialect containing addi to be as small as possible? Both the int and the float dialects are to contain only “basic” operations on those types.

I don’t see a dependency problem with splitting into int and float dialects, even if the integer and floating point types were removed from the builtin dialect and added to their respective dialects. As @_sean_silva said, int will have no other dependencies (other than builtin), and float will at most depend only on int.

If float had a dependency on int, then including both would be the same as just including a combined numeric dialect. But having them split gives users the option to forgo the floating point parts. A numeric dialect combining both will always have this large subset of integer-specific operations that can be factored out, and I don’t see a compelling reason not to.
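
A small sketch of that argument, again in Python and again with an assumed dependency table (float depending on int is only a possibility discussed in this thread, not a decision):

```python
# Assumed dependencies for illustration only.
DEPENDS_ON = {
    "builtin": [],
    "int": ["builtin"],
    "float": ["builtin", "int"],
}


def loaded_dialects(requested):
    """Return every dialect that ends up loaded: the requested ones plus
    everything they depend on, directly or indirectly."""
    loaded = set()
    stack = list(requested)
    while stack:
        dialect = stack.pop()
        if dialect not in loaded:
            loaded.add(dialect)
            stack.extend(DEPENDS_ON[dialect])
    return loaded


print(sorted(loaded_dialects(["int"])))    # ['builtin', 'int']
print(sorted(loaded_dialects(["float"])))  # ['builtin', 'float', 'int']
```

A user who asks for float gets the same set as with a combined dialect, while a user who asks only for int never pulls in the floating point parts.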

“All supported scalar numeric types and operations” is pretty broad, and in my mind risks numeric inheriting some of the same problems as std – that it will become bloated (especially as more patterns are added) and need to be split anyways because users don’t want a huge dependency just to use integers.

Not to mention that int.add and float.mul is nicer on the eyes than numeric.addi and numeric.mulf…

Sorry if I sounded like I was against the split, I am not. I just don’t like the idea that some types are part of a “standard” dialect and the basic operation on those types are not. I’m not saying either are wrong, by the way.

As long as the dependency is a simple DAG, it should be fine. Worst case, you’re going to force some people to include more than they already do so that others can include less… But with smaller subdivisions, that is unlikely.

Agreed. :slight_smile:

I understand your sentiment and I can agree with it. Certainly, dismantling std would pave the way for pulling types out of builtin.