The index Dialect
Overview
This RFC proposes the addition of an index dialect to MLIR.
The Index dialect contains operations for manipulating index values. The index type models target-specific values of pointer width, like intptr_t. The operations in this dialect operate exclusively on scalar index types. The dialect and its operations treat the index type as signless and contain signed and unsigned versions of certain operations where the distinction is meaningful. In particular, the operations and transformations are careful to preserve the target independence of the index type, such as when folding.
This dialect contains a subset of the operations in the arith dialect, including binary arithmetic and comparison operations but excluding bitwise operations, with a few notable differences.
- casts and castu are used to convert between index type and builtin integer types
- index.bool.constant is used to materialize i1 constants that result from folding index.cmp
- @stellaraccident also proposed that the dialect include an index.sizeof, which returns the size of the index type for the current target
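As a rough illustration of why there are two cast ops, here is a small Python model of the signed and unsigned conversions. The function names mirror the proposed ops, but the bit-width parameters and the exact widening/narrowing behaviour shown (sign-extend vs. zero-extend, truncate when narrowing) are my assumptions for the sketch, not a specification of the dialect.

```python
def casts(value: int, from_bits: int, to_bits: int) -> int:
    """Signed cast (assumed semantics): sign-extend when widening, truncate when narrowing."""
    value &= (1 << from_bits) - 1
    if to_bits > from_bits and value >> (from_bits - 1):
        value -= 1 << from_bits  # reinterpret the source bits as a negative number
    return value & ((1 << to_bits) - 1)


def castu(value: int, from_bits: int, to_bits: int) -> int:
    """Unsigned cast (assumed semantics): zero-extend when widening, truncate when narrowing."""
    value &= (1 << from_bits) - 1
    return value & ((1 << to_bits) - 1)
```

For example, widening the i32 bit pattern 0xFFFFFFFF to a 64-bit index gives all ones with casts but leaves the upper 32 bits clear with castu.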
Most importantly, operations are only folded when the results would be the same on 32-bit and 64-bit targets. In short, operations are only folded when
trunc(f(a, b)) = f(trunc(a), trunc(b))
Some ops, like add and mul, satisfy this property for all values of a and b. They can always be folded. Other ops are checked on a case-by-case basis. When materializing target-specific code, constants just need to be truncated as appropriate.
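The folding rule can be sketched in Python as follows. This is only a model of the condition above, not the actual folder: the names try_fold, materialize and OPS are hypothetical, and I am assuming constants are held as 64-bit values and that the only targets of interest are 32-bit and 64-bit.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc(x: int) -> int:
    """Truncate a 64-bit value to 32 bits."""
    return x & MASK32


# Each op takes the two operands and the mask for the width being modelled.
OPS = {
    "add": lambda a, b, mask: (a + b) & mask,
    "mul": lambda a, b, mask: (a * b) & mask,
    "maxu": lambda a, b, mask: max(a, b),
}


def try_fold(op: str, a: int, b: int):
    """Return the 64-bit folded constant, or None if 32-bit and 64-bit targets would disagree."""
    f = OPS[op]
    wide = f(a, b, MASK64)                  # f(a, b) computed at 64 bits
    narrow = f(trunc(a), trunc(b), MASK32)  # f(trunc(a), trunc(b)) at 32 bits
    return wide if trunc(wide) == narrow else None


def materialize(constant: int, index_bits: int) -> int:
    """Truncate a folded constant to the index width of the chosen target."""
    return constant & ((1 << index_bits) - 1)
```

With this model, try_fold("mul", 1 << 31, 2) folds to 1 << 32, which materializes as 0 on a 32-bit target, while try_fold("maxu", (1 << 32) + 1, 5) returns None because the 64-bit answer truncates to 1 but a 32-bit target would compute 5.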
Motivation
The standard way of manipulating index types in MLIR is with the arith dialect. This is deficient for a variety of reasons:
1. The index type has key differences from fixed-width integer and floating point types which are not handled by the arith dialect. The most important of these is that index is folded with 64-bit arithmetic, which can result in miscompiles on 32-bit targets for ops like divu.
2. The arith dialect brings in a bunch of potentially unnecessary dependencies, like operations on multidimensional vectors and tensors.
3. index_cast always treats index as signed when extending (trivial to remedy but I thought I’d mention it anyways)
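To make the divu miscompile concrete, here is a small Python model. The operand values are picked by me for illustration, and the divu helper is a hypothetical stand-in for unsigned division at a given index width.

```python
MASK32 = (1 << 32) - 1


def divu(a: int, b: int, bits: int) -> int:
    """Unsigned division as a target with `bits`-wide index would compute it."""
    mask = (1 << bits) - 1
    return ((a & mask) // (b & mask)) & mask


a = 1 << 32  # a constant that only fits in 64 bits; a 32-bit target sees 0
b = 2

folded_with_64_bit = divu(a, b, 64)   # 0x8000_0000
on_32_bit_target = divu(a, b, 32)     # trunc(a) == 0, so the result is 0

# Truncating the 64-bit fold does not recover the 32-bit answer.
mismatch = (folded_with_64_bit & MASK32) != on_32_bit_target
```

Here the constant folded with 64-bit arithmetic truncates to 0x80000000, but the same division executed on a 32-bit target yields 0, so folding eagerly changes program behaviour.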
It’s not clear that issue 1 should be solved by special-casing each op folder on whether the operand types are index. Given issue 2 (why do I need so many dependencies just to add index values?), the best solution felt like introducing a new dialect with well-defined goals and semantics.
Full Op List
- add
- sub
- mul
- divs
- divu
- ceildivs
- ceildivu
- floordivs (floordivu = divu)
- rems
- remu
- maxs
- maxu
- casts
- castu
- cmp with eq, ne, slt, sle, sgt, sge, ult, ule, ugt, uge
- constant
- bool.constant
- sizeof
ceil/floordivs/u were included for historical reasons (affine), but they are much heavier than the other ops in the dialect and @clattner is in favour of not including them.
Any other ideas?
Why index.bool.constant?
The dialect needs to materialize i1 constants for folding index.cmp. Given that one of the main goals of the dialect is to provide a low-dependency dialect for manipulating index types, using arith.constant was not an option. Constant ops all have the same semantics, more or less, so from the perspective of op folding, it doesn’t matter whether an i1 constant comes from arith.constant or index.bool.constant.
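For a sense of when a comparison could fold to a bool constant, here is a Python sketch that applies the same "agree on 32-bit and 64-bit" rule to the cmp predicates. The helper names (to_signed, cmp, fold_cmp) are hypothetical, and the rule that a comparison folds only when both widths give the same i1 is my reading of the folding condition, not a description of the actual folder.

```python
def to_signed(x: int, bits: int) -> int:
    """Interpret the low `bits` bits of x as a two's-complement signed value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def cmp(pred: str, a: int, b: int, bits: int) -> bool:
    """Evaluate a comparison predicate at the given index width."""
    mask = (1 << bits) - 1
    if pred in ("eq", "ne") or pred.startswith("u"):
        lhs, rhs = a & mask, b & mask  # equality and unsigned predicates
    else:
        lhs, rhs = to_signed(a, bits), to_signed(b, bits)  # signed predicates
    kind = pred if pred in ("eq", "ne") else pred[1:]
    return {
        "eq": lhs == rhs, "ne": lhs != rhs,
        "lt": lhs < rhs, "le": lhs <= rhs,
        "gt": lhs > rhs, "ge": lhs >= rhs,
    }[kind]


def fold_cmp(pred: str, a: int, b: int):
    """Return the i1 value to materialize, or None if the targets disagree."""
    wide = cmp(pred, a, b, 64)
    narrow = cmp(pred, a, b, 32)
    return wide if wide == narrow else None
```

In this model fold_cmp("ult", 1, 2) yields True and would become a bool constant, whereas fold_cmp("eq", 1 << 32, 0) yields None because the operands are equal only after truncation to 32 bits.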
MLIR Dependencies
Just MLIRIR and some interfaces.
Not Part of the RFC
What specifically to do with the arith dialect. The inclusion of this dialect does not need to prescribe a particular fate for the arith dialect (like removing index types from its operations), but that can be discussed as part of the RFC.
Current Status
The dialect exists with lowerings to LLVM.