The index Dialect
Overview
This RFC proposes the addition of an index dialect to MLIR.
The Index dialect contains operations for manipulating index values. The index type models target-specific values of pointer width, like intptr_t. The operations in this dialect operate exclusively on scalar index types. The dialect and its operations treat the index type as signless and contain signed and unsigned versions of certain operations where the distinction is meaningful. In particular, the operations and transformations are careful to respect the target-independence of the index type, such as when folding.
This dialect contains a subset of the operations in the arith dialect, including binary arithmetic and comparison operations but excluding bitwise operations, with a few notable differences:
- casts and castu are used to convert between index type and builtin integer types
- index.bool.constant is used to materialize i1 constants that result from folding index.cmp
- @stellaraccident also proposed that the dialect include an index.sizeof, which returns the size of the index type for the current target
Most importantly, operations are only folded when the results would be the same on 32-bit and 64-bit targets. In short, operations are only folded when
trunc(f(a, b)) = f(trunc(a), trunc(b))
Some ops, like add and mul, satisfy this property for all values of a and b. They can always be folded. Other ops are checked on a case-by-case basis. When materializing target-specific code, constants just need to be truncated as appropriate.
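The folding condition above can be checked numerically. The following is only a sketch in Python, not the dialect's folder: it models index values as unsigned integers, and the helper names (mask, trunc, fold_is_target_independent) are made up for illustration. It compares the truncated 64-bit result against the result of running the same op on truncated operands at 32 bits.

```python
def mask(width: int) -> int:
    """All-ones bit mask for an unsigned value of the given width."""
    return (1 << width) - 1


def trunc(x: int, width: int = 32) -> int:
    """Keep only the low `width` bits, as a narrower target would."""
    return x & mask(width)


# Each op takes unsigned operands and a bit width.
# None stands for "undefined" (division by zero).
def add(a: int, b: int, width: int):
    return (a + b) & mask(width)


def mul(a: int, b: int, width: int):
    return (a * b) & mask(width)


def divu(a: int, b: int, width: int):
    return None if b == 0 else (a // b) & mask(width)


def fold_is_target_independent(op, a: int, b: int) -> bool:
    """True when trunc(f(a, b)) == f(trunc(a), trunc(b))."""
    wide = op(a, b, 64)
    narrow = op(trunc(a), trunc(b), 32)
    if wide is None or narrow is None:
        return False  # never fold an undefined result
    return trunc(wide) == narrow


ALL_ONES = mask(64)  # the 64-bit pattern of index constant -1

print(fold_is_target_independent(add, ALL_ONES, 1))   # True
print(fold_is_target_independent(mul, ALL_ONES, 3))   # True
print(fold_is_target_independent(divu, 10, 2))        # True
print(fold_is_target_independent(divu, ALL_ONES, 2))  # False
```

The last case is the kind of fold that has to be rejected: the 64-bit quotient truncates to 0xFFFFFFFF, while a 32-bit target would compute 0x7FFFFFFF.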
Motivation
The standard way of manipulating index types in MLIR is with the arith dialect. This is deficient for a variety of reasons:
1. The index type has key differences from fixed-width integer and floating point types which are not handled by the arith dialect. The most important of these is that index is folded with 64-bit arithmetic, which can result in miscompiles on 32-bit targets for ops like divu.
2. The arith dialect brings in a bunch of potentially unnecessary dependencies, like operations on multidimensional vectors and tensors.
3. index_cast always treats index as signed when extending (trivial to remedy but I thought I’d mention it anyways)
It’s not clear that issue 1 should be solved by casing each op folder on whether the operand types are index. Given issue 2 (why do I need so many dependencies just to add index values?), the best solution felt like introducing a new dialect with well-defined goals and semantics.
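The signedness point about index_cast is easier to see with numbers. Below is a small Python sketch of widening a 32-bit value to 64 bits both ways. It assumes that casts sign-extends and castu zero-extends when widening, and the function names are hypothetical stand-ins rather than anything from MLIR.

```python
def sign_extend(value: int, from_width: int, to_width: int) -> int:
    """Widen by copying the sign bit (assumed behaviour of casts)."""
    sign_bit = 1 << (from_width - 1)
    value &= (1 << from_width) - 1
    signed = (value ^ sign_bit) - sign_bit  # reinterpret as two's complement
    return signed & ((1 << to_width) - 1)


def zero_extend(value: int, from_width: int, to_width: int) -> int:
    """Widen by filling with zeros (assumed behaviour of castu).

    to_width is unused here; it is kept so both helpers share a signature.
    """
    return value & ((1 << from_width) - 1)


x = 0x80000000  # top bit set in 32 bits

print(hex(sign_extend(x, 32, 64)))  # 0xffffffff80000000
print(hex(zero_extend(x, 32, 64)))  # 0x80000000
```

The two results differ only when the top bit of the narrow value is set, which is why an always-signed extension silently changes large unsigned values.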
Full Op List
- add
- sub
- mul
- divs
- divu
- ceildivs
- ceildivu
- floordivs (floordivu = divu)
- rems
- remu
- maxs
- maxu
- casts
- castu
- cmp with eq, ne, slt, sle, sgt, sge, ult, ule, ugt, uge
- constant
- bool.constant
- sizeof
ceil/floordivs/u were included for historical reasons (affine), but they are much heavier than the other ops in the dialect and @clattner is in favour of not including them.
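For readers unfamiliar with the rounding variants, here is a rough Python model of how the division ops in the list differ. It assumes the usual meanings (divs rounds toward zero, floordivs toward negative infinity, ceildivs toward positive infinity) and ignores overflow and division by zero, so treat it as an illustration rather than a specification.

```python
def divs(a: int, b: int) -> int:
    """Signed division rounding toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def floordivs(a: int, b: int) -> int:
    """Signed division rounding toward negative infinity."""
    return a // b


def ceildivs(a: int, b: int) -> int:
    """Signed division rounding toward positive infinity."""
    return -((-a) // b)


def divu(a: int, b: int) -> int:
    """Unsigned division; operands are assumed non-negative."""
    return a // b


def floordivu(a: int, b: int) -> int:
    """Same as divu: for non-negative operands, truncation is flooring."""
    return divu(a, b)


def ceildivu(a: int, b: int) -> int:
    """Unsigned division rounding up."""
    return -((-a) // b)


print(divs(-7, 2), floordivs(-7, 2), ceildivs(-7, 2))  # -3 -4 -3
print(divu(7, 2), floordivu(7, 2), ceildivu(7, 2))     # 3 3 4
```

The signed variants only disagree when the operands have mixed signs or the division is inexact, which is part of why a separate floordivu is unnecessary.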
Any other ideas?
Why index.bool.constant?
The dialect needs to materialize i1 constants for folding index.cmp. Given that one of the main goals of the dialect is to provide a low-dependency dialect for manipulating index types, using arith.constant was not an option. Constant ops all have the same semantics, more or less, so from the perspective of op folding, it doesn’t matter whether an i1 constant comes from arith.constant or index.bool.constant.
MLIR Dependencies
Just MLIRIR and some interfaces.
Not Part of the RFC
What specifically to do with the arith dialect. The inclusion of this dialect does not need to prescribe a particular fate for the arith dialect (like removing index types from its operations), but that can be discussed as part of the RFC.
Current Status
The dialect exists with lowerings to LLVM.