[RFC] `index` dialect

Just to call it out (but not necessarily endorse it), it does seem like a viable alternative is to “fix” the arith dialect:

  1. Special-case the arith folders based on whether the type is fixed-width.
  2. Remove the tensor/vector support from the arith ops (you say “unnecessary dependencies, like…”: are there others?).
  3. Fix index_cast.
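On #3, here is a sketch (mine, not from the RFC) of why arith.index_cast sits awkwardly in arith:

```mlir
// Illustration: the behavior of arith.index_cast depends on the
// target-specific width of index, which nothing else in arith needs to know.
%0 = arith.index_cast %i : index to i64  // no-op on 64-bit targets, sign-extension on 32-bit
%1 = arith.index_cast %j : i64 to index  // no-op on 64-bit targets, truncation on 32-bit
```

This target dependence is exactly what folders cannot see, which is what makes the op hard to "fix" in place.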

As has been discussed multiple times for #2, I think the inclusion of tensor as a legal type for arith is a bug. I think this same reasoning may apply to vector but I have not done the analysis.
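For illustration (today's behavior, not a proposal), the arith ops currently verify with both tensor and vector operands:

```mlir
// Both of these are legal in the current arith dialect; the tensor form is
// the one being called a bug above.
%t = arith.addi %a, %b : tensor<4xi32>
%v = arith.addi %c, %d : vector<4xi32>
```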

Even if we agreed to “fix” arith in this way, I remember it was a split decision back in the std dialect splitting days as to whether we wanted a dedicated index dialect, and we’ve had other cases crop up. I would buy that we just decided this wrong then and should correct it. The unspecified-width nature of index comes up in a lot of cases, and keeping it its own thing can help isolate that without a lot of sketchy predicates and such (I think).

I think it would be good to enunciate a recommended fate of index ops in the arith dialect (i.e. that we should remove that support, without prescribing when/how this is done). People will ask “future us” about this a year from now, so we might as well write it down.

+1. While this RFC should avoid getting bogged down with discussions of the when/how, I do think we should decide on an “official recommendation” for subsequent RFCs to follow up on. At the very least, having such a recommendation in mind should help to focus the discussion about whether adding the index dialect is the right move and about what exactly the dialect should look like.

+1. I was trying to word it in a way that suggests this RFC isn’t meant to decide what to do, though what could be done is within the scope of discussion.

This is the main reason why it makes sense to have a separate dialect. The differences between index and the fixed-width integer types could be considered “manageable enough” that handling them can be munged into the arith dialect, but I don’t think this is a good direction.

I’m in favour of doing this.

Are these ops only useful for fixed-width types? It would not make sense to make math work on index types.

+1

+1, I agree that we as a project should have a plan here. We shouldn’t just “take a dialect” and assume someone else decides what is right overall for the project later. We should be more deliberate about that.

I also don’t have a well-formed opinion on vector types in arith. The harm is so much lower than the harm of tensors that I wouldn’t advocate for removing them. The only challenging question that comes up is what to do with comparisons: should a compare of two vectors return a single bool, a vector of bools, or a vector of integer values holding the sext of each bool?
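That design space, sketched with today's syntax (the result types in the comments are the alternatives, not valid IR):

```mlir
// Today, arith.cmpi on vectors produces an elementwise vector of i1:
%m = arith.cmpi slt, %v0, %v1 : vector<4xi32>   // %m : vector<4xi1>
// The alternatives mentioned above would be a single i1 for the whole
// compare, or a vector<4xi32> of sign-extended masks in the style of
// SelectionDAG-era targets.
```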

In any case, I’d recommend deferring that to be another discussion.

This is actually the core observation that (I believe) argues for the index dialect to be split out. It is materially different in nature from the rest of the arith dialect in a bunch of ways, including that it doesn’t allow you to use getIntOrFloatBitWidth() to reason about the behavior of conversions etc. Splitting the “unknown width” stuff out to its own world seems more principled and allows putting nicer accessors on arith. It also clarifies the behavior of a number of corner cases like arith.constant.

There is also another layering bonus in that SCF depends on “index” but not on arith, which is nice given that arith is a very large dependency.

-Chris

I tend to agree with this interpretation. Just wanted to make sure we called out the alternative (and others might have a different read).

This comment is actually very timely! The vector masking proposal that we shared recently includes a new Vector Predication dialect that will provide Arith-like operations with native support for vector masking and dynamic vector lengths. This is in line with the Vector Predication proposal that landed in LLVM with the same motivation. I think the introduction of the Vector Predication dialect would be a good time to remove the vector support from Arith. Other than masking and dynamic vector lengths, which are enough reasons by themselves, I wouldn’t say there is a strong reason to support the split for vectors, IMO. I think LLVM IR has been doing relatively well on that front but I might be missing something.

Having said that, I’m curious that nobody is worried about the level of duplication that this will bring. We would end up with multiple copies of almost identical operations in different dialects. This duplication would have a cost at quite a few levels: maintenance, compiler build time, compile time (dialect conversion) and code duplication (rewrite patterns). Where do you think the limit is? Should we follow the same approach for other vectorizable operations in, let’s say, the Math dialect?

Having vector.add (multidimensional vectors), tensor.add, index.add, and arith.add (scalar + 1D vectors) is not what I would consider to be “duplication”. These are seemingly the “same” operation but they operate on types with very different semantics, and consequently the codegen for these ops can vary wildly.

While I agree with tensor being special and this RFC is convincing me about index, I’m not yet convinced on vector. Let’s discuss more.

(In general, I think we should use this RFC as a forcing function to arrive at an opinion about path forward for these things since in their current conception, they are commingled)

I don’t have a direct need for this at the moment [1], but would it make sense to have a dedicated op for indexing into buffers (think GEP/LEA)?

```mlir
%addr = index.address %base, %a x stride
```

That should be easier to look through than plain add + mul and avoid issues with overflows.
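For contrast, a sketch of the same computation using only the binary ops proposed in the RFC (spellings assumed):

```mlir
// Without a dedicated op, the address is a mul+add DAG; looking through it
// and reasoning about overflow become pattern matching.
%off  = index.mul %a, %stride
%addr = index.add %base, %off
```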


  1. And it’s always easier to ask than to do the work…

There are potentially other “indexing” operations that could be interesting here (the op @cbate added comes to mind). Good question whether those are in scope for this dialect too.

Agreed. Here is the thread. We were somewhat hunting for an index dialect when figuring out where to land them: [RFC] [Tensor] Extracting slices from `tensor.collapse_shape` - #8 by nicolasvasilache

Sorry, I’m not sure I follow why scalars and 1-D vectors should be part of arith and n-D vectors should be part of vector. Could you please elaborate? What about 0-D vectors?

I don’t think different lowerings or codegen transformations are a strong enough reason to introduce independent operations per type when an operation has consistent, well-defined semantics across those types. The semantics of an operation shouldn’t be defined by its lowering but by what that operation represents at that specific level of abstraction. Based on that, at this point, I do not see any fundamental semantic differences among scalars, 1-D and n-D vectors when it comes to arithmetic operations. As I said, the introduction of masks and dynamic vector lengths on arithmetic operations will change that in the future.

I’m totally supportive of reducing dependencies between dialects, but I think we should also take into account the cost of introducing a new operation. My concern about duplication is not only about the operation itself but about what comes with each new operation. I assume we want all the folding/canonicalization bells and whistles to be working for all the arith.*, index.*, vector.* arithmetic operations. What do we plan to do about that? I would be worried if the answer to this question is to introduce an interface for every kind of arithmetic operation.

Just my two cents 🙂

strong +1 for the dialect and motivation, this is a really nasty footgun, thanks for improving this!

There was also this prior discussion: [RFC] Arith dialect versions of affine.apply/min/max aiming at de-conflating affine considerations (related to ensuring a value evolves following an affine progression) from the ability to merely represent linear expressions on arbitrary index values in a compact and composable form.

However, I also see:

This suggests the index dialect does not aim at having such higher-level ops that are also subject to target-specific values of pointer width. We would need another dialect on top of that to represent such concepts; is this a correct assumption?

This was also my question, with a slightly different focus: do we want the index dialect to be “only simple operations on unknown-width integer types” or do we want it to be “operations necessary to do address/size computations”? IMO, the latter would be a better justification for the dialect to exist and not be seen as yet another copy of arithmetic ops (in addition to arith, llvm and some) but at the cost of greater complexity of the dialect itself (should we go as far as including datalayout here?).

SCF ops canonicalization depends on several other arith dialect ops like arith.select, arith.xori, arith.andi – none of these are in the list of @Mogball. If removing the SCF → Arith dependency is a key motivation, then the list will have to be complete.

+1 “Make everything as simple as possible, but not simpler.”

Can you elaborate on the transformations part?
I see foldings but what about other transforms and canonicalizations?
E.g. are we going to go towards supporting a subset of existing affine canonicalizations but on a lower-level IR or something more powerful?

Can you elaborate on the intended use cases for the dialect (beyond “don’t grossly miscompile”); in particular how is it meant to compose with other dialects?

One topic I am particularly interested in (related to @bondhugula’s point above):

  • is this expected to be the one true dialect that provides all loop-like constructs with all the necessary operations on index quantities, in the fullness of time? (in which case reviewing existing use cases and evolving/deprecating them will be necessary).
  • if not, how should one think about extensions such as the ones we previously identified (linked above)?

I am thinking that this is only part of the story.
When I think of indices, how can one avoid pointer arithmetic?
Should there not also be a dialect for manipulating pointers, of which “index” arithmetic is a part?
It would make reasoning around pointer provenance easier.

This reasoning makes sense to me.

In this case, we have to duplicate some of the core logic and enhance it. Folding operations on index types requires a bit of extra work (to make sure the result is valid on all targets).
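A minimal Python model of that extra work (an illustration, not MLIR's actual folder; the function name is made up): a fold on index may only be emitted when the constant it materializes is correct under both a 32-bit and a 64-bit interpretation of index.

```python
def trunc(v, bits):
    """Wrap v to an unsigned value of the given bit width."""
    return v & ((1 << bits) - 1)

def fold_index_divu(a, b):
    """Fold an unsigned-divide-like op on index, or return None.

    The folded constant is the 64-bit result; it is only valid if, truncated
    to 32 bits, it also matches the result a 32-bit target would compute.
    """
    if b == 0:
        return None
    r64 = trunc(a, 64) // trunc(b, 64)
    b32 = trunc(b, 32)
    if b32 == 0:
        # On a 32-bit target this would be a divide by zero; don't fold.
        return None
    r32 = trunc(a, 32) // b32
    return r64 if trunc(r64, 32) == r32 else None
```

For example, `fold_index_divu(8, 2)` folds to 4 on any target, while `fold_index_divu(2**32 + 6, 2)` must not fold, because a 32-bit target would compute 3 and a 64-bit target 2**31 + 3. A plain fixed-width folder has no such check, which is the duplicated-and-enhanced logic referred to above.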

Not necessarily. I was just pointing out that there is controversy about the inclusion of these ops in index. I am not opposed to their inclusion because these ops are useful for index arithmetic, require the same care as other operations on index types, and it’s not clear where else they should go.

I think any operations that involve index types and that are useful to have as single ops (e.g. GEP/LEA like @kuhar suggested) and which don’t bring in heavy dependencies are within the scope of the dialect.

This is not a key motivation of the RFC, but we can think about it. I wouldn’t be against adding scalar boolean operations to the index dialect if it makes sense (it already has index.bool.constant after all). Conceptually, I don’t see why SCF has to depend on arith other than “because it has the ops SCF needs”.

Which are you referring to? I am not sure this is within the scope of the dialect.

I don’t see why not, but to do this properly will require thinking hard about the layering of in-tree dialects. This would certainly be a longer-term goal.

I haven’t thought this through and don’t have any experience with this, but I’ll observe that bool is also conceptually a different type from the rest of the arithmetic types. We could split out a tiny bool dialect that has constant/and/or/xor that arith and index would depend on. I think that would solve the SCF dependency issue as well.

The complexity I see there is that you’d want the integer operations in arith to canonicalize to the bool dialect when the integers are i1, but I think that is actually a good thing: we want to canonicalize “add two i1s” to “xor” anyway, canonicalize “select between two i1’s” to logical operations etc. This sort of canonicalization is specific to i1 anyway, and an explicit bool dialect representation would make this explicit.
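A sketch of those i1-specific rewrites in today's arith spelling (illustrative, not existing canonicalization patterns):

```mlir
// Addition of two i1s is addition mod 2, i.e. xor:
%s = arith.addi %a, %b : i1        // canonicalize to: arith.xori %a, %b : i1
// Select between two i1s reduces to logical operations:
%t = arith.select %c, %x, %y : i1  // e.g. ori(andi(%c, %x), andi(xori(%c, true), %y))
```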

-Chris

I am talking about foldings, canonical forms and canonicalizations for linear expressions.

With the current opset proposed in the index dialect, such expressions are only expressible as SSA DAGs of binary ops. I am not sure what the state of the art is for how powerful we can make canonicalizations on such DAGs and how many patterns we would need in the fullness of time.

In contrast, a strawman operation like index.linear (i.e. affine.apply minus the AffineScope conflation) has significant advantages: the SSA DAG is expressible as type + n-ary operands and is closed under a bunch of linear operations. Many canonicalizations become foldings and can be createOrFold’ed. Going from a DAG of SSA values to a flattened multi-operand form simplifies both implementation and readability.

This further extends to strawman index.linearize / index.delinearize operations that we shouldn’t want to rediscover from low-level SSA DAG IR.
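A sketch of what those strawman ops might look like (syntax entirely hypothetical; index.linear modeled on affine.apply):

```mlir
// An n-ary, flattened linear expression; closed under composition, so many
// canonicalizations become foldings:
%x = index.linear affine_map<(d0, d1) -> (d0 * 4 + d1 + 8)>(%i, %j)
// Round-tripping between flat and multi-dimensional index spaces without
// rediscovering the div/mod structure from a DAG of binary ops:
%f = index.linearize (%i, %j) by (%d0, %d1)
%i2, %j2 = index.delinearize %f by (%d0, %d1)
```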

Now these operations are clearly higher-level than what is proposed here and may not be right for inclusion in this dialect. This seems to suggest a need for a higher-level dialect that operates on index types, which then circles back on @dcaballe’s point of duplication (both functionality and canonicalizations end up being duplicated as soon as we want powerful enough canonicalizations on linear expressions that we insist on representing as SSA DAG IR).

Separately, I’ll just throw a wacky idea here, have we thought about the possibility of having a finer granularity than dialects? These could look like dialect_name.subdialect_name.op_name.
OpDefinitions, impl etc could live in the same root Dialect directory (different files), avoiding circular dependencies between things that make sense to consider together. Users could opt in or out of various subdialects and only the parts that are necessary are built. Subdialects must build and be tested in isolation, etc. This also brings annoying tradeoffs and burdens on the build infra; still, it shouldn’t hurt to ask.