That makes sense and we’ve done something similar on an implementation. What if we were to phase in such an index dialect and phase out support for arbitrary arithmetic on index in the arith dialect?
With respect to the tensor baggage specifically, I think we should consider that something to be deprecated and removed. Because the semantics are so unclear and incomplete in those cases, I’m not aware of any practical uses of it (i.e. I’ve seen folks try and then realize that the very narrow case actually modeled is of unusably limited utility).
One other thing that I’ve found tends to compose well is an op like index.sizeof. In some situations where the size of the index type can vary within partitioned parts of the program, such an op can be used to materialize things like dynamic allocation/copies/etc of arrays of index. It would be fixated to the actual target index size late in a lowering pipeline, but you also have the option to fixate it early if you know.
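To make that concrete, a hypothetical sketch of how such an op might be used (the op name and form are illustrative, not an existing upstream op):

```mlir
// Hypothetical `index.sizeof`: materializes the size in bytes of the
// target's `index` type. All names here are illustrative.
%elem_size = index.sizeof : index
%num_bytes = arith.muli %elem_size, %num_elems : index
%buf = memref.alloc(%num_bytes) : memref<?xi8>
// A late, target-aware pass folds %elem_size to a constant
// (e.g. 8 on a 64-bit target); it can also be fixated early if known.
```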
Our experience with hand-waving hasn’t been great. Different people will interpret what’s written in different ways, and then you have miscompilations.
I think it’s better to have some semantics that are clearly specified but not perfect than try to have the optimal semantics but not fully specified. You can always evolve the semantics (though it’s not trivial).
Following LLVM’s semantics has several benefits:
- Lowering to LLVM is trivial.
- LLVM supports several architectures, and the current design already reflects the tradeoffs of supporting all such ISAs. In general, we want the lowest common denominator, to avoid forcing certain architectures to use runtime checks to give more precise semantics that may not be needed.
My suggestion is to introduce poison and freeze in MLIR as standard concepts. And do not introduce undef (see below).
Yep, that matches the semantics of LLVM’s undef.
However, in the past few months we’ve been replacing all uses of undef with poison whenever appropriate. This is part of an effort to remove undef completely from LLVM, as just the existence of undef (even if unused) breaks certain optimizations.
Nevertheless, you can always lower SPIR-V’s OpUndef into freeze poison. That’s correct, and equivalent if freeze ends up with a single use only.
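For illustration, in LLVM IR terms that lowering would look like (a sketch, not generated output):

```llvm
; SPIR-V's OpUndef lowered as a frozen poison value.
; `freeze` picks an arbitrary but fixed value, so every use of %u
; observes the same value; plain `undef` may yield a different value
; at each use, which is why the two are only equivalent when there is
; a single use.
%u = freeze i32 poison
```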
(this is not exploited by InstCombine yet, but it’s on the todo list)
Would poison and freeze go in arith or builtin? I can’t see the proposal for the index dialect yet, so can’t determine if it would be used there (seems like it would). Trying to parse how “standard” this is.
In current pipelines, arith lowers to SPIR-V, which seems the opposite of what is described here?
I don’t know much about MLIR, but probably builtin. You need poison for a lot of things, including as a placeholder for phi nodes, vector creation instructions (insertelement, shufflevector, etc in LLVM). freeze is needed for optimizations that hoist stuff outside of loops and/or branches and for bit-fields. So it may not be useful for everyone (or maybe no one right now).
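For example, in LLVM IR those two use cases look like this (illustrative only):

```llvm
; 1. poison as a placeholder when building a vector element-by-element:
%v0 = insertelement <2 x float> poison, float %x, i32 0
%v1 = insertelement <2 x float> %v0, float %y, i32 1

; 2. freeze to make a possibly-poison value safe to hoist out of the
;    branch that guarded it (e.g. when hoisting out of a loop):
%f = freeze i32 %maybe_poison
%c = icmp slt i32 %f, 42
```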
Ah, that’s ok as well, as poison can be lowered into undef.
The other thing on my list is the removal of the ceildiv sorts of operators. These were added to arith as part of lowering from affine, and they are extremely expensive to lower, and thus have a completely different cost model from everything else in the arith dialect. IMO, these should either be moved to the affine dialect, or affine lowering should be changed to generate more primitive arith ops.
I will say that there are theoretical reasons why it could make sense to have these operators somewhere in the system (e.g. range analysis could be used to lower them to cheaper operations) but none of that is implemented in tree AFAIK. Having these in arith is like adding einsum to HLO. I don’t think we would ever have accepted these into arith if they were proposed after arith was established.
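To illustrate the cost asymmetry: a faithful expansion of signed ceildiv into primitive ops needs a truncating division, a remainder, a sign comparison, and a select, versus the single instruction most arith ops lower to. A Python sketch of one such expansion (illustrative, not the exact upstream lowering):

```python
def trunc_div(a, b):
    """C-style truncating division (what arith.divsi / LLVM sdiv do)."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def ceildiv_expanded(a, b):
    """Signed ceildiv built only from primitive ops:
    div, mul, sub, compare, select."""
    q = trunc_div(a, b)
    r = a - q * b                      # truncating remainder
    same_sign = (a < 0) == (b < 0)
    # Round toward +inf only when division was inexact and signs match.
    return q + 1 if (r != 0 and same_sign) else q

assert ceildiv_expanded(7, 2) == 4
assert ceildiv_expanded(-7, 2) == -3
assert ceildiv_expanded(-7, -2) == 4
```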
Other random thing, why is arith in lib/Dialect/Arithmetic instead of lib/Dialect/Arith ?
I don’t quite understand what is “domain specific” about these, actually? They seem to be cross-domain and are actually generically defining “a model to manage UB in an optimizing compiler”, as far as I understand.
The way I see it is that we need a model to manage UB in the compiler, and upstream we need this for transformations to be able to reason about it and operate safely. Regardless of what model we adopt, it has to have some central / core support: this seems to me as fundamental as unrealized_cast, for example.
As was noted here, we could just shuttle these to the math dialect now, I think. That dialect shares the characteristics of multiple/complicated lowerings and cost models that are ascribed to these ops. Probably these just got sorted wrong in the original std dialect splitting.
I don’t see why any of this requires these things to be in the builtin dialect, though? Even if it was completely cross domain, we could have a dialect specialized around handling and reasoning about UB. I don’t see why builtin is a requirement for handling things in a “central/core” way. Feels like the same line of thought that led to the standard dialect.
Right. I suppose, framed another way, I’m not against a “blessed” path for handling these things. I’m more just picking on why the “builtin” dialect needs to be the place to house them. We can still have “core” things outside of it that we push as “the way to model/analyze/etc these things”.