I think we can answer this question, at least partially, if we clarify the goal of the LLVM IR dialect itself. If it intends to model LLVM IR inside MLIR as truthfully as possible, then it should not accept anything but LLVM IR dialect types and should have strictly the same semantics. If it is more of an abstraction that is similar to LLVM IR but intended only as a convenient lowering vehicle, we can allow for more flexibility.
If we go with the second option, it may be considered at odds with MLIR’s design philosophy, which encourages all non-trivial processing to happen within MLIR so that translations can be trivial. This can be solved by having the translation accept only a trivially translatable subset of the LLVM IR dialect. For example, it could be fine for the LLVM IR dialect to work on a subset of standard types as long as the translator doesn’t need to care about this type conversion. In practice, we already have a “legalize for translation” pass that can accommodate such preparatory conversions. Today it inserts extra blocks if a predecessor of some block appears more than once, since this cannot be expressed with PHI nodes otherwise.
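To make the block-insertion step concrete, here is a minimal sketch of the idea on a toy control-flow graph. It is not the actual pass: the dictionary-based CFG, the function name `split_duplicate_edges`, and the naming scheme for the forwarding blocks are all assumptions made for illustration. The underlying issue is that a PHI node selects its value by predecessor block, so two edges from the same predecessor that carry different values cannot be told apart.

```python
from collections import Counter


def split_duplicate_edges(cfg):
    """Return a copy of `cfg` in which no block lists the same successor twice.

    `cfg` maps a block name to the ordered list of its terminator's
    successors (a hypothetical toy representation, not an MLIR API).
    Every repeated edge is routed through a fresh forwarding block.
    """
    result = {}
    for block, successors in cfg.items():
        seen = Counter()
        new_successors = []
        for succ in successors:
            seen[succ] += 1
            if seen[succ] == 1:
                # First edge to this successor stays as it is.
                new_successors.append(succ)
            else:
                # Repeated edge: jump to a new block that only forwards.
                # The name is illustrative and assumed not to clash.
                dummy = f"{block}_to_{succ}_{seen[succ] - 1}"
                result[dummy] = [succ]
                new_successors.append(dummy)
        result[block] = new_successors
    return result


# A conditional branch whose two targets are the same block.
cfg = {"entry": ["merge", "merge"], "merge": []}
legal = split_duplicate_edges(cfg)
print(legal["entry"])
```

After the rewrite, `merge` has two distinct predecessors (`entry` and the forwarding block), so each incoming value can get its own PHI entry.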
I have been consistently opposed to having the LLVM IR dialect diverge from LLVM IR. My main reason is to keep the same semantics as LLVM IR as much as possible rather than redefine them in MLIR. Otherwise, we would have to track semantic changes both in LLVM IR (including the intrinsics exposed in the dialect) and in the built-in types, and check or update the dialect every time either of them changes. Imagine if MLIR decided to remove signless integer semantics. Or, more realistically, how should we treat `add nsw nuw %0, %1 : index` if we (a) don’t know the size of the `index` type and (b) don’t have poison semantics for it in MLIR? That being said, I am not fundamentally opposed to such a relaxation, as I expect both the built-in types and LLVM IR to be stable enough, as long as there is some way of checking that the semantics are still okay that doesn’t involve the dialect maintainer checking it over and over again.
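A small worked example of point (a), as a sketch only: the helper `add_nsw` below is a hypothetical model of a signed no-wrap addition at a fixed bit width, using `None` as a stand-in for poison. It is not an MLIR or LLVM API, and it assumes both operands already fit in the given width.

```python
def add_nsw(lhs, rhs, width):
    """Model `add nsw` at `width` bits; None stands in for poison."""
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    total = lhs + rhs
    # Signed overflow makes the result poison instead of wrapping.
    return total if lo <= total <= hi else None


INT32_MAX = 2**31 - 1
as_32_bit = add_nsw(INT32_MAX, 1, 32)  # overflows a 32-bit index
as_64_bit = add_nsw(INT32_MAX, 1, 64)  # fits in a 64-bit index
print(as_32_bit, as_64_bit)
```

The same operands give a poison result or an ordinary value depending only on the width chosen for `index`, which is why the flags cannot be given a meaning before that width is fixed.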
In the meantime, since you mention development velocity, I would consider Jacques’ idea of having casts, which we already have as `llvm.mlir.cast`, and/or a type-cast-region operation, and see what additional challenges this poses. Today, the cast operation lives in the LLVM IR dialect. Arguably, it’s okay for it to depend on built-in types (“standard” types are not part of the standard dialect). It would not be okay for it to depend on the standard-to-llvm conversion, but that can be solved by factoring out the type conversion (which has nothing to do with the standard dialect) into a separate library. I can expect dependency issues in, e.g., canonicalization if we need to handle `llvm.vscale` wrapped by `llvm.mlir.cast`s in the SVE dialect canonicalization pattern, which would create a dependency from the SVE dialect to the LLVM IR dialect. This sounds like a problem that is not specific to the LLVM IR dialect but to the infrastructure in general. Are there other potential problems?