Under the current framework design, a dialect can have multiple implementations with different semantics or layouts without any problem. For example, we can define ArithmDialectInterp1 and ArithmDialectInterp2, and whoever integrates the dialect interpreters into a final interpreter executable can choose the implementation they want. If neither implementation fits their needs, they can build a third one, say ArithmDialectInterp3. A de facto representation only emerges if we decide to provide an interpreter executable in the MLIR tree and pick a particular implementation for that binary. However, an interpreter executable is not what I am proposing, either here or in the prototype patch.
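To make the "integrator picks the implementation" point concrete, here is a minimal sketch. It is written in Python for brevity and is not the API of the prototype patch: the `Interpreter` class, the `register`/`run` methods, `build_interpreter`, and the 8-bit wrapping vs. saturating semantics are all assumptions made up for illustration. Only the names ArithmDialectInterp1 and ArithmDialectInterp2 come from the text above.

```python
# Hypothetical sketch: none of these classes or methods exist in MLIR
# or in the prototype patch. The two semantics are invented examples.

class ArithmDialectInterp1:
    """Interprets arithm.add as 8-bit wrapping addition (assumed semantics)."""

    def run(self, op_name, operands):
        if op_name == "arithm.add":
            return (operands[0] + operands[1]) % 256
        raise NotImplementedError(op_name)


class ArithmDialectInterp2:
    """Interprets arithm.add as 8-bit saturating addition (assumed semantics)."""

    def run(self, op_name, operands):
        if op_name == "arithm.add":
            return min(operands[0] + operands[1], 255)
        raise NotImplementedError(op_name)


class Interpreter:
    """Assembled by the integrator from the dialect interpreters they choose."""

    def __init__(self):
        self._dialects = {}

    def register(self, dialect_name, impl):
        # One implementation per dialect; the integrator decides which one.
        self._dialects[dialect_name] = impl

    def run(self, op_name, operands):
        # The dialect name is the part of the op name before the first dot.
        dialect_name = op_name.split(".", 1)[0]
        if dialect_name not in self._dialects:
            raise KeyError(f"no interpreter registered for dialect '{dialect_name}'")
        return self._dialects[dialect_name].run(op_name, operands)


def build_interpreter(arithm_impl):
    """Integrator-side helper: link in the chosen arithm implementation."""
    interp = Interpreter()
    interp.register("arithm", arithm_impl)
    return interp
```

In this sketch, two integrators can build different executables from the same pieces, and a dialect that registers nothing is simply absent from the interpreter.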
Even if a de facto representation forms naturally because a particular interpreter executable becomes popular and all new dialects want to be linked into it, interoperability can still be addressed with what I described earlier.
There is one case I haven’t handled yet: the same op can have different semantics depending on the region it is placed in. For example:
%x = "arithm.add"(%arg0, %arg1)
%dev_ret = "some_dialect.run_on_accelerator"(%arg3, %arg4) {
  ^b0(%arg0, %arg1):
    %t = "arithm.add"(%arg0, %arg1)
    return %t
}
In any case, this can be handled in the interpreter. If this is a concern, I can implement it.
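One way this could be handled, sketched in Python under assumptions of my own: the interpreter looks up the handler by the pair (enclosing op, op name) and falls back to a default when no enclosing op overrides it. The `HANDLERS` table, `lookup`, and both handler functions are hypothetical, and the half-precision rounding is only a stand-in for "different semantics on the accelerator".

```python
import struct

# Hypothetical sketch: handler names, the lookup scheme, and the
# half-precision behaviour are assumptions for illustration only.

def add_host(a, b):
    """Default semantics: plain double-precision addition."""
    return a + b


def add_accelerator(a, b):
    """Assumed accelerator semantics: the result is rounded to IEEE half precision."""
    return struct.unpack("e", struct.pack("e", a + b))[0]


# Keyed on (name of the enclosing op, op name); None means "no override".
HANDLERS = {
    (None, "arithm.add"): add_host,
    ("some_dialect.run_on_accelerator", "arithm.add"): add_accelerator,
}


def lookup(op_name, enclosing_ops):
    """Pick the handler for op_name.

    enclosing_ops lists the ops whose regions contain this op,
    innermost first. The first enclosing op with an override wins.
    """
    for parent in enclosing_ops:
        handler = HANDLERS.get((parent, op_name))
        if handler is not None:
            return handler
    return HANDLERS[(None, op_name)]
```

With this scheme the top-level `arithm.add` in the example resolves to `add_host`, while the one inside the `some_dialect.run_on_accelerator` region resolves to `add_accelerator`.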
Your example here sounds more like a few ops that should only be registered with the interpreter in their private repo, rather than a need for a general foreign function interface.
This is not how it is designed currently. If a dialect doesn’t need or want an interpreter, it can simply ignore it. So I don’t agree with your statement that the maintenance cost will be spread across all dialects.
I agree that there will be a maintenance cost for the dialects if we develop the reference implementation in-tree as well (e.g. the APIs need to be updated when operands or attributes change), and that this cost is multiplied if there are multiple implementations. That does need to be justified.
Sort of. The main use case (for myself) is to run the ops with the given input operands/attributes and compare the outputs with those generated by the compiled executable. The second use case is to compare the outputs of the ops before and after a transformation, to check whether they are still the same or whether the differences are small enough (e.g. float truncation, quantization, or any other emulation/approximation where differences are expected).
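Both use cases boil down to the same check, which might look roughly like the sketch below. It is written in Python for brevity; `outputs_match` and its parameters are hypothetical names, and flat lists of floats stand in for whatever tensor representation is actually used.

```python
import math

# Hypothetical helper: the name and signature are assumptions, not part of
# the prototype. Outputs are modelled as flat lists of floats.

def outputs_match(reference, actual, rel_tol=0.0, abs_tol=0.0):
    """Return True if every element of actual is within tolerance of reference.

    With the default tolerances of 0.0 this is an exact comparison, which
    suits the interpreter-vs-compiled-executable case. Non-zero tolerances
    suit transformations that are expected to change the values slightly.
    """
    if len(reference) != len(actual):
        return False
    return all(
        math.isclose(r, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, a in zip(reference, actual)
    )
```

For example, `outputs_match(interp_out, compiled_out)` would cover the first use case, and `outputs_match(before, after, rel_tol=1e-3)` the second, with the tolerance chosen per transformation.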