Bytecode dialect versioning today is built around the premise that MLIR bytecode is used as a long-term storage format: specifically, that the producing process knows which version of the dialect to target and is able to downgrade the dialect prior to serialization. If it cannot, this is an error at serialization time.
This poses a problem if bytecode is used as an exchange format between distributed compilers with different versions of the dialect. If a given module references an operation that the receiving process doesn’t know about, or utilizes a newer version of the op that the receiving process doesn’t have support for, the receiving process will fail to parse the bytecode in its entirety, regardless of whether or not the failing operation is relevant to the receiving process.
This proposal adds a fallback mechanism for dialects to construct an operation that maintains the semantics of the unknown operation, while supporting roundtrip bytecode serialization in a bitwise exact manner. Specifically, the flow changes to the following:
- Attempt to deserialize bytecode using the standard dialect/op interface mechanisms.
- If an op cannot be parsed, attempt to parse it again using the dialect fallback mechanism. This uses a sideband path to register the original operation’s name with the fallback op, and requires that properties are encoded using a preset dialect-wide properties encoding scheme. The fallback op has flexible semantics to allow it to represent any op in the dialect.
- When round-tripping back to bytecode, the numbering scheme uses the sideband path to look up the original operation’s name, and uses that name in place of the fallback op’s name. Properties are serialized according to the dialect’s standard encoding scheme.
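The three steps above can be sketched with a toy model. This is only an illustration of the control flow, not the patch's implementation: `ToyDialect`, `EncodedOp`, `Op`, and every member name below are hypothetical stand-ins and are not MLIR APIs. The `originalName` field plays the role of the sideband path, and the properties bytes stand for the preset dialect-wide encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; none of these are MLIR types.
struct EncodedOp {
  std::string name;                // op name as written in the bytecode
  std::vector<uint8_t> properties; // dialect-wide properties encoding
};

struct Op {
  std::string name;                        // name of the in-memory op
  std::optional<std::string> originalName; // sideband: set only for fallback ops
  std::vector<uint8_t> properties;
};

struct ToyDialect {
  std::set<std::string> knownOps; // ops this process can parse natively
  std::string fallbackOpName;     // hypothetical catch-all op

  // Step 1: standard path. Fails for ops this process does not know.
  std::optional<Op> parseStandard(const EncodedOp &enc) const {
    if (knownOps.count(enc.name) == 0)
      return std::nullopt;
    return Op{enc.name, std::nullopt, enc.properties};
  }

  // Step 2: fallback path. Records the original name in the sideband field.
  Op parseFallback(const EncodedOp &enc) const {
    // The fallback is only meant for ops the standard path rejected.
    assert(knownOps.count(enc.name) == 0);
    return Op{fallbackOpName, enc.name, enc.properties};
  }

  Op read(const EncodedOp &enc) const {
    if (std::optional<Op> op = parseStandard(enc))
      return *op;
    return parseFallback(enc);
  }

  // Step 3: on write, emit the original name instead of the fallback op's name.
  EncodedOp write(const Op &op) const {
    return EncodedOp{op.originalName.value_or(op.name), op.properties};
  }
};
```

In this model, reading an unknown op and writing it back yields the same name and the same properties bytes, which is the round-trip property the proposal is after.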
Existing options considered:
- Unregistered operations: this does not work if the operation is registered on the sending side, but not on the receiving side. Unregistered operations require that properties are serialized as attributes, which breaks down if the sending side uses custom properties encoding for versioning. The registration state is also serialized into the bytecode, so it will be incorrect on the receiving side.
- Passing through the properties section as-is: the original version of this patch envisioned a fallback interface where the properties blob was read directly from the bytecode and then serialized back as-is during the round trip, eliminating the need for a standard property encoding scheme between dialect ops. Unfortunately, the numbering scheme in the round-tripped bytecode doesn’t match up with the numbers encoded in the attribute and type sections, because we’re not able to number the attributes and types stored in the opaque properties blob.
- Downgrade on serialization: this adds significant overhead. It requires negotiating the minimum supported dialect version among the receiving processes, and the semantics of the program may have to change significantly to accommodate that version. Furthermore, it does not solve the problem if the module cannot be downgraded. Alternatively, the serialization process could convert newer ops to opaque fallback ops at serialization time, but that would still require determining the dialect version of the receiving process.
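To make the numbering problem in the second option (passing the properties section through as-is) concrete, here is a minimal sketch under simplifying assumptions: a bytecode file is modeled as an attribute table plus a blob that refers to attributes by table index. `ToyBytecode`, `reserializeOpaque`, and `blobRefsStillValid` are hypothetical names used for illustration and do not correspond to MLIR's bytecode writer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: an attribute table plus a properties blob that refers to
// attributes by their index in that table.
struct ToyBytecode {
  std::vector<std::string> attrTable;
  std::vector<std::size_t> blobAttrRefs; // indices baked into the opaque blob
};

// The re-serializer rebuilds the table from the attributes it can see and
// copies the blob through without rewriting the indices inside it.
ToyBytecode reserializeOpaque(const ToyBytecode &in,
                              const std::vector<std::string> &visibleAttrs) {
  return ToyBytecode{visibleAttrs, in.blobAttrRefs};
}

// True if every index in the blob still names the same attribute as before.
bool blobRefsStillValid(const ToyBytecode &before, const ToyBytecode &after) {
  for (std::size_t ref : before.blobAttrRefs) {
    assert(ref < before.attrTable.size()); // the input file is well-formed
    if (ref >= after.attrTable.size() ||
        after.attrTable[ref] != before.attrTable[ref])
      return false;
  }
  return true;
}
```

If the writer happens to rebuild an identical table, the blob's indices stay valid; as soon as the table is renumbered or loses entries that are referenced only from inside the blob, the indices point at the wrong attribute or past the end of the table.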