There exists a small amount of code (currently maintained as an external patch) that translates in both directions between the Tosa reference model flatbuffer format and the corresponding dialect in MLIR. We would like to decide on the right place to put this tool and the form it should take, and then contribute it to the codebase.
As a specification, Tosa provides a reference model and regression test suite for conforming programs. This model is built explicitly to be as simple as possible, correct rather than performant, and not for production use (now or in the future). It exists solely as a vehicle for verifying conformance with the spec, of which MLIR’s Tosa dialect and lowerings (TBD) are an implementation. It has been implemented with a Flatbuffer serialization interface, which is versioned with the specification.
We would like there to exist a tool to translate between programs in the MLIR Tosa dialect and the corresponding reference model serialization. Ideally, from an implementor’s perspective, this tool would live in the MLIR repository under a path like `mlir/tools/tosa-serializer`. This would give us a convenient mechanism to:
- Import programs from the regression test suite for development and execution in the MLIR-based compiler (e.g. run with mlir-cpu-runner).
- Export Tosa-dialect programs generated through other means (e.g. via the TFLite->Tosa importer) for evaluation under the reference model. This would form the basis for broader regression test tooling between systems (not maintained in MLIR).
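As a rough illustration of these two directions, a command-line front end for such a tool might be shaped like the sketch below. The subcommand names, the flags, and the `.tosa` file extension are all assumptions made for illustration; none of them are taken from the existing patch, and the actual translation logic is deliberately left out.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI with one subcommand per translation direction."""
    parser = argparse.ArgumentParser(prog="mlir-tosa-serializer")
    sub = parser.add_subparsers(dest="direction", required=True)

    # Flatbuffer (reference model serialization) -> MLIR Tosa dialect.
    imp = sub.add_parser("import", help="flatbuffer to MLIR Tosa dialect")
    imp.add_argument("flatbuffer", help="serialized reference-model program")
    imp.add_argument("-o", "--output", default="-", help="MLIR output, '-' for stdout")

    # MLIR Tosa dialect -> flatbuffer for evaluation under the reference model.
    exp = sub.add_parser("export", help="MLIR Tosa dialect to flatbuffer")
    exp.add_argument("mlir", help="MLIR file containing Tosa ops")
    exp.add_argument("-o", "--output", required=True, help="flatbuffer output path")
    return parser


# Parse an explicit argument list so the sketch runs without real input files.
args = build_parser().parse_args(["export", "model.mlir", "-o", "model.tosa"])
print(args.direction, args.mlir, args.output)
```

Keeping both directions behind one entry point would mean a round trip (export followed by import) can be exercised from a single lit test, which fits the testing-only scope described here.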
This is not production code and should not become production code or a general-purpose serialization mechanism for transporting program fragments for non-testing purposes. The reference model itself has been written to favor clarity and simplicity over performance, and this tooling should fall into the same category.
Ideally the specification and related reference model continue to treat LLVM/MLIR as a Tosa implementation and keep it at arm’s length. While such a serialization tool could go in the reference_model repository, it would invert the relationship, making the reference model have a dependency on the implementation that is meant to conform.
Practically as well, while the spec is versioned and expected to evolve in an explicit fashion over time, the corresponding MLIR dialect is not (by design). As MLIR evolves and IR constructs change, how Tosa is represented in MLIR will change in the details. As such, it presents a moving target, and it makes the most sense to co-locate the serialization tool with MLIR and version them together.
Presently, the tool is C++ code which uses MLIR’s C++ API to read a module/function containing Tosa ops and writes out a corresponding flatbuffer by using generated flatbuffer serialization code (plus the inverse). While it would be nice to not have to rewrite this, the implementation decisions here are not load-bearing. Specifically, as a testing tool, this should bias towards simplicity (both of the code and of the integration), and a direct coupling of C++ APIs may be more engineering than we strictly want to maintain.
1. Adapt the existing C++ code and land it into `mlir/tools/tosa-serializer` as an optional tool. It would use `find_library` to take an optional dep on the `flatbuffers` C++ library. As part of this option, we would likely include a snapshot of `tosa.fbs` or the generated code in the tools directory itself (vs taking a cross-repo dependency on the reference model).
2. From the `tosa-serializer` tool, emit some lighter weight variant of the flatbuffer representation. After a fashion, flatbuffers does interop with JSON, as an example (albeit in a way that is not the most approachable).
3. Rewrite the tool in Python, introducing no further hard dependencies. In this approach, the reference `tosa.fbs` build upstream would be extended to also emit generated Python sources. We would snapshot the resulting `tosa_generated.py` Python module into `mlir/tools/tosa-serializer` and then implement `mlir-tosa-serializer.py` itself to just import this and use a `pip install flatbuffers` available on the host system. The tool would use the MLIR Python API for building and reading the Tosa-containing MLIR modules. For testing the tool itself, the corresponding lit directory could be enabled only if the Python `flatbuffers` module was available.
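For the last point, one way the availability check could be written is sketched below using only the Python standard library. The helper name `has_python_module` is hypothetical. In a real `lit.local.cfg` the result would typically be used to mark the directory unsupported; that part is shown only as a comment because the `config` object exists only when lit itself is running.

```python
import importlib.util


def has_python_module(name: str) -> bool:
    """Return True if the top-level module `name` can be located, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Treat any lookup failure as "not available".
        return False


flatbuffers_available = has_python_module("flatbuffers")

# In a lit.local.cfg, the result would gate the test directory, roughly:
#     if not flatbuffers_available:
#         config.unsupported = True
print("flatbuffers available:", flatbuffers_available)
```

Checking for the module at lit configuration time, rather than at build time, would keep the Python option free of any new build-system dependency.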
Of these options, based on my experience, #2 has a lot of negatives and should be eliminated. #1 preserves the most existing code, but at the expense of cross-project C++ build/dep complexity. #3 would involve writing something new but would introduce the minimum coupling, in my opinion (and also has the side benefit of it “never” being confused with production code).
For completeness, there is also a possible #4: Standardize a binary serialization format and stable API for MLIR and get it to a point of maturity such that we would feel fine having the reference model introduce a dependency on LLVM for it. While I would generally love such a thing to exist, I think the level of production engineering involved is misaligned with the goals here. Even if such a thing existed, it would invert the relationship of the projects, and I would bias towards the simplest possible thing for a testing case like this.
Opinions? We would like to make progress on this first thing in the new year.