Although LLVM IR can now be compiled to JavaScript, it is difficult to call the DOM and JS libraries from C. Additionally, compiler developers could easily write static analyzers on top of an mlir-wasm dialect. The two languages need to be designed as separate dialects because Wasm cannot operate on the DOM yet; native JS runs on the Node.js interpreter, while native WebAssembly runs on wasmtime.
js dialect example workflow: ONNX model/C/DSL → MLIR xx-dialect → MLIR js-dialect → JS (runs in the browser, with DOM access)
webassembly dialect example workflow: ONNX model/C/DSL → MLIR xx-dialect → MLIR wasm-dialect → Wasm (runs in the browser at high speed)
I am a bit confused by this description about what these dialects would look like (type system and operation set) and how they would fit into the ecosystem. Maybe you could set up a presentation for an Open Meeting?
Do you happen to have a concrete proposal or implementation? While MLIR can indeed model a great many things, I am somewhat skeptical that the span of use cases you describe can be satisfied easily with a couple of dialects. Such things are usually better seen or prototyped.
If it were possible to define a wasm dialect in a similar way to SPIR-V (i.e., a faithful representation focused on serialization), that could be interesting (but it’s also quite far from the full set of use cases you define).
Similarly, a js dialect focused on modeling the language could be interesting and as you note, the applicability to static analysis and such seems high. I think @jpienaar was working with some folks on something like this.
But this is quite different from an emitjs-style dialect focused on code generation. For that, I think we would want to review how the work on emitc has gone and evaluate whether the approach needs adjustment.
These are just some of the questions we would want to talk through up front before beginning in-tree development of such dialects. It also seems possible to me that this could represent a sizable scope increase for MLIR – not just a handful of loose dialects. Since most of these dialects represent entirely new use cases, it might be more appropriate to explore them in the context of an incubator project and look to promote things to the main project as they solidify. See: LLVM Developer Policy — LLVM 16.0.0git documentation
In general, the bar for starting new things in the main project is intentionally high. I have observed that concrete, focused contributions with high implementation quality and a supporting community resonate more than descriptions of new directions without code, or open-ended mini-projects. I’ll also note that this topic (adding new dialects) has become somewhat charged of late, and I believe there is some appetite for discussing it at the upcoming dev meeting, as it has produced several conflicting opinions about project direction and policies. The above represents my opinion, and I know there are others. For myself, I advocate more strongly for the inclusion of components which have been built and matured to the point where they can be concretely evaluated.
I can elaborate first on how IREE is using MLIR to run machine learning programs on the web, using WebAssembly and WebGPU. While our work in IREE in this area is still early/experimental, we’re pretty confident in the general architecture. For WebAssembly, we lower through TF/JAX/PyTorch/etc. -> Linalg dialect -> LLVM dialect -> LLVM -> WebAssembly. For WebGPU, we lower through TF/JAX/PyTorch/etc. -> Linalg dialect -> SPIR-V dialect -> SPIR-V -> WGSL. In both cases we use a runtime written in C and compiled to WebAssembly via Emscripten to actually run the executable code in those compiled programs (WebAssembly modules, WGSL shaders). We’re using MLIR as a way to compile the dense math in our input machine learning programs down to Wasm/WGSL and then leaving the interfacing with browser APIs (and the DOM) up to the C/JS runtime.
Now, as for a WebAssembly dialect, I’d want to understand how such a dialect would fit in with the WebAssembly backend in LLVM (from both a technical and a community perspective). Maybe there’s a loss of information and missed optimization opportunities in Linalg dialect -> LLVM dialect -> LLVM -> WebAssembly that we could get back by going directly from Linalg dialect -> WebAssembly dialect -> WebAssembly (substitute other dialects for Linalg as needed), or maybe the WebAssembly backend in LLVM would eventually want to use parts of the MLIR infrastructure. In either case, I’d want to hear from developers who work on the LLVM WebAssembly backend and/or see more concrete prototypes/proposals before committing to work in upstream MLIR.