[RFC] MLIR web-related dialect proposal

This proposal is to set up a JavaScript dialect and a WebAssembly dialect. Both deep learning models and general-purpose programs need to run in the browser. Just as there is an emitc dialect today, there is a strong need for an emitjs dialect and a wasm dialect so that MLIR applications can run in browsers. Programmers could translate other dialects into JavaScript to manipulate the DOM tree. In the future, they could also translate their own dialects into WebAssembly.

Although LLVM-based toolchains can now produce JS code, it is difficult to call the DOM and JS libraries from C. Additionally, compiler developers could easily write static analyzers using mlir-wasm. The two dialects need to be designed separately because wasm can't operate on the DOM yet. Standalone JS runs on the Node.js runtime, and standalone WebAssembly runs on wasmtime.

JS dialect example workflow: ONNX model/C/DSL → MLIR xx-Dialect → MLIR js-Dialect → JS (runs in the browser with DOM access)

WebAssembly dialect example workflow: ONNX model/C/DSL → MLIR xx-Dialect → MLIR wasm-Dialect → wasm (runs in the browser at high speed)

I am a bit confused by this description about what these dialects will look like (type system and operation set) and how they will fit into the ecosystem. Maybe you could set up a presentation for an Open Meeting?

This is a good suggestion. A clear interface is necessary. As a next step, we will extract an expressive type system from these languages.

Note that there has also been progress on MLIR → wasm from the IREE side, IINM. @scotttodd @stellaraccident

It seems that wasm support in IREE relies on the LLVM backend, so wasm itself cannot be represented directly in MLIR.

Do you happen to have a concrete proposal or implementation? While MLIR can indeed model a great many things, I am somewhat skeptical that the span of use cases defined can be satisfied easily with a couple of dialects. Such things are usually better seen or prototyped.

If it were possible to define a wasm dialect in a similar way to SPIR-V (i.e., a faithful representation focused on serialization), that could be interesting (but it’s also quite far from the full use cases you define).
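
To make "a faithful representation focused on serialization" slightly more concrete, here is a minimal Python sketch. It is not MLIR code and not an existing API: `WasmFunc`, `uleb128`, `vec`, `section`, and `serialize_module` are hypothetical names invented for illustration. `WasmFunc` stands in for what a function op in such a dialect might carry, and the only part taken from a real specification is the byte layout of the WebAssembly binary format.

```python
from dataclasses import dataclass
from typing import List

I32 = 0x7F  # value-type code for i32 in the wasm binary format


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def vec(items: List[bytes]) -> bytes:
    """A wasm vector: element count followed by the encoded elements."""
    return uleb128(len(items)) + b"".join(items)


def section(section_id: int, payload: bytes) -> bytes:
    """A wasm section: id byte, payload size, payload."""
    return bytes([section_id]) + uleb128(len(payload)) + payload


@dataclass
class WasmFunc:
    """Hypothetical stand-in for a function op in a wasm dialect."""
    name: str
    params: List[int]
    results: List[int]
    body: bytes  # raw instruction bytes, without the trailing `end`


def serialize_module(funcs: List[WasmFunc]) -> bytes:
    # One type entry per function (no deduplication, to keep the sketch short).
    types = [
        b"\x60" + vec([bytes([p]) for p in f.params])
        + vec([bytes([r]) for r in f.results])
        for f in funcs
    ]
    decls = [uleb128(i) for i in range(len(funcs))]
    exports = []
    codes = []
    for i, f in enumerate(funcs):
        name = f.name.encode("utf-8")
        # Export kind 0x00 means "function".
        exports.append(uleb128(len(name)) + name + b"\x00" + uleb128(i))
        code = uleb128(0) + f.body + b"\x0b"  # no locals, body, `end`
        codes.append(uleb128(len(code)) + code)
    return (
        b"\x00asm" + b"\x01\x00\x00\x00"  # magic and version 1
        + section(1, vec(types))    # type section
        + section(3, vec(decls))    # function section
        + section(7, vec(exports))  # export section
        + section(10, vec(codes))   # code section
    )


# local.get 0; local.get 1; i32.add
add = WasmFunc("add", [I32, I32], [I32], bytes([0x20, 0x00, 0x20, 0x01, 0x6A]))
module_bytes = serialize_module([add])
print(module_bytes.hex())
```

The resulting bytes form a module that exports a single `add` function. The point of the sketch is that serialization can be a direct, lossless walk over the in-memory form, which is the property that makes the SPIR-V comparison attractive.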

Similarly, a JS dialect focused on modeling the language could be interesting and, as you note, the applicability to static analysis and such seems high. I think @jpienaar was working with some folks on something like this.

But this is quite different from an emitjs-style dialect focused on code generation. For that, I think we would want to review how the work on emitc has gone and evaluate whether the approach needs adjustment.
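
As a rough illustration of what "focused on code generation" means here, the Python sketch below walks a tiny list of ops and prints JavaScript text, loosely in the spirit of how emitc ops are translated to C source. Everything in it is hypothetical: the `Const`, `Call`, and `Return` records and the `emit_js_function` helper are invented for this example and do not correspond to any existing MLIR dialect or API.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Const:
    result: str
    value: Union[int, float]


@dataclass
class Call:
    result: str
    callee: str  # opaque callee name, emitted verbatim
    args: List[str]


@dataclass
class Return:
    value: str


def emit_js_function(name: str, params: List[str], ops: list) -> str:
    """Translate a flat op list into the text of one JavaScript function."""
    lines = [f"function {name}({', '.join(params)}) {{"]
    for op in ops:
        if isinstance(op, Const):
            lines.append(f"  const {op.result} = {op.value};")
        elif isinstance(op, Call):
            lines.append(f"  const {op.result} = {op.callee}({', '.join(op.args)});")
        elif isinstance(op, Return):
            lines.append(f"  return {op.value};")
        else:
            raise TypeError(f"unsupported op: {op!r}")
    lines.append("}")
    return "\n".join(lines)


js_source = emit_js_function(
    "scaled_max",
    ["a", "b"],
    [
        Const("v0", 2),
        Call("v1", "Math.max", ["a", "b"]),
        Call("v2", "Math.imul", ["v1", "v0"]),
        Return("v2"),
    ],
)
print(js_source)
```

Note that the emitter knows nothing about JavaScript semantics: callees are opaque strings and values are untyped names. A dialect that models the language for analysis or round-tripping would need scoping, types, and control flow in the IR itself, which is why the two directions deserve separate evaluation.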

These are just some of the questions we would want to talk through up front before beginning in-tree development of such dialects. It also seems possible to me that this could represent a sizable scope increase to MLIR, not just a handful of loose dialects. Since most of these dialects represent entirely new use cases, it might be more appropriate to explore them in the context of an incubator project and look to promote things to the main project as they solidify. See: LLVM Developer Policy (LLVM 16.0.0git documentation).

In general, the bar for starting new things in the main project is intentionally high. I have observed that concrete, focused things with a high implementation quality and a supporting community resonate more than described new directions without code or open-ended mini projects. I’ll also note that this topic (adding new dialects) has become somewhat charged of late, and I believe there is some appetite for discussing it at the upcoming dev meeting, as it has produced several conflicting opinions about project direction and policies. The above represents my opinion, and I know there are others. For myself, I advocate more strongly for inclusion of components which have been built and matured to the point where they can be concretely evaluated.

Indeed, it is not yet fully open source, but they have made a lot of good progress. One of those efforts round-trips (JS → MLIR* → JS), and there are different considerations when doing so versus focusing only on analysis.

And also +1 to Stella’s other two suggestions here.

I think there is an ODM scheduled for this week so perhaps in a week or two (depending on availability).

I can elaborate first on how IREE is using MLIR to run machine learning programs on the web, using WebAssembly and WebGPU. While our work in IREE in this area is still early/experimental, we’re pretty confident in the general architecture. For WebAssembly, we lower through TF/JAX/PyTorch/etc. -> Linalg dialect -> LLVM dialect -> LLVM -> WebAssembly. For WebGPU, we lower through TF/JAX/PyTorch/etc. -> Linalg dialect -> SPIR-V dialect -> SPIR-V -> WGSL. In both cases we use a runtime written in C and compiled to WebAssembly via Emscripten to actually run the executable code in those compiled programs (WebAssembly modules, WGSL shaders). We’re using MLIR as a way to compile the dense math in our input machine learning programs down to Wasm/WGSL and then leaving the interfacing with browser APIs (and the DOM) up to the C/JS runtime.

For these proposals, I agree with the general feedback above about grounding the ideas more concretely with prototypes or sample use cases. There are all sorts of ways that dialects based on programming languages could be used. I think the SPIR-V and EmitC dialects are good examples of dialects to look at for inspiration, but some of the ideas proposed (manipulating the DOM tree, converting from dialects into JavaScript) quickly increase the scope of such a project.

Now, as for a WebAssembly dialect, I’d want to understand how such a dialect would fit in with the WebAssembly backend in LLVM (from both a technical and community perspective). Maybe there’s a loss of information and missed optimization opportunities in Linalg dialect -> LLVM dialect -> LLVM -> WebAssembly that we could get back by going directly from Linalg dialect -> WebAssembly dialect -> WebAssembly (substitute other dialects for Linalg as needed there), or maybe the WebAssembly backend in LLVM would eventually want to use parts of MLIR infrastructure. In either case, I’d want to hear from developers who work on the LLVM WebAssembly backend and/or see some more concrete prototypes/proposals before committing to work in upstream MLIR.

A JavaScript dialect is even less clear to me. A specific case like emitjs (parallel to emitc) I could see fitting in upstream in some way. Anything more general than that spooks me … DOM manipulation, browser APIs, etc. all cover a very large and complex surface area.

Do you have more details or prototypes about the wasm dialect?