[RFC] A binary serialization format for MLIR

My opinion on this (for ML) is that both are appropriate and a real system (built on top of the IR/serialization) needs to accommodate both inline and out of line constants and initialization data. I’ve watched entire generations of tooling and systems come into existence where the only real difference is that they decided this point differently – and then forked the world.

There are valid cases for both – often within the lifecycle of the same chunk of code being compiled. Therefore I support the serialization and in memory representations solving for the inline case well. The higher level can always choose to implement more of a “linker” modeling where they are out of line, and I expect that we will pretty immediately want to look at that in some of the downstreams as a next step.

There are a number of suboptimal things about ElementsAttr (a few less with recent patches), and I’d like to spend some time on it or its successor to align it with the need you mention here. It is not really appropriate for the class of data today, which should really have a “never copy” annotation on it.

I think we’ve converged on not using fb for this work, but I’ll just mention that we’ve stopped using the flatbuffers library for a lot of the reasons you are pointing to. When we need to use flatbuffers, we use flatcc as a minimal thing. The original, in my opinion, has gone the way of protobuf: it is this big, complicated thing with all kinds of practical sharp edges on it. You end up having to organize your project around it. Usually when that happens, I step away and find a simpler alternative.

I don’t know all the use cases for a binary serialization for MLIR, but having spent most of the last decade begging people to not use LLVM bitcode as an interchange format, I do maybe have some perspective to share.

The first warning is that if MLIR provides a binary serialization, people will use it, and probably in ways it wasn’t intended to be used. The MLIR community could decide (as LLVM did) that those unsupported cases are unsupported no matter how numerous or prevalent they become. That path leads to a lot of (mostly) out-of-tree insanity to work with LLVM IR modules, some odd forks of old versions that live too long, and messy problems engineers like myself spend a lot of time thinking our way out of.

The second big piece of advice I’d give is: don’t bit-pack values. LLVM bitcode focuses on being small rather than fast, but years later it actually misses that target. In some experiments I’ve run starting with two roughly equivalent GPU kernels, one targeting DXIL (which is LLVM bitcode), the other targeting SPIR-V (which is semantically similar to LLVM IR, but not bit-packed), the SPIR-V representations are rarely more than 10-20% larger than LLVM bitcode (and in some cases are smaller). Better yet, SPIR-V compresses well with off-the-shelf compression algorithms, which makes SPIR-V much smaller in practice.

I have nightmares of working with protobuf and flatbuffers too.

The big question I’m curious about is whether or not the goal is to produce a binary serialization of IR data structures (which is more or less what bitcode is for LLVM modules, and is the problem protobuf, flatbuffers, and a million other libraries solve), or if the goal is to produce a binary file format where the expected user may not be reading it using the MLIR libraries. The latter is more interesting to me with MLIR than LLVM IR because it could make it easier to support opaque dialects or runtimes and tools that only understand part of an MLIR module.


This is an interesting question; funnily enough, I’ve been thinking of non-MLIR consumers too. MLIR has few built-in concepts, which will make the bytecode reasonably regular and so, I think, easier for such tools to interact with. But I can’t say this is in the expected category yet. I expect us to have some iterations here initially, so it may be good to have such users evaluate. Definitely, using MLIR libraries and having them work well is the goal, but it’s a good question as to utility. Do you have something specific in mind already? Some path where you’d prefer to use something external?

Indeed: this is the main drawback I found to the LLVM bitstream as well. Byte-aligned variable-size encoding schemes seem like a better trade-off, and there are many references for vectorizing the encoding/decoding; this presentation is a fairly recent one on the topic, I think: https://people.csail.mit.edu/jshun/6886-s19/lectures/lecture19-1.pdf

@jpienaar I don’t think I had ever considered the possibility of someone reading an LLVM bitcode module with anything other than LLVM… but they do. In fact DXIL’s LLVM bitcode is not only consumed by non-LLVM consumers, there are non-LLVM compilers that generate it.

Ha yes, we had folks asking about parsing .mlir files with custom generated parsers quite early, but it was definitely not something we had considered for the textual format.

Note that even for the binary format the challenge of “parsing” will remain around attributes and types which don’t have a generic representation.


That is a good point, especially for builtin.

Another alternative might be to not bother with a standalone binary format as such, and instead create a sort of pre-parsed binary format, with its sole purpose being quicker to read than the .mlir text.
A bit like pre-compiled headers in C.

For size considerations, I have noticed that LLVM .ll tends to compress better than LLVM .bc files; something similar might happen for the MLIR binary and text formats.

For speed when interacting with out-of-tree programs, maybe also consider a shared-memory interface for transferring MLIR data to other apps.
This would make the daemon approach to compiling more efficient.
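To make the shared-memory suggestion concrete, here is a hedged sketch of the handoff: one process writes a serialized blob into a named shared-memory segment, and another maps the same segment and reads it without copying through pipes or files. The segment name and payload are invented for illustration:

```python
# Sketch of a shared-memory handoff between a compiler daemon and a tool.
# The name "mlir_module_demo" and the payload bytes are made up; a real
# system would carry an actual serialized MLIR module here.
from multiprocessing import shared_memory

payload = b"<serialized MLIR module bytes would go here>"

# Producer side (e.g. the compiler daemon): create and fill the segment.
shm = shared_memory.SharedMemory(name="mlir_module_demo", create=True,
                                 size=len(payload))
shm.buf[:len(payload)] = payload

# Consumer side (e.g. an out-of-tree tool): attach by name and read in place.
view = shared_memory.SharedMemory(name="mlir_module_demo")
received = bytes(view.buf[:len(payload)])

view.close()
shm.close()
shm.unlink()  # free the segment once both sides are done
```

The consumer never re-parses text or copies through the filesystem, which is the efficiency win for a long-lived daemon.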

This is going to be pretty critical for our use cases. I’m hoping, though, that it is orthogonal to the actual format (i.e. so long as the format is self-contained relative to some base pointer, a separate mechanism can exist to map it across memory spaces).
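A minimal sketch of what “self-contained relative to some base pointer” could look like: every internal reference is an offset from byte 0 of the buffer rather than an absolute pointer, so the same bytes stay valid wherever, or in whichever process, they are mapped. The layout below (a counted table of offsets to length-prefixed strings) is invented for illustration, not MLIR’s actual format:

```python
# Position-independent layout sketch: a u32 count, then one u32 offset per
# entry (relative to the start of the buffer), then length-prefixed payloads.
# Because nothing stores an absolute address, the buffer can be memory-mapped
# at any base address or shared across processes unchanged.
import struct

def build(strings):
    header_size = 4 + 4 * len(strings)  # count + one u32 offset per entry
    body = bytearray()
    offsets = []
    for s in strings:
        offsets.append(header_size + len(body))  # offset from buffer start
        data = s.encode()
        body += struct.pack("<I", len(data)) + data
    return (struct.pack("<I", len(strings))
            + b"".join(struct.pack("<I", o) for o in offsets)
            + bytes(body))

def read(buf, index):
    count = struct.unpack_from("<I", buf, 0)[0]
    assert index < count
    off = struct.unpack_from("<I", buf, 4 + 4 * index)[0]
    length = struct.unpack_from("<I", buf, off)[0]
    return buf[off + 4 : off + 4 + length].decode()
```

A separate transport mechanism (file, pipe, shared memory) can then move the bytes around without any fix-up pass, which is the orthogonality being hoped for above.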
