Open MLIR Meeting 12/15/2022: Embedded reactive programming in MLIR

This Thursday, December 15th (9am California Time, 17:00 UTC; California is on winter time now, FYI!), @dpotop will present on Embedded reactive programming in MLIR.

Abstract:
Today’s machine learning (ML) frameworks (TensorFlow, PyTorch…) and compilers (MLIR, Glow…) allow the specification and efficient implementation of deep neural networks (DNNs) that have ever-increasing precision when applied to Big Data that is already stored in large databases. By contrast, in various applications, a clear need emerges for reactive ML applications that operate in a stateful fashion on streams of data, in time (in RNNs, attention-based approaches, reinforcement learning (RL), or even when scheduling convolutional networks in time to reduce memory requirements). Beyond this view focused on ML algorithmic design, the field of embedded ML, where ML components are placed in the feedback loop of real-time embedded control applications (along with data pre- and post-processing routines), is becoming both a reality and an industrial necessity. Existing ML frameworks and compilers cannot adequately handle the specification and the implementation of reactive aspects. The situation has reached a point where ML algorithmic innovation (which largely embraces approaches that process data in time, as in RL or in transformers) is largely distinct from embedded ML engineering, the latter focusing on simpler (often stateless) network architectures. Ad hoc remedies are proposed in the literature for specific and limited cases. For instance, the time dimension of RL applications is traversed using (interpreted) Python code unsuitable for an embedded implementation. Similarly, streaming applications (a sub-case of general-purpose reactiveness) are handled through modifications to the Python front-end.

To bridge between ML algorithmic research and reactive embedded implementation, instead of the existing ad hoc workarounds, we propose the direct integration of general-purpose reactiveness into the specification formalisms and compilers of ML frameworks. We have integrated low-level (imperative) and high-level (dataflow) synchronous reactive programming into MLIR. We first recall commonalities between dataflow synchronous languages and the static single assignment (SSA) form of general-purpose/ML compilers. We highlight the key mechanisms of synchronous languages that SSA does not cover—denotational concepts such as synchronizing computations with an external time base, cyclic and reactive I/O, as well as the operational notions of relaxing control flow dominance and the modeling of absent values. We discover that initialization-related static analyses and code generation aspects can be fully decoupled from other aspects of synchronous semantics such as memory management and causality analysis, the latter being covered by existing dominance-based algorithms of SSA-form compilers. We show how the SSA form can be seamlessly extended to enable all SSA-based transformations and optimizations on reactive programs with synchronous concurrency. We derive a compilation flow suitable for both high-performance and reactive aspects of a control application, by embedding the Lustre dataflow synchronous language into the SSA-based MLIR/LLVM compiler infrastructure. This allows the modeling of signal processing and deep neural network inference in the (closed) loop of feedback-directed control systems. Performance is not affected by the use of reactive modeling.
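For readers unfamiliar with synchronous reactive programming, here is a minimal Python sketch of the general idea: a stateful node that reacts once per tick of an external time base, with `None` standing in for an absent input. It is only an illustration under simplifying assumptions, not the MLIR dialects presented in the talk; the names `Integrator`, `step`, and `run` are hypothetical.

```python
from typing import List, Optional


class Integrator:
    """Hypothetical synchronous node: step() is called once per tick."""

    def __init__(self, init: float = 0.0) -> None:
        # State carried from one reaction to the next
        # (the role played by a delay such as Lustre's `pre`).
        self.acc = init

    def step(self, x: Optional[float]) -> float:
        # None models an absent input in this cycle: state is left unchanged.
        if x is not None:
            self.acc += x
        return self.acc


def run(inputs: List[Optional[float]]) -> List[float]:
    # The list stands in for the external time base: one element per tick.
    node = Integrator()
    return [node.step(x) for x in inputs]


print(run([1.0, None, 2.0, 0.5]))
```

In a real embedded setting, the loop in `run` would be driven by a timer or an I/O event rather than a Python list, which is the kind of reactive behaviour the abstract argues should be expressed in the compiler rather than in interpreted driver code.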

Zoom Meeting Link

Meeting ID: 851 5109 0498
Passcode: 828404

As usual, this will be recorded and posted on YouTube.

Hello everybody,

As it happened, I did not leave time for questions as the presentation was quite dense.

Please do not hesitate to contact us:
dumitru.potop@inria.fr
hugo.pompougnac@inria.fr
albertcohen@google.com

We would like to organize a demo of our MLIR extension. The publicly available version (link in slide 11) is a few months old compared to what we can demonstrate. New features include, in particular, an IREE-based back-end.

I would suggest that our work has two aspects that can be discussed separately:

  • The high-level dataflow dialect. This one is quite mature, and we would like to demonstrate it and promote it for upstreaming.
  • The low-level reactive dialect. The problems we identified are real, but it may be interesting to first discuss here the opportunities to federate with other low-level dialects describing operational mechanisms.

Best regards,
Dumitru

@albertcohen @qaco

And here are slides and recording for the talk!