MLL: An extensible front-end for MLIR!

We have been working on a simple yet powerful way to generate MLIR from a Python-like language. The goal is to build a minimal frontend that lowers to MLIR with ease.

The idea is to build an extensible AST whose nodes each map to one or more MLIR dialect operations. A developer can build an MLL “Dialect” containing custom Expressions and Statements in TableGen (.td) format and then map it to MLIR operations.
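To make the "AST node maps to MLIR operations" idea concrete, here is a minimal Python sketch. It is not how MLL is implemented (MLL uses TableGen and C++); the `Const` and `Add` nodes, the `LOWERINGS` registry, and the `Emitter` class are all hypothetical names invented for illustration, and the emitted text only approximates MLIR's `arith` dialect syntax.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Const:
    """Hypothetical AST node for an integer literal."""
    value: int


@dataclass
class Add:
    """Hypothetical AST node for integer addition."""
    lhs: object
    rhs: object


# Hypothetical registry: AST node class -> function that lowers it.
LOWERINGS: Dict[type, Callable] = {}


def lowers(node_type):
    """Decorator that registers a lowering function for one node type."""
    def register(fn):
        LOWERINGS[node_type] = fn
        return fn
    return register


class Emitter:
    """Collects MLIR-like text lines and hands out fresh SSA names."""

    def __init__(self):
        self.lines: List[str] = []
        self.counter = 0

    def fresh(self) -> str:
        name = f"%{self.counter}"
        self.counter += 1
        return name

    def lower(self, node) -> str:
        # Dispatch on the node's class; returns the SSA name of the result.
        return LOWERINGS[type(node)](self, node)


@lowers(Const)
def lower_const(em: Emitter, node: Const) -> str:
    result = em.fresh()
    em.lines.append(f"{result} = arith.constant {node.value} : i32")
    return result


@lowers(Add)
def lower_add(em: Emitter, node: Add) -> str:
    lhs = em.lower(node.lhs)
    rhs = em.lower(node.rhs)
    result = em.fresh()
    em.lines.append(f"{result} = arith.addi {lhs}, {rhs} : i32")
    return result


def emit(node) -> List[str]:
    """Lower a whole expression tree and return the emitted lines."""
    em = Emitter()
    em.lower(node)
    return em.lines


if __name__ == "__main__":
    print("\n".join(emit(Add(Const(1), Const(2)))))
```

Adding a new node type in this sketch means defining one class and registering one lowering function, which is roughly the kind of extensibility the dialect mechanism is aiming for.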

In the open-source GitHub implementation (mll/mll at mll · imv1990/mll · GitHub), we have 4 dialects to showcase various use cases of such a frontend (Builtin, OpenMP, GPU and Vector). The same idea can be extended to almost any MLIR dialect (LLVM, HLO dialects, TOSA, Affine, Tensor, SPIR-V, ONNX, TF, etc.). We use the MLIR JIT utilities to execute the lit tests.

Example 1: 2-D matrix sum

a = array<3*3*i32>.dense(10)
b = array<3*3*i32>.dense(20)

c = a + b

print("sum = ", c)

Example 2: A simple 2-D loop

x = array<4*4*i32>.dense(0)

for i in range(4) {
  for j in range(4) {
    x[i, j] = i + j
  }
}

print(x)
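For reference, the nested loop in Example 2 fills the array with the same values as the plain Python sketch below. This only pins down the expected contents of `x`; the function name `index_sum_grid` is made up for illustration, and MLL's actual print format is not shown here.

```python
from typing import List


def index_sum_grid(n: int) -> List[List[int]]:
    """Build an n x n grid where cell (i, j) holds i + j."""
    # Start from an all-zero grid, mirroring array<n*n*i32>.dense(0).
    grid = [[0 for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            grid[i][j] = i + j
    return grid


if __name__ == "__main__":
    for row in index_sum_grid(4):
        print(row)
```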

Core language features:

Custom type system:

Create any custom type and define constant expressions for it.
Example 3:

a = array<10*20*i32>.dense(0)

Creates a 2-D array of size (10, 20) initialized with all zeros.
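As a rough illustration of what a frontend has to extract from a type such as `array<10*20*i32>`, the Python sketch below splits the text into a shape and an element type and then builds a nested list filled with a constant. The helper names `parse_array_type` and `dense` are hypothetical, and the grammar handled here is only the subset visible in the examples above, not MLL's real parser.

```python
import re
from typing import Tuple


def parse_array_type(text: str) -> Tuple[Tuple[int, ...], str]:
    """Split 'array<10*20*i32>' into ((10, 20), 'i32')."""
    match = re.fullmatch(r"array<(.+)>", text.strip())
    if match is None:
        raise ValueError(f"not an array type: {text!r}")
    # Every '*'-separated field except the last is a dimension.
    *dims, elem = match.group(1).split("*")
    if not dims or not all(d.isdigit() for d in dims):
        raise ValueError(f"bad dimensions in: {text!r}")
    return tuple(int(d) for d in dims), elem


def dense(shape: Tuple[int, ...], value: int):
    """Build a nested list of the given shape filled with value."""
    if not shape:
        return value
    return [dense(shape[1:], value) for _ in range(shape[0])]


if __name__ == "__main__":
    shape, elem = parse_array_type("array<10*20*i32>")
    zeros = dense(shape, 0)
    print(shape, elem, len(zeros), len(zeros[0]))
```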

Similarly, you can create types such as struct, class, or vector and wrap them in convenient syntax.

Example of vector types: mll/vector_print.mll at mll · imv1990/mll · GitHub

Custom Statements and Heterogeneous language:

Launch workloads on the GPU using the gpu.launch statement, which maps to the gpu.launch operation in MLIR.

Example 4: Launch GPU operations

a = array<10*i32>.dense(1)

gpu.host_register(a)

gpu.launch blocks=(1, 1, 1) threads=(10, 1, 1) {
  a[threadIdx] = 10
}

lit test: mll/gpu.mll at mll · imv1990/mll · GitHub

Launch parallel CPU threads with the OpenMP dialect!

Example 5: Launch multiple threads

a = array<10*10*i32>.dense(1)

for i in range(10) {
  omp.parallel {
    for j in range(10) {
      a[i, j] = 100
    }
    omp.critical {
      print("i = ", i)
    }
  }
}

Execution tests: mll/omp_critical.mll at mll · imv1990/mll · GitHub

Similarly, one can build a new MLL dialect for custom hardware and use MLL as its frontend.

Semantic Checks:

Basic semantic checks for statements such as functions are performed while parsing. There is no generic framework for semantic checks yet, but it looks like most semantic checks could be expressed in a TableGen-based format.

Extensibility:

One can easily plug in a new dialect with .td files and parsing logic in C++. For example, basic function support takes around 200 lines of C++, including parsing and MLIR conversion.

Code Generation:

The implemented dialects all lower to MLIR, but the AST representation itself has no dependency on MLIR dialects. One can use it for alternative code generation targets (such as LLVM or C).
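The point about the AST being independent of MLIR can be sketched as follows: the same small tree is walked by a backend that prints C instead of MLIR operations. This is a toy Python illustration under assumed names (`Num`, `Var`, `BinOp`, `Assign`, `expr_to_c`, `stmt_to_c` are all hypothetical), not a description of an existing MLL backend.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    lhs: "Expr"
    rhs: "Expr"


@dataclass
class Assign:
    target: str
    value: "Expr"


Expr = Union[Num, Var, BinOp]


def expr_to_c(expr: Expr) -> str:
    """Render an expression tree as a fully parenthesized C expression."""
    if isinstance(expr, Num):
        return str(expr.value)
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, BinOp):
        return f"({expr_to_c(expr.lhs)} {expr.op} {expr_to_c(expr.rhs)})"
    raise TypeError(f"unsupported node: {expr!r}")


def stmt_to_c(stmt: Assign) -> str:
    """Render an assignment as a C declaration of an int variable."""
    return f"int {stmt.target} = {expr_to_c(stmt.value)};"


if __name__ == "__main__":
    tree = Assign("c", BinOp("+", Var("i"), Var("j")))
    print(stmt_to_c(tree))
```

Nothing in the node classes mentions a target, so a second function that emits MLIR text could walk the same tree without changing it.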

Next steps:

  1. Solidify the language design
  2. Build more dialects covering TensorFlow / PyTorch / NumPy and provide the capability to build and run ML models.
  3. The frontend implementation is in a pathetic state. Refactor / rewrite it and try to use an MLL dialect itself to simplify the dialect declaration.
  4. Build an IREE backend.