Representing Anonymous Functions (lambdas)

Hi,

I am experimenting with MLIR to see if I can use it for an experimental compiler for a high level language. So for starters, I have been extending the toy compiler with different features. One that’s leaving me stumped is how to represent anonymous function.

On the one hand anonymous functions are generally first-class values in many languages, and therefore feel like they should be defined through a MLIR Op. On the other hand, the number of arguments and return values are only really known at runtime, and therefore one can’t at compile time write the ins and outs of the Op for it.

The only solution I see is to define a type like a list and use that for the LambdaOp, receiving a list of values as an argument and returning a list of values. Is this the direction usually taken to represent these things in MLIR?

Is there any live code somewhere representing common high level language constructs I can take some inspiration from?

Thanks,

Paulo

Ops are not the only first-class things in MLIR. Lambdas-as-ops make little sense to me because that at
least partially defies their purpose – being used as values. I would rather model anonymous functions as values of function type, which is more consistent with the functional paradigm IMO. This also naturally supports higher-order functions.

As a matter of fact, MLIR already supports values of built-in function type. AFAIK, the only way to produce them in-tree is to have a constant referring to a global function. So the actual missing piece is the op that defines an anonymous function. Something like

%0 = func.lambda {
  // body here
} (!arg.type, !arg.type2) -> !result.type

that defines a value of type (!arg.type, !arg.type2) -> !result.type feels like it would just work. It can already be called using std.call_indirect %0, %arg0, %arg1. Some fancier syntax that names the entry block arguments is probably desired, but otherwise it’s just an op with a region, no operands, and a single result.

The harder question is modeling value captures (I would tend to make them explicit, but no strong opinion), but that seems outside of the scope of the original question.

2 Likes

Hi @ftynse ,

Thank you for the reply. I am still learning MLIR, so early days for me and many of the things you say sound about right although many of the concepts you mention like “MLIR supporting values of built-in function type” and how to use “regions” still elude me.

For value capture, I guess my idea was to be able to create closures so that lambdas capture the values in the context in which it is defined, a la scheme. However, this feels like something I shouldn’t worry too much about while learning MLIR and it one step at a time.

Have you seen an implementation of this kind of anonymous functions in the wild anywhere?

Thanks,

Paulo

Here’s a simple example with valid IR (even lowers to LLVM):

module  {
  func @foo() {
    return
  }
  func @bar() {
    %f = std.constant @foo : () -> ()
    return
  }
  func @baz(%arg0: () -> ()) {
    std.call_indirect %arg0() : () -> ()
    return
  }
}

In this example %f is a value, similarly to how an integer “42” would be, but its type is “function with no arguments and no results”. The function type belongs to the built-in dialect. Other dialects may define their own function-like types.

Regions are essentially generalization of function/module/loop/conditional bodies. Any operation may have them. See MLIR Language Reference - MLIR.

Scheme has a garbage collector and the overall object model that makes it easy to manage the lifetime of the objects captured by closures. MLIR has neither (and is not supposed to, at least not as built-in concepts) so whoever implements a “functional dialect” will need to design those, and it is a non-trivial problem.

RISE (rise-lang.org) is a functional language that experimented with an MLIR dialect representation - https://github.com/rise-lang/mlir-doc.