MLIR Dialect to represent Trees

hannessolo · October 26, 2020, 8:22pm

Hi

As a learning experience, I’m trying to represent Lisp-esque expressions as an MLIR dialect.

For example, the type of expression I’m working with could be:

( + 1 counter )

or

(map 
  (lambda (x) (+ x 1)) 
  (1 2 3 4 5)
)

If I were creating a custom IR for this type of expression, a tree would be the logical choice before lowering to SSA form. But since the point of MLIR is that we don’t need a custom IR, I’m wondering whether I can represent trees in a “nice to work with” manner in MLIR.

My approach so far has been to create three operations in my dialect:

A list, which has a region (a graph region, not a CFG region) with a single block that holds all the children of the list, and which returns nothing
Symbols, to represent functions/variables/etc, which return nothing
Constants, to represent number literals

In addition, I also had to add a terminator which I call “end”, that is a total no-op.

This leads to the following representation in MLIR (for ( + 1 counter )):

module {
  "lisp.list"() ( {
    "lisp.symbol"() {name = "+"} : () -> ()
    "lisp.constant"() {value = 1 : i32} : () -> ()
    "lisp.symbol"() {name = "counter"} : () -> ()
    "lisp.end"() : () -> ()
  }) : () -> ()
}
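The mapping from an S-expression to the generic-form listing above can be sketched in a few lines of Python. This is only an illustrative model, not the author's frontend and not the MLIR API: the helper names `tokenize`, `parse`, `emit` and `to_mlir` are made up for this example, and only integer literals and symbols are handled.

```python
def tokenize(src):
    """Split an S-expression into parentheses and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one expression, consuming tokens from the front of the list."""
    tok = tokens.pop(0)
    if tok == "(":
        children = []
        while tokens[0] != ")":
            children.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return children
    try:
        return int(tok)  # number literal
    except ValueError:
        return tok  # anything else is treated as a symbol


def emit(node, indent=0):
    """Return the lines of generic-form text for one tree node."""
    pad = "  " * indent
    if isinstance(node, list):
        lines = [pad + '"lisp.list"() ( {']
        for child in node:
            lines.extend(emit(child, indent + 1))
        # Every block needs a terminator, hence the dummy "lisp.end".
        lines.append(pad + '  "lisp.end"() : () -> ()')
        lines.append(pad + "}) : () -> ()")
        return lines
    if isinstance(node, int):
        return [pad + '"lisp.constant"() {value = %d : i32} : () -> ()' % node]
    return [pad + '"lisp.symbol"() {name = "%s"} : () -> ()' % node]


def to_mlir(src):
    """Wrap the emitted expression in a module, as in the listing above."""
    body = emit(parse(tokenize(src)), indent=1)
    return "\n".join(["module {"] + body + ["}"])


print(to_mlir("( + 1 counter )"))
```

Nested lists simply recurse, so each level of parentheses becomes one more nested region.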

However, I’m not sure that this is a good approach for the following reasons:

For one, it is still necessary to add a terminator to every block, which leads to an awkward “end” symbol being necessary.
The second thing I don’t like about this is that I’m missing out on the SSA representing the flow of data - all the operations are “void” operations.
The third concern I have is performance of optimisation passes over this. Is it a problem if I have very deeply nested regions? Was MLIR built expecting this, or is nesting them like this an abuse of regions?

What are your opinions on this? Is this a good way to represent trees in MLIR? Is there a generally accepted way to solve this problem?

I’m still very new to MLIR, and any help or advice would be greatly appreciated!

stephenneuendorffer · October 28, 2020, 4:59pm

1. In simple cases like this you can use a trait (see the MLIR Traits documentation) and elide the terminator.
2. This is exactly the right thinking… how do we represent what you want to represent? The great thing about MLIR is that at different points in a tool you can represent the same behavior in different ways, so you don’t have to focus on selecting the ‘right’ representation early on, because any choice will always have some tradeoffs. What you’ve shown is highly syntactic, making it easy to generate from the parse of your Lisp expressions. How would this representation get transformed into a different representation that represents the flow of data? What would the result of that transformation look like?
3. Regions are fundamental in MLIR for representing hierarchy. Traversing hierarchy is similar in cost to traversing from one operation to another within the same region.
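One possible answer to the question of what a data-flow form could look like is sketched below in Python. It is a toy model under stated assumptions, not MLIR code: the op kinds `constant`, `symbol` and `call` and the helpers `lower`, `to_ssa` and `render` are hypothetical, and every list is assumed to be a plain call whose first element is the callee (so forms such as `lambda` would need their own handling).

```python
import itertools


def lower(expr, ops, ids):
    """Append SSA-style ops for expr to ops; return the name of its result."""
    if isinstance(expr, int):
        op = ("constant", expr)
    elif isinstance(expr, str):
        op = ("symbol", expr)
    else:
        head, *args = expr
        # Operands are lowered first, so every use follows its definition.
        operands = [lower(arg, ops, ids) for arg in args]
        op = ("call", head, operands)
    result = "%" + str(next(ids))
    ops.append((result, op))
    return result


def to_ssa(expr):
    """Flatten a nested expression into a list of (result, op) pairs."""
    ops = []
    lower(expr, ops, itertools.count())
    return ops


def render(ops):
    """Pretty-print the flattened ops in an IR-like (not real MLIR) syntax."""
    lines = []
    for result, op in ops:
        if op[0] == "call":
            lines.append('%s = call "%s"(%s)' % (result, op[1], ", ".join(op[2])))
        else:
            lines.append("%s = %s %r" % (result, op[0], op[1]))
    return "\n".join(lines)


print(render(to_ssa(["+", 1, "counter"])))
```

In this form the `+` call names its two inputs explicitly instead of relying on them sitting next to it in a block.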

hannessolo · October 28, 2020, 6:03pm

Thank you, your reply is extremely helpful.

1. I didn’t know about that, and it seems perfect for this use case.
2. My concern is that, working with my dialect, I might have to “hack” some things because the framework isn’t really expecting an IR structured in this way. For instance, I tried writing a canonicalisation pass on + symbols that does constant folding (i.e. (+ 1 2) becomes (3)). I struggled to find a straightforward way to access the constants 1 and 2 using the pattern rewriter, because MLIR expects any inputs to + to be “wired” in as actual operands of the operation, whereas I’ve just got them implicitly sitting next to one another. Perhaps this means, as you say, that I don’t have a good representation for what I’m trying to do, and should lower to something else before constant folding.
3. That’s good to know!
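To illustrate why folding gets easier once operands are wired in explicitly, here is a small Python model. It is a sketch only: the `Op` tuple, the `add` and `constant` kinds and `fold_constants` are invented for this example and are not the MLIR pattern-rewriter API. The point is that each operand name leads directly to the op that produced it, much as an MLIR `Value` can be asked for its defining op.

```python
from typing import Any, NamedTuple


class Op(NamedTuple):
    result: str          # SSA-style result name, e.g. "%2"
    kind: str            # "constant", "symbol" or "add" in this toy model
    operands: tuple = () # names of the values this op consumes
    attr: Any = None     # literal value or symbol name


def fold_constants(ops):
    """Fold `add` ops over constants, then drop constants nobody uses."""
    defs = {}
    folded = []
    for op in ops:
        # Explicit operands give direct access to the producing ops.
        producers = [defs[name] for name in op.operands]
        if op.kind == "add" and producers and all(
            p.kind == "constant" for p in producers
        ):
            op = Op(op.result, "constant", (), sum(p.attr for p in producers))
        defs[op.result] = op
        folded.append(op)
    # The last op is treated as the expression's result and kept alive.
    live = {name for op in folded for name in op.operands}
    if folded:
        live.add(folded[-1].result)
    return [op for op in folded if op.kind != "constant" or op.result in live]


ops = [
    Op("%0", "constant", (), 1),
    Op("%1", "constant", (), 2),
    Op("%2", "add", ("%0", "%1")),
]
print(fold_constants(ops))
```

With the sibling-based tree form, the same fold would have to inspect neighbouring operations in the enclosing block instead of following operands.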

In any case, thank you for your reply and I will continue playing around with this!

hannessolo · October 28, 2020, 6:29pm

About SingleBlockImplicitTerminator: unfortunately, I only seem to be able to add a single operation type as an implicit terminator. E.g., defining:

def ListOp: LispDialect_Op<"list", [DeclareOpInterfaceMethods<RegionKindInterface>, SingleBlockImplicitTerminator<"SymbolOp">, SingleBlockImplicitTerminator<"ConstantOp">]>

the verifier always complains, for example:

error: 'lisp.list' op expects regions to end with 'lisp.constant', found 'lisp.symbol'

I’d like to have every operation implicitly terminate a block.

Do you know any workaround to this?
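The error can be reproduced with a simplified Python model. It assumes that each instance of the trait independently checks that the block's last operation is its own terminator type, which matches the message quoted above. The function names are hypothetical and blocks are modelled as plain lists of operation names.

```python
def verify_implicit_terminator(block, terminator):
    """Model one trait instance: the last op must be `terminator`."""
    last = block[-1]
    if last != terminator:
        return "expects regions to end with '%s', found '%s'" % (terminator, last)
    return None


def verify_list_op(block, terminators):
    """Run every trait instance; all of them have to pass."""
    errors = [verify_implicit_terminator(block, t) for t in terminators]
    return [e for e in errors if e is not None]


traits = ["lisp.symbol", "lisp.constant"]
print(verify_list_op(["lisp.constant", "lisp.symbol"], traits))
print(verify_list_op(["lisp.symbol", "lisp.constant"], traits))
```

Since a block has exactly one last operation, two such checks with different terminator types can never both succeed, whichever operation ends the block.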

mehdi_amini · October 29, 2020, 5:06am

You can mark each of the ops as being a terminator.
(But then they would have to terminate blocks.)

stephenneuendorffer · October 31, 2020, 5:01am

The easiest thing is to just have a dummy ‘end’ operation. This is the ‘implicit’ terminator with no successors.
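As a rough illustration of what an implicit dummy terminator amounts to, the Python model below appends the `end` op when a block is built and hides it again when the block is printed. It is a sketch of the idea only: `ensure_terminator` and `print_block` are hypothetical helpers for this example, and blocks are again plain lists of operation names.

```python
END = "lisp.end"  # the dummy terminator from the original post


def ensure_terminator(block):
    """Return a copy of `block` that is guaranteed to end with END."""
    if block and block[-1] == END:
        return list(block)
    return list(block) + [END]


def print_block(block):
    """Render a block, leaving the trailing END implicit."""
    body = block[:-1] if block and block[-1] == END else block
    return "\n".join(body)


block = ensure_terminator(["lisp.symbol", "lisp.constant", "lisp.symbol"])
print(block[-1])
print(print_block(block))
```

The terminator is always present in the in-memory block, so the structural rule holds, but it never has to appear in the textual form.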

hannessolo · November 2, 2020, 8:57am

Yes, unfortunately that won’t work for me here (I need operations to optionally terminate). I’ll just continue using the dummy terminator, as @stephenneuendorffer suggests.
