Hi
As a learning experience, I’m trying to represent lisp-esque expressions as a MLIR dialect.
For example, the type of expression I’m working with could be:
( + 1 counter )
or
(map
(lambda (x) (+ x 1))
(1 2 3 4 5)
)
If I were creating a custom IR for this type of expression, a tree would be the logical choice before lowering to SSA form. But since the point of MLIR is that we don’t need a custom IR, I’m wondering whether I can represent trees in a “nice to work with” manner in MLIR.
My approach so far has been to create three operations in my dialect:
- A list, which just has a region (graph region, not CFG) with a single block, where I add all the children of the list, and returns nothing
- Symbols, to represent functions/variables/etc, which return nothing
- Constants, to represent number literals
In addition, I also had to add a terminator which I call “end”, that is a total no-op.
This leads to the following representation in MLIR (for ( + 1 counter )
):
module {
"lisp.list"() ( {
"lisp.symbol"() {name = "+"} : () -> ()
"lisp.constant"() {value = 1 : i32} : () -> ()
"lisp.symbol"() {name = "counter"} : () -> ()
"lisp.end"() : () -> ()
}) : () -> ()
}
However, I’m not sure that this is a good approach for the following reasons:
-
For one, it is still necessary to add a terminator to every block, which leads to an awkward “end” symbol being necessary.
-
The second thing I don’t like about this is that I’m missing out on the SSA representing the flow of data - all the operations are “void” operations.
-
The third concern I have is performance of optimisation passes over this. Is it a problem if I have very deeply nested regions? Was MLIR built expecting this, or is nesting them like this an abuse of regions?
What are your opinions on this? Is this a good way to represent trees in MLIR? Is there a generally accepted way to solve this problem?
I’m still very new to MLIR, and any help or advice would be greatly appreciated!