I’d like to receive some feedback about a possible new dialect used to model stack-based languages, such as CIL, JVM and Ethereum bytecode.
Goal
Being able to convert to and from a stack-based language and MLIR’s SSA-based semantics.
New types
This dialect introduces one new type stack.stack
, used to represent an opaque value used to implement a monadic handling of the stack state.
New operations
This dialect introduces two new operations: stack.push (!stack.stack, T) -> !stack.stack
and stack.pop (!stack.stack) -> (!stack.stack, T)
. The semantics of these operations are what you would expect: they manipulate the stack they receive as input, and output the modified stack. They are Pure
operations.
Example
The following pseudocode implements Euler’s algorithm:
:start
dup
push 0
eq
cbr :end :loop
:loop
dup
rol3
mod
br :start
:end
pop
; a
ret
which can be translated into the following MLIR module:
builtin.module {
func.func @main(%0 : !stack.stack) -> !stack.stack {
%stk, %1 = "stack.pop"(%0) : (!stack.stack) -> (!stack.stack, i32)
%stk_1 = "stack.push"(%stk, %1) : (!stack.stack, i32) -> !stack.stack
%stk_2 = "stack.push"(%stk_1, %1) : (!stack.stack, i32) -> !stack.stack
%2 = arith.constant 0 : i32
%stk_3 = "stack.push"(%stk_2, %2) : (!stack.stack, i32) -> !stack.stack
%stk_4, %3 = "stack.pop"(%stk_3) : (!stack.stack) -> (!stack.stack, i32)
%stk_5, %4 = "stack.pop"(%stk_4) : (!stack.stack) -> (!stack.stack, i32)
%5 = arith.cmpi eq, %3, %4 : i32
%stk_6 = "stack.push"(%stk_5, %5) : (!stack.stack, i1) -> !stack.stack
%stk_7, %6 = "stack.pop"(%stk_6) : (!stack.stack) -> (!stack.stack, i1)
"cf.cond_br"(%6, %stk_7, %stk_7) [^0, ^1] <{"operandSegmentSizes" = array<i32: 1, 1, 1>}> : (i1, !stack.stack, !stack.stack) -> ()
^0(%7 : !stack.stack):
%stk_8, %8 = "stack.pop"(%7) : (!stack.stack) -> (!stack.stack, i32)
func.return %stk_8 : !stack.stack
^1(%9 : !stack.stack):
%stk_9, %10 = "stack.pop"(%9) : (!stack.stack) -> (!stack.stack, i32)
%stk_10 = "stack.push"(%stk_9, %10) : (!stack.stack, i32) -> !stack.stack
%stk_11 = "stack.push"(%stk_10, %10) : (!stack.stack, i32) -> !stack.stack
%stk_12, %11 = "stack.pop"(%stk_11) : (!stack.stack) -> (!stack.stack, i32)
%stk_13, %12 = "stack.pop"(%stk_12) : (!stack.stack) -> (!stack.stack, i32)
%stk_14, %13 = "stack.pop"(%stk_13) : (!stack.stack) -> (!stack.stack, i32)
%stk_15 = "stack.push"(%stk_14, %12) : (!stack.stack, i32) -> !stack.stack
%stk_16 = "stack.push"(%stk_15, %11) : (!stack.stack, i32) -> !stack.stack
%stk_17 = "stack.push"(%stk_16, %13) : (!stack.stack, i32) -> !stack.stack
%stk_18, %14 = "stack.pop"(%stk_17) : (!stack.stack) -> (!stack.stack, i32)
%stk_19, %15 = "stack.pop"(%stk_18) : (!stack.stack) -> (!stack.stack, i32)
%16 = arith.remui %14, %15 : i32
%stk_20 = "stack.push"(%stk_19, %16) : (!stack.stack, i32) -> !stack.stack
"cf.br"(%stk_20) [^2] : (!stack.stack) -> ()
}
}
which is then transformed into the following:
builtin.module {
func.func @main(%0 : !stack.stack, %17 : i32, %18 : i32) -> (!stack.stack, i32) {
%2 = arith.constant 0 : i32
%5 = arith.cmpi eq, %2, %18 : i32
"cf.cond_br"(%5, %0, %17, %18, %0, %17, %18) [^0, ^1] <{"operandSegmentSizes" = array<i32: 1, 3, 3>}> : (i1, !stack.stack, i32, i32, !stack.stack, i32, i32) -> ()
^0(%7 : !stack.stack, %19 : i32, %20 : i32):
func.return %7, %19 : !stack.stack, i32
^1(%9 : !stack.stack, %21 : i32, %22 : i32):
%16 = arith.remui %21, %22 : i32
"cf.br"(%9, %22, %16) [^2] : (!stack.stack, i32, i32) -> ()
}
}
Interactions with other dialects
I currently only have a PoC implementation (in xDSL) of some transformations that interact with the cf
dialect. When these transformations are applied repeatedly in the correct order, a fixpoint can be found of a program that is completely rewritten in SSA form.
Push-Pop folding:
%stk1 = "stack.push"(%stk, %x) : (!stack.stack, T) -> !stack.stack
%stk2, %y = "stack.pop"(%stk1) : (!stack.stack) -> (!stack.stack, T)
foo(%stk2, %y)
can be rewritten as
foo(%stk, %x)
Common subexpression elimination
%stk1, %x = "stack.pop"(%stk) : (!stack.stack) -> (!stack.stack, T)
%stk2, %y = "stack.pop"(%stk) : (!stack.stack) -> (!stack.stack, T)
foo(%stk1, %x)
foo(%stk2, %y)
can be rewritten as
%stk1, %x = "stack.pop"(%stk) : (!stack.stack) -> (!stack.stack, T)
foo(%stk1, %x)
foo(%stk1, %x)
Arg pushing
%stk1 = "stack.push"(%stk, %x) : (!stack.stack, T) -> !stack.stack
"cf.br"(%stk1) [^1] : (!stack.stack) -> ()
ˆ1:(%stk2 : !stack.stack):
foo(%stk2)
can be rewritten as
%stk1 = "stack.push"(%stk, %x) : (!stack.stack, T) -> !stack.stack
"cf.br"(%stk1) [^1] : (!stack.stack) -> ()
ˆ1:(%stk2 : !stack.stack):
%stk3, %x = "stack.pop"(%stk2) : (!stack.stack) -> (!stack.stack, T)
%stk4 = "stack.push"(%stk3, %x) : (!stack.stack, T) -> !stack.stack
foo(%stk4)
Pop hoisting
"cf.br"(%stk) [^1] : (!stack.stack) -> ()
ˆ1:(%stk1 : !stack.stack):
%stk2, %x = "stack.pop"(%stk1) : (!stack.stack) -> (!stack.stack, T)
foo(%stk2, %x)
can be rewritten as
%stk1, %x = "stack.pop"(%stk) : (!stack.stack) -> (!stack.stack, T)
"cf.br"(%stk1, %x) [^1] : (!stack.stack, T) -> ()
ˆ1:(%stk2 : !stack.stack, %y: T):
%stk3 = "stack.push"(%stk2, %y) : (!stack.stack, T) -> !stack.stack
%stk4, %z = "stack.pop"(%stk3) : (!stack.stack) -> (!stack.stack, T)
foo(%stk4, %z)