Switch statement missing

I read this [RFC] Rename LoopOps dialect to SCF (structured control flow)

But the scf dialect does not seem to contain a switch statement. It also does contain a goto implementation, which is important to implement state machines, such as re2c[1] might generate, espectailly as LLVM lacks a sufficiently developed jump threading pass to optimize a loop+switch state machine[2].

[1] http://re2c.org/
[2] https://codasip.com/2019/09/04/better-benchmarks-through-compiler-optimizations-codasip-jump-threading/ (The code was not released).

Yup! I’d love to see scf.switch and scf.sign(x) (which has three branches instead of two branches in scf.if, corresponding to x == 0, x < 0, and x > 0). While I agree that optimizing state-machine like code is important, having an explicit FSM dialect with appropriate lowering might also be a good solution.
Patches welcome!

I am working on a dialect which includes state machines as a top level construct. A pass then converts this to a number of functions.

I think our dialect is a bit domain specific, but in case someone wants to discuss this I would be happy to.

1 Like

Not blocks/regions? Do your needs resemble re2c? I am just trying to get a starting point.

Do the conversion passes run in parallel? Cause a switch statement should be converted directly to LLVM until such point that it is possible to create jump tables in the middle end (requires tagging memory locations go the code gen to fill in).

Basically, we have a number of triggers that is the result of external events (we are modelling hardware on a very high level). So our dialect look something like this:

func @foobar() ...

statemachine foo {
^a:
   state a {/* entry actions */} { /* exit actions */ } {
      // Transitions
      transition ^b @foobar { /* guard */ 
         %true = constant true
         yield(%true) : (i1) -> ()
      }  {
        /* transition actions */
      }
   }
^b:
  state b {} {} {}

}

Since our triggers are asynchronous, it is hard to model this with goto-like constructs, and we keep state in a current state variable associated with each state machine.

I think that a general state machine dialect will most likely run into issues like this, that in some cases they should be lowered to a goto-based implementation, while in others we need to explicitly track state.

But state machines fundamentally work on an input stream. Look at protothreads http://dunkels.com/adam/pt/

Even if you do it asyncronously, you can implement it with goto overlaid with a switch statement to resume at a given state, as protothreads does. Having the entry point a whole bunch of functions risks some of the functions being invalid in certain states.

How about this:

sm.machine foo %input_stream -> %yield_signature {
^a:
   state a (%input_stream) {
     sm.yield ^b, %foo
   }
^b:
  state b {...}
}
-and-
sm.resume foo %input

the getstate would requite setting a state type in the definition, and then with a constant for every state listed (instead of having the compiler figure it out for you).

Interesting, this actually pretty quickly becomes co-routines, and the current LLVM co-routines were found not useful by zig and rust because they do not decouple the allocation and deallocation of the stack frames (and in the protothreads example there is no stack).

I see the point, and you are right, that would indeed work out as well (I say almost, but I get to that). We ended up at the current solution since it was essentially the natural fit to the source language, which provides interfaces (and simulated timers) for triggering state transitions. So, the transition functions already exist, materialised in the interfaces.

I say almost, because each transition trigger can receive an arbitrary number of parameters, which will be passed on as action arguments. So our input streams are in fact not any simple types, but more a sum-types with all trigger function arguments.

Our current lowering approach simply populates the transition trigger functions so the state machine gets spread out in multiple functions just before lowering to LLVM. The language also has a concept of failure detection, so the invalid use of interface functions can be detected and handled accordingly.

It seems we are more or less on the same page here. And it would definitely be nice to have something more generic.

Can you elaborate on this a bit more? which operations are terminator operations? Do you have a notion of state variables which are in-scope in the actions and guards? I’ve been thinking about something like this as part of the circt project: https://github.com/circt/circt I’ve wondered whether it was easier to use the block structure to model states, or to have a symbol-table lookup for state names. You seem to have done both?

Quick answers, I will try to write something more elaborate later, maybe on a separate thread.

State variables are only available in a “component” which is essentially a class. So member variables only. The exception is parameters to guards and actions, both which handle the arguments to the interface function triggering the transition.

Initially I worked with symbol states, but it was more convenient to define the targets of the transitions with block labels, so our current dialect has both available, but I suppose there is no technical need for the state symbol table anymore, it is nice for debugging though.

I came across this thread while looking at a lowering of mhlo.case to standard control flow. I don’t see scf.switch, but it seems like maybe it should be added. Similarly, should there be sibling op to std.cond_br that would branch to a block based on an index?

Because apparently my second sentence there confused people, I meant that in addition to something in scf, which I imagine would use regions similar to scf.if I think we’d want a block-based branching switch statement in std (or wherever the control flow ops from std end up) that’s basically identical to llvm.switch.

I started an RFC, since I’m proposing adding an op to std: [RFC] Add std.switch and scf.switch