Handshake lowerings for merge-like and branch ops

I’ve been taking a look at the FIRRTL lowerings for Handshake's ops that implement merging and branching (the MergeLikeOpInterface ops MuxOp, MergeOp, and ControlMergeOp, plus ConditionalBranchOp). Right now all of those lowerings make use of the FIRRTL WhenOp (and appear to be its only users).

In a previous discussion, Schuyler pointed out that these mid-level FIRRTL ops are fairly complex.

Right now, there isn’t much support for these ops beyond creating them. They can’t be lowered to other dialects or printed with EmitVerilog.cpp. Transformation(s) can be added to flesh out support for WhenOps, but there might be an opportunity here to improve the Handshake lowering for the long term.

Instead of relying on the mid-level semantics of FIRRTL, these ops could be lowered into a more explicit implementation at a lower level. I’m imagining using low-level FIRRTL constructs that have analogues in the RTL and LLHD dialects. Essentially, combinational logic and registers.

This would allow us to define exactly the circuit we want to generate for these ops in a way that is less coupled to FIRRTL's semantics. It can be tricky to get the handshaking semantics right (i.e. waiting for inputs, not introducing combinational cycles, etc.). To me, this is actually an argument in favor of making the lowerings of these ops more explicit.
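
To make "more explicit" concrete, here is a minimal sketch of a conditional branch written as plain combinational logic, modeled in Python rather than in FIRRTL. The function name, the signal names, and the choice to assert both input readys only when the transfer fires are my assumptions for illustration; this is not the current CIRCT lowering. The data value itself would just be wired through to both outputs, so it is left out.

```python
from typing import NamedTuple


class BranchOutputs(NamedTuple):
    true_valid: bool
    false_valid: bool
    cond_ready: bool
    data_ready: bool


def conditional_branch(cond_valid: bool, cond: bool, data_valid: bool,
                       true_ready: bool, false_ready: bool) -> BranchOutputs:
    """Hypothetical combinational model of a handshake conditional branch."""
    # Wait for both inputs before offering a token on either output.
    both_valid = cond_valid and data_valid
    true_valid = both_valid and cond
    false_valid = both_valid and not cond
    # Only the selected output's ready matters (a 2:1 mux on the readys).
    selected_ready = true_ready if cond else false_ready
    # Consume both inputs in the same cycle the selected output accepts.
    fire = both_valid and selected_ready
    return BranchOutputs(true_valid, false_valid, fire, fire)
```

Everything here is AND/NOT/mux, which should have direct analogues in the RTL and LLHD dialects. Note that the input readys depend on the input valids in this sketch, so it assumes upstream components do not derive their valid from our ready; otherwise it would introduce the kind of combinational cycle mentioned above.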

There are examples of how to implement these kinds of ops in hardware:

  • the elastic components in Dynamatic, like Mux
  • sections 3.3 and 3.4 here have nice visualizations and explanation for Mux, DeMux, and ControlMerge (and mention a Merge that is simpler than ControlMerge)
  • certainly more… anyone have other references?

Is there interest in moving away from WhenOp in these lowerings? If so, I’d be interested in continuing the discussion and working on the implementation.

As an aside, I’ve been talking with @darthscsi about mapping high FIRRTL ops (like when) directly to the RTL dialect or a Verilog/SystemVerilog dialect. That is, we can probably get better code generation by not going through the whole lowering process and instead preserving higher-level FIRRTL semantic information. This is untrodden territory (the Scala FIRRTL Compiler didn’t go this route), so it’s unclear how much work is involved.

I agree that it’s reasonable to circumvent the problem for Handshake dialect by going directly to RTL dialect or to low FIRRTL ops. :+1:

It would be great to have a direct lowering to the low FIRRTL representation and get rid of the WhenOps, considering that the complex WhenOp transform can’t land in the short term.

The reference you mentioned is very helpful! Just FYI, in the current lowering of CMergeOp and MergeOp, we implement an implicit, simple arbiter with nested WhenOps like this:

firrtl.when %in0_valid {
  firrtl.connect %out_data, %in0_data : !firrtl.flip<uint<64>>, !firrtl.uint<64>
  firrtl.connect %out_valid, %in0_valid : !firrtl.flip<uint<1>>, !firrtl.uint<1>
  firrtl.connect %in0_ready, %out_ready : !firrtl.flip<uint<1>>, !firrtl.uint<1>
} else {
  firrtl.when %in1_valid {
    firrtl.connect %out_data, %in1_data : !firrtl.flip<uint<64>>, !firrtl.uint<64>
    firrtl.connect %out_valid, %in1_valid : !firrtl.flip<uint<1>>, !firrtl.uint<1>
    firrtl.connect %in1_ready, %out_ready : !firrtl.flip<uint<1>>, !firrtl.uint<1>
  } else {
    ... ...
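
For comparison, the nested whens above could be flattened into a fixed-priority chain of plain AND/OR/mux logic. The following is a rough behavioral sketch in Python, not FIRRTL; the function name and list-based signal layout are hypothetical, and it assumes that the elided default case drives out_valid and every in_ready low.

```python
def priority_merge(in_valid, in_data, out_ready):
    """Hypothetical fixed-priority merge; returns (out_valid, out_data, in_ready)."""
    out_valid = False
    out_data = 0  # assumed default when no input is valid
    in_ready = [False] * len(in_valid)
    higher_valid = False  # becomes true once a lower-index input is valid
    for i, (valid, data) in enumerate(zip(in_valid, in_data)):
        # Input i wins only if it is valid and no higher-priority input is.
        grant = valid and not higher_valid
        if grant:
            out_valid = True
            out_data = data
        # Same effect as connecting in_ready to out_ready inside the when.
        in_ready[i] = grant and out_ready
        higher_valid = higher_valid or valid
    return out_valid, out_data, in_ready
```

Each loop iteration corresponds to one level of nesting in the when chain: grant is the condition under which that level's connects take effect, so the whole structure reduces to one AND per input plus a data mux.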

Thanks, yep, I was taking a look at that.

How about I open an issue for this on GitHub and assign myself?

Please do that! We can have a further discussion there.

Here you go: https://github.com/llvm/circt/issues/178
