Verifying latency insensitivity

I have been testing some end-to-end handshake designs and experimenting with where to put buffers in the circuits. In one of my most recent tests, a complex circuit terminated execution, but with an incorrect result.

That’s the first time I’ve seen wrong results. My understanding is moving around buffers can prevent deadlock in a design like this, but should never change the result. So, I believe we have at least one bug in the hardware generated from handshake.

I’m trying to see what is a good way to track down such bugs, and verify that our components actually do provide the guarantees we expect. For example, I found this paper: https://arxiv.org/pdf/2102.06326.pdf. I’m learning about their methodology and might try something similar.

I thought I’d ask the community here, does anyone have some experience with verifying the latency insensitivity of handshaking components?

Are you lowering to verilog and simulating, or using the handshake runner?

I am lowering to verilog and simulating, and then comparing the results to the results from handshake-runner.

I left some more details about the specific problem I was looking into here: [HandshakeToFIRRTL] Issues with Verilog generated from Standard involving memories · Issue #543 · llvm/circt · GitHub.

In general, I am becoming increasingly interested in formal verification to gain some more confidence in the Handshake lowering. FIRRTL already an integration test that is doing formal checks with Yosys, so perhaps we could do something similar.

It might be nice to start encoding some information about the firing behavior in the ops as an attribute of some sort. That could potentially guide a pass that generates formal checks for a given op. And, that information could guide the lowering to FIRRTL/etc.

This I think would be a great next step, rather than viewing the complex behavior as a black box. There have been some efforts to defining an ‘actor model’ where an FSM determines which inputs are read/written. One possibility would be to leverage a bunch of ongoing work related to the CAL Actor language and capture this in MLIR: https://hal.archives-ouvertes.fr/hal-00407945/document

That is very interesting. I hadn’t heard of the CAL Actor language, but I just took a quick look and I’ll read up on the ongoing work. I don’t have a ton of time to dedicate to this right now, but I’m definitely interested in collaborating here where I can.

@drom is this a topic you have explored? In terms of your reqack project, I think the analogy would be formally verifying the behavior of the controllers.

@mikeurbach yes, this topic is extremely interesting to me. A lot of work already went into verification of basic set of controllers. I don’t think we should reinvent bicycle here. Here are some general thoughts:

  1. Just saying “valid / ready” protocol is not enough. Specific, and exact behavior must be specified.
  2. Elastic controllers themselves should be designed as generators, not as fixed components. Verification of generators should be performed in full range of parameters.
  3. Elastic controllers not only has to be “correct” but also “efficient” = fast, small
  4. Many constraints has to be checked during construction process of initial elastic circuit. It has to be “correct by construction”.
  5. All “Elastic Transformations” must be verified by themselves before system could use them.
  6. Good “design scoped” elastic controller require special optimization techniques before lowering to RTL.

Thanks for the feedback, I had a feeling you would have some thoughts here. I’m particularly interested in how to model things like the specifics of the protocols, the constraints that must be checked, etc. in a way that allows us to verify/generate/lower elastic components in a uniform way. I think this is a place where MLIR can really shine.

1 Like

I have to agree with @drom above. There is really a ton of work that went into the design of latency-insensitive and speed-independent HW, spread over several research fields: the asynchronous circuit community which introduced the notions of delay insensitivity and speed independence (you can try the book Asynchronous Circuits by Brzozowski and Seger), latency-insensitive design (Carloni et al.), elastic synchronous circuits (Cortadella et al.), endochronous systems, etc. Verification is a difficult topic in asynchronous circuits due to combinatorial explosion, unless there is a “correct by construction” argument (e.g. code obtained from an originally synchronous design, possibly obtained by means of High-Level Synthesis). The CAL formalism seems too high-level and too asynchronous to get you anything more than simulation.

1 Like

Here is little collection of papers: reqack/references.md at trunk · drom/reqack · GitHub
Please send me PR if found other related papers, and like them.

If you are looking into formal verification for digital HW (I just read the messages above) you should look into a literature that spans back more than 25 years, and with existing tools from Synopsys, Cadence, etc. For instance, look into work on symbolic model checking by Ken McMillan.

This is mostly on the synchronous HW side. If you’re looking into asynchronous handshake protocols, the situation is more difficult. Synchronous design still works best, and synchronizers connecting locally synchronous islands inside globally asynchronous circuits are notoriously easy to fool (there’s a nice paper about this by Ran Ginosar, from 2003). And purely asynchronous design is even more difficult.

If I had to, I would suggest making fully synchronous designs, making sure the functionality is the one you want, and (whenever asynchrony is needed) desynchronizing them, i.e. replacing the global clock with standardized handshake protocols that are easy to implement and to prove. The key issue here is semantics preservation.

1 Like

I think this discussion is getting a little ahead of itself. I don’t think we necessarily need formal verification of asynchronous circuits compared to a synchronous circuit. I think there are some basic questions that we can answer relatively simply which are more about properties of handshake the handshake models we are creating. For instance, if we have lowered a loop into a handshake circuit, can we abstract the handshake circuit back into the original loop? Or can we identify invariants that must hold about the generated circuits and then prove those invariants? I definitely agree that it would be a mistake to recreate the experience of the EDA community without reading the literature, but I think what Mike is asking is a somewhat simpler question…

2 Likes

I guess it depends on (the amount of) optimization you want to use. If you optimize the circuit after lowering (including through unrolling, use of caresets, state reencoding, etc.) then it may be difficult, unless you preserve some structural information (and constrain optimizations to preserve this structural information).

Absolutely. We have some abstractions in the Handshake dialect, and some ways to spit out System Verilog for those abstractions. The only thing that is really novel is the abstractions are MLIR operations and the System Verilog is spit out by way of MLIR passes. I don’t think anyone is trying to reinvent a wheel, just porting the existing wheels to MLIR.

To return to my original post, I simulated a Handshake circuit and it computed an incorrect result, so clearly some wheel(s) misbehave. I’m trying to debug this, and I was wondering if formal tools could help. Could we use Yosys to compare our implementation of “fork” to some other golden implementation of “fork”? Could we write our own latency insensitivity testbench? That might be generally useful to anyone doing LI components, like ESI. What MLIR attributes/op interfaces/etc. could we use to guide such a testbench?

I appreciate the links, and I think there is plenty we can do in this area. Unfortunately this is going to be a nights-and-weekends project for me, but eventually, I would like to have full confidence that Handshake’s abstractions indeed implement the behaviors we intended.