LLPM talk during future meeting

I worked on a project called LLPM (Low Level Physical Machine) years ago. Here are the slides from the one talk I gave on it while actively working on it. (Shortly after this talk, I abandoned it due to lack of funding.) It is relevant, but my spending meeting time presenting it might be pure vanity, as I don’t know whether any of the lessons learned would be all that new to this community. This is the talk to which I was referring in today’s meeting.

If anyone is interested in me giving this talk, please speak up.

Thank you for sharing this John, it looks like you’ve had ideas very similar to where CIRCT hopes to go for quite some time.

On your “exotic” optimizations, I can’t help but think that such things would make for a very unpredictable programming model for users. On the one hand, it is amazing when they kick in (assuming they make things better), but such heroic optimizations are generally pretty fragile. It seems better to build a system where we can provide high-level abstraction concepts directly to the programmer, and have those be lowered in a predictable way to the given hardware.

I gave a talk a few years ago (slide 15+) that spends some time complaining about heroic optimizations in the context of C compilers. It is a similar issue.


For some of the optimizations, my plan is/was for them to be designer-guided via source code: automate the boring stuff and use telemetry/profiling to make suggestions. I think this would be a big win for some of them (partitioning, gearboxing, NoC construction). The more exotic ones are definitely academic projects, though.

You’re right about the relative unpredictability, though: for the more exotic, automated things (e.g. hardware merging) the results would be somewhat unpredictable and non-linear with respect to minor changes. But that’s already the case with EDA tools, so it’d be nothing new to hardware designers, as opposed to software programmers. You’re also right about the compiler explaining its optimization choices: it’s very difficult for some optimizations (e.g. pattern-matching-based ones) purely due to the number of possibilities. So very research-y.

We spent some time trying to tease out the unpredictability issues in a C-based HLS tool, with the TL;DR being that small code changes radically alter final designs and result in weird area-latency trade-offs.

EDA tools do have this kind of unpredictability built in. For example, I think Vivado carefully replaces 4-stage multiplies with DSPs, but generates much larger hardware for anything else. However, there are some projects trying to tackle these problems by making all of these decisions explicit in the language (either Verilog or a new IR specifically for expressing design constraints).


Really interesting paper! I added it to the meeting notes for others who may be interested!

@rachitnigam this paper is really really interesting. Would you be interested in presenting on it in the meeting sometime? It would be great to get more PL / type system content into the meeting - I suspect that most people are not familiar with affine types.
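For anyone who hasn’t run into affine types before: the rough intuition is “a value may be used at most once.” Rust’s ownership model is a close approximation, so here is a hedged sketch in Rust (illustration only, not Dahlia’s actual syntax; the `Memory` type is hypothetical):

```rust
// Rust's move semantics approximate an affine type system:
// an owned value may be consumed at most once.
// `Memory` is a hypothetical stand-in for a hardware memory bank.
struct Memory {
    data: Vec<u32>,
}

// Taking `Memory` by value consumes ("uses up") the resource.
fn consume(m: Memory) -> u32 {
    m.data.iter().sum()
}

fn main() {
    let bank = Memory { data: vec![1, 2, 3] };
    let total = consume(bank); // `bank` is moved here...
    // consume(bank);          // ...so a second use is a compile error
    println!("{}", total);
}
```

The appeal for hardware is that a type system like this can statically rule out two pieces of logic claiming the same memory port in the same cycle, rather than discovering the conflict during synthesis.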


Thanks @clattner! I can absolutely present this paper during one of the future meetings.

In the meantime, here are links to:

  1. A web-based demo (no installation required)
  2. An hour-long talk I gave at Berkeley on Dahlia


It would be great to have you give a talk!

A few comments on your paper: I think most of the effects you see are caused by looking at very small designs, where certain costs (like muxing and the cost of non-power-of-2 modulos in FPGA hardware) dominate the costs that are more often considered critical (multiply-accumulates). My opinion is that there are different HLS tools with different goals. Some attempt to enable hardware engineers to express detailed architectures more easily (I tend to lump them under the heading of “Better Verilog”). Other tools focus on enabling more complex algorithms to be implemented without getting bogged down in the details (“Algorithmic Synthesis”). Generally speaking, Vivado HLS focuses on the latter rather than the former. I’d like to see CIRCT make improvements in both areas, which are somewhat complementary, since “Better Verilog” should enable better “Algorithmic Synthesis”.
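On the modulo point above, the asymmetry is worth spelling out: for unsigned values, a power-of-2 modulo reduces to a bitmask, which in hardware is essentially free (just selecting the low wires), while a non-power-of-2 modulo implies an actual remainder circuit. A minimal sketch of the strength reduction (software analogy of the hardware cost difference):

```rust
// x % 2^k for unsigned x is equivalent to masking off the low k bits.
// In hardware, the masked form costs nothing; x % 7 needs a divider.
fn mod_pow2(x: u32, k: u32) -> u32 {
    x & ((1u32 << k) - 1)
}

fn main() {
    // Spot-check the equivalence against the general modulo operator.
    for x in 0..1000u32 {
        assert_eq!(mod_pow2(x, 3), x % 8);
        assert_eq!(mod_pow2(x, 4), x % 16);
    }
    println!("ok");
}
```

This is why shrinking a buffer from, say, 16 to 12 elements can disproportionately hurt a small design: the index arithmetic goes from wires to a real circuit.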