Optimizations and the RTL dialect

Picking up from FIRRTL aggregate types discussion - #22 by stephenneuendorffer.

I agree. I see the RTL dialect as the common denominator, so any types which apply to multiple HW dialects should be put there. As such, I don’t see a need for any types in the SV dialect (at least the synthesis, digital portion) though I might be missing something.

In fact, a good many of the ESI data types are only there since they don’t exist elsewhere. They could easily be renamed and moved into the RTL dialect.

RE graph regions: I agree 100%.

+1 from me on Graph regions as well.

Agreed that the RTL dialect is the lowest common denominator, and it makes sense to put new types there. If higher level types existed in the RTL dialect, I think the SV dialect and EmitVerilog would have a pretty clear path to emitting good System Verilog.

I’m pretty interested in the ESI types. For me, I’m thinking about lowering Handshake, but I think having such types would be useful for conversion from FIRRTL and potentially LLHD as well. I’m curious what others think, but I think the RTL dialect could be a good place to put that type system.

Right now it looks like RTL ops only deal with AnySignlessInteger from standard MLIR, so having a type system for RTL seems like a thing we should be considering at the very least.

If there’s support for this, I’ll re-target some of the trivial ESI types into the RTL dialect and PR that. (Which I’d prefer so that I don’t have to refactor a bunch of code later on.)

That sounds good to me.

@fabianschuiki Your LLHD paper implies that LLHD supports the struct data type. You just wrap a tuple in your SigType, yes?

How’s this look?

That’s looking good to me. The tablegen facilities for types are super helpful.

@jdd Yes exactly. The structs are just a tuple in LLHD itself, inspired by LLVM. I would expect things like field names, which should be visible in the simulation trace, to be more of a debug info added on top of it.

No support for unions yet, though. Quite sure that’s an oversight because I haven’t gotten around to implementing them in the Moore compiler, which has been the main driver of IR additions :innocent:

:+1:

That looks good! The prospect of having a solid set of types for the RTL dialect as the “core” is nice.

I’m thinking about a common RTL type for arrays. My thought is to have a tensor of sorts but in contrast to the std tensor, all the dimension sizes would be statically known at compile time. The asm syntax would be !rtl.tensor<3 x 4 x 5 x i4> to represent a 3 dimensional array of i4 with sizes 3, 4, and 5 respectively. I’m also thinking that this would eventually be lowered to a SV packed array, though I don’t think we would define this semantic in the data type itself. An array would just be a special case of this tensor type. Sound good?

@fabianschuiki Is this compatible with the LLHD notion of an array? Would you switch to this if it were available in the RTL dialect? (I want to push as many types as possible into a common type system to make inter-dialect communication easier.)

FIRRTL guys: same question to y’all. It looks like the FIRRTL dialect has the additional restriction that it can only contain other FIRRTL types.

@jdd Yeah fully agreed! It would be great to boil things down to a core set of common types. I guess most dialects will be able to agree on the basics: array, struct, union, int.

There was some discussion of packing in this issue Add support for lowering vectors of wires from FIRRTL to RTL · Issue #18 · llvm/circt · GitHub (in the context of lowering from FIRRTL to RTL). I think the proposed !rtl.tensor type, or something like it, would fit nicely into the RTL dialect, and we could specialize the semantics w.r.t. packing, etc. as we lower to SV, as you said.

I haven’t thought enough about this, but I think we should be very careful about adding things opportunistically to the rtl type system that isn’t proven in existing systems. For example, I don’t see how a tensor type is more useful than a traditional multidimensional array.

That said, I do think we should allow arbitrary (cross dialect) types as ports in rtl.module. This allows us to define dialect specific type systems and dialect specific operations that intermix nicely.

You’re right. This is more of a multidimensional array. I was just looking for a shorter word (for a mnemonic) and using the term loosely.

100%! I’m further thinking that a standardized set of types would allow much easier dialect inter-op and think that the RTL dialect is the place to put them. I originally had the basic types (array, struct, union, enum) in the ESI dialect but since they’ll be used for more than just communication, removed them and am planning on using the ones in RTL instead.

What do you mean by ‘proving’ and ‘existing systems’? If you’re talking about existing hardware systems (in SystemVerilog), I think that arrays, structs, unions, and enums are pretty well understood. If you mean fleshed out in other dialects first then moved into the RTL dialect, that’s something we can do. It’d be pretty easy to move them in into RTL once we decide they’re stable enough to ‘standardize’.

Cool, I added this to the agenda to talk about in tomorrow’s meeting. It seems like a pretty good and fundamental discussion point.

Right, I’m less worried about union, arrays, structs, enums than things like tensors :-). That said, each of these are surprisingly complicated to define, so doing this carefully over time makes sense. By allowing other dialects to define types, we also have a pressure-relief valve that doesn’t block experimentation.

-Chris

I’m going to suggest starting with Array since I think its the simplest. IMO it should be multi-dimensional, have static dimension size, be typed, and represent a packed array in SystemVerilog, though maybe we should leave that up to the SV lowering and/or emitVerilog. We can evolve it so support dynamic sizes (for simulation). The asm syntax I’d like: !rtl.array< 4 x 10 x i4 > for a 4 by 10 array of i4.

Comments?

1 Like

I would make this be a single dimension at a time (so it composes with other types naturally, including subarrays). How does this handle the weird verilog packed vs unpacked dimensions? Are these two different array types that compose?

Fair point. One reason to to multidimensional (not a good one – I’m just playing devil’s advocate) is that it can better represent the notion of packed arrays. So if you had a multidimensional array as the inner type, that could be inferred to be the packed dimensions then you could specify an array of that array to specify the unpacked array. I don’t see that composing well, however; I’ll agree with you on the single dimension.

I would say it specifically doesn’t. Unpacked vs packed arrays are (IMO) logically the same (or least largely similar), though they have different trade-offs in the SystemVerilog language. The packed dimensions are generally considered to be part of the inner type, so they can be assigned in one statement. One cannot assign an entire unpacked array in one statement. Unpacked arrays are (IIRC) used for memory inference but I’ve found that capability to be very synthesis tool specific and often doesn’t do what you want. (So we always use the parameterized ‘primitive’ module to manually instantiate memories.)

Anyway, it’s more of a language thing (IMO) so I don’t think it has any place in the RTL dialect, which I see as more of a logical-only dialect. I don’t think it’ll matter much for languages which don’t interact with hand-written SV.

Right, that’s what I’m getting at. For simplicity, think about this as an s-expression, where an array of 42 i32 elements is (array 42, i32). A single-dimension representation allows you to have multidim arrays by composing them like: (array 10, (array 50, i17)).

What I’m suggesting is that unpacked dimensions would be a different kind of type, e.g.: (unpacked_array 10, (array 50, i17)). This allows us to reason about these as “similar but different” things at the right places, and it would compose correctly.

Also, if you’re interested in arrays, it might be interesting to check out the surprisingly insane C semantics for arrays, which you can see in the Clang ArrayType.

Actually, I may be wrong about this. Packed vs unpacked may affect layout when casting from existing data (e.g. external memory, network, PCIe, etc.) The layout of unpacked arrays is not defined in the spec AFAICT.

Understood.

I agree but I don’t see a need to represent an unpacked_array in the RTL dialect. If I’m wrong we can always add it later!

One SV limitation is that a packed array cannot contain an unpacked array (or anything not packed). A potential upside of not differentiating between them in the RTL dialect is the the lowering to SV (however we end up doing it) can decide where to specify packed vs not based on where it has to go. I’m not 100% sure this is a good idea since we may not want different behavior depending on context, though it may not matter. We’ll have to cross that bridge when we come to it. (No use arguing before we have a more solid understanding of the problem.)