Proposal for Per-instance Metadata

Hardware has a fundamental complexity: there exists per-instance metadata (e.g., floorplanning information) that is not captured well by Verilog or in existing CIRCT dialects.

Fundamentally, each instance of a module may have metadata associated with it. This metadata may be shared or not shared with other instances of the same module.

A design goal is to represent all metadata in the same MLIR representation (though possibly using multiple IRs). Another design goal is not to duplicate the IR based on the application of metadata.

The purpose of this post is to concretely continue the discussion started in: Continuing the conversation on per-module versus per-instance…. An immediate need is to provide a solution for this in the FIRRTL dialect. I expect that we need a general solution that works for all dialects.

Motivating Examples

Consider the following Verilog:

module Bar();
endmodule
module Foo();
  Bar c();
  Bar d();
endmodule
module Top();
  Foo a();
  Foo b();
endmodule

Listing 1: A Verilog description of a module with two levels of instance hierarchy.

In this circuit, there is one instance of Top, two instances of Foo, and four instances of Bar. These instances are enumerated here:

  • Instances of Top: Top
  • Instances of Foo: Top.a_Foo, Top.b_Foo
  • Instances of Bar: Top.a_Foo.c_Bar, Top.a_Foo.d_Bar, Top.b_Foo.c_Bar, Top.b_Foo.d_Bar
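
The enumeration above can be reproduced mechanically. The following is a minimal Python sketch, not part of any CIRCT API: `INSTANCE_GRAPH` and `enumerate_instances` are hypothetical names, and the `name_Module` path spelling simply mirrors the bullets above.

```python
# Folded instance graph for Listing 1 (hypothetical encoding):
# module name -> list of (instance name, instantiated module).
INSTANCE_GRAPH = {
    "Top": [("a", "Foo"), ("b", "Foo")],
    "Foo": [("c", "Bar"), ("d", "Bar")],
    "Bar": [],
}


def enumerate_instances(graph, top):
    """Return {module: [instance paths]} by walking the folded graph from `top`."""
    instances = {module: [] for module in graph}

    def walk(module, path):
        instances[module].append(path)
        for inst_name, child in graph[module]:
            walk(child, f"{path}.{inst_name}_{child}")

    walk(top, top)
    return instances


for module, paths in enumerate_instances(INSTANCE_GRAPH, "Top").items():
    print(module, paths)
```

The folded graph has three nodes, while the walk visits seven instances. That gap is what per-instance metadata has to address.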

Metadata can then be applied to any instance in the design. (Formally, metadata can be associated with any part of any partition of the set of instances.)

Example 1: ResetVectors

Consider the situation where the circuit is a 4-core design. Each Foo is a cluster of cores and each Bar is a core. All cores need a reset vector, but one core gets a special reset vector that is the actual program (that core boots and brings the others up via interrupt). The rest get a different reset vector that drops them into a wait loop. Here, there is special metadata for, say, Top.b_Foo.d_Bar (the main core) and all other instances of Bar (the waiting cores) get the same metadata.

Example 2: Per-cluster Layout

Consider the same 4-core design. Now you want to do layout, but you only want to lay out each cluster. Here, there is different metadata associated with Top.a_Foo and Top.b_Foo. However, this is hidden from any of the Bar instances.

Example 3: Full Layout

Consider the same 4-core design. Differing from the previous example, now everything is explicitly laid out and every cluster and core gets explicit layout info.

The aim of this proposal is to keep the IR fully de-duplicated as in Listing 1, while not sacrificing metadata expressiveness.

Alternative Description: Folded and Unfolded Instance Graphs

If the motivating examples make sense, you can skip to the next section. This subsection provides a graphical view of the problem which I find useful.

You can think of Listing 1 as a directed, labeled graph. Nodes are modules, edges are instantiations, edge labels are instance names, and edge direction indicates instantiator–instantiatee relationship. (The direction convention doesn’t matter—either works.)

Figure 1: An “instance graph” representation of Listing 1. Top instantiates two copies of Foo named a and b. Foo instantiates two copies of Bar named c and d.

However, this is not what the actual design looks like on a chip. The instance graph is “unfolded” such that each node is instantiated once. This can be represented as:

Figure 2: An unfolded instance graph where every instance is a module. This is an implementation of Example 3.

Figure 2 still doesn’t look like a chip layout, though. You can reformulate this into Figure 3, below:

Figure 3: An equivalent representation of Figure 2 that looks more like a chip layout

You can then think about Figures 1 and 2 as different “extremes” of “foldedness” of Listing 1. Figure 1 is fully deduplicated. Figure 2 is fully duplicated. You can then transform back and forth as needed. However, there exist intermediate, partial foldings where specific sets of instances are uniquely instantiated.

Figure 4, below, shows one such partial unfolding. One instance of Bar is split out. Note that this requires splitting out one instance of Foo as well. This is an implementation of Example 1.

Figure 4: A partially unfolded version of Figure 1. Here, just Top.b_Foo.d_Bar is split out from the rest of the design.

An alternative partial unfolding involves separating each instance of Foo. This is shown in Figure 5, below. This is an implementation of Example 2.

Figure 5: Another partially unfolded version of Figure 1. Here, each instance of Foo is split out.
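
To make the partial unfoldings concrete, here is a small Python sketch (my own illustration, with the hypothetical names `GRAPH`, `module_at`, and `required_splits`) that computes which instances must be split out so that a given set of target instances becomes unique. It assumes instance paths are written as tuples of instance names below Top.

```python
# Folded instance graph for Listing 1 (hypothetical encoding):
# module name -> {instance name: instantiated module}.
GRAPH = {
    "Top": {"a": "Foo", "b": "Foo"},
    "Foo": {"c": "Bar", "d": "Bar"},
    "Bar": {},
}


def module_at(graph, top, path):
    """Return the module reached by following `path` (instance names) from `top`."""
    module = top
    for inst_name in path:
        module = graph[module][inst_name]
    return module


def required_splits(graph, top, targets):
    """Map every instance path that needs its own module copy to its module.

    Making a target unique forces every instance on the way to it to be
    unique as well, so each non-empty prefix of a target path is included.
    """
    splits = {}
    for target in targets:
        for end in range(1, len(target) + 1):
            prefix = target[:end]
            splits[prefix] = module_at(graph, top, prefix)
    return splits


# Example 1 / Figure 4: only Top.b_Foo.d_Bar carries special metadata.
print(required_splits(GRAPH, "Top", [("b", "d")]))
# Example 2 / Figure 5: each Foo instance carries its own metadata.
print(required_splits(GRAPH, "Top", [("a",), ("b",)]))
```

In the first case both a copy of Foo (for b) and a copy of Bar (for b.d) are needed, which matches the note under Figure 4 that splitting out one Bar also splits out one Foo.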

Alright, so how do we manage this complexity? Read on…

Proposal: Per-instance Parameters

MLIR has robust support for attributes. Any operation can have arbitrary information packed into an attribute. The FIRRTL Dialect already uses this to convert Chisel-produced metadata stored in JSON to MLIR attributes. (In FIRRTL lingo, this metadata is called Annotations.)

E.g., consider the following FIRRTL circuit modeled after Listing 1. The top module, Top, instantiates two copies of module Foo, a, and b. Module Foo has a wire called x:

circuit Top:
  module Foo:
    wire x: UInt<1>

  module Top:
    inst a of Foo
    inst b of Foo

Listing 2

The following annotation file uses Target syntax (for more info see: llvm/circt/blob/main/docs/) to attach the metadata "hello" to wire x:

[{
    "hello": null,
    "target": "~Top|Foo>x"
}]

Listing 3

This is all sucked into CIRCT with firtool Foo.fir --annotation-file Foo.anno.json to produce:

firrtl.circuit "Top"   {
  firrtl.module @Foo() {
    %x = firrtl.wire  {annotations = [{hello}]} : !firrtl.uint<1>
  }
  firrtl.module @Top() {
    firrtl.instance @Foo  {name = "a"}
    firrtl.instance @Foo  {name = "b"}
  }
}

Listing 4

The metadata "hello" gets attached to wire x. The approach of attaching metadata via attributes works great and is reusable by any CIRCT Dialect. However, this approach is insufficient to deal with attaching metadata to one wire x along a specific instance path.

So, what to do?

The proposed approach is to expose hierarchy information to instances.

Once hierarchy information is available, we can then use a new conditional attribute that is only valid if the hierarchy information matches. This requires three new features for a hardware dialect that wants to support per-instance metadata:

  1. Modules (e.g., firrtl.module, hw.module) can take arbitrary non-port parameters.
  2. Instances are provided arguments initially derived from their hierarchy.
  3. Attributes are conditionally applied based on the hierarchy argument.

Building on the example above, what happens if we want to annotate just Top.a_Foo.x? Well, if we pass hierarchy information into each instance, then the metadata can be conditioned on the hierarchy.

firrtl.circuit "Top"   {
  firrtl.module @Foo(path : StringAttr) {
    // This annotation will only apply if path == "a".
    %x = firrtl.wire  {annotations = [{condition = "a", hello}]} : !firrtl.uint<1>
  }
  firrtl.module @Top() {
    firrtl.instance @Foo("a") {name = "a"} : (StringAttr) -> ()
    firrtl.instance @Foo("b") {name = "b"} : (StringAttr) -> ()
  }
}

Listing 5

Here, the condition is "a", meaning that the annotation is only valid if path == "a". The instance information is then passed into both instances of module @Foo.
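
To show how a conditional attribute would be resolved, here is a minimal Python model of Listing 5. It is a sketch under my own assumptions, not FIRRTL dialect code: `Wire`, `Instance`, `Module`, `resolve_annotations`, and `listing5` are hypothetical names, and each annotation is modelled as a (condition, payload) pair that applies only when the condition equals the argument bound to the enclosing module's path parameter.

```python
from dataclasses import dataclass, field


@dataclass
class Wire:
    name: str
    # Each annotation is a (condition, payload) pair; a condition of None always applies.
    annotations: list = field(default_factory=list)


@dataclass
class Instance:
    name: str
    module: str
    arg: object  # value bound to the instantiated module's `path` parameter


@dataclass
class Module:
    name: str
    wires: list = field(default_factory=list)
    instances: list = field(default_factory=list)


def resolve_annotations(modules, top):
    """Return (hierarchical wire name, payload) for each annotation whose condition holds."""
    resolved = []

    def visit(module, hier_name, path_arg):
        for wire in module.wires:
            for condition, payload in wire.annotations:
                if condition is None or condition == path_arg:
                    resolved.append((f"{hier_name}.{wire.name}", payload))
        for inst in module.instances:
            visit(modules[inst.module], f"{hier_name}.{inst.name}", inst.arg)

    visit(modules[top], top, None)
    return resolved


def listing5(arg_a, arg_b):
    """Build the circuit of Listing 5 with caller-chosen argument values."""
    foo = Module("Foo", wires=[Wire("x", [(arg_a, "hello")])])
    top = Module("Top", instances=[Instance("a", "Foo", arg_a), Instance("b", "Foo", arg_b)])
    return {"Foo": foo, "Top": top}


print(resolve_annotations(listing5("a", "b"), "Top"))
print(resolve_annotations(listing5(17, 42), "Top"))
```

Both calls resolve "hello" onto Top.a.x only. The second call uses arbitrary integers instead of instance names, so the argument values only need to be distinct.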

It is critical to note that the value of the parameter doesn’t actually matter here. All that matters is each instance needs to be able to be uniquely identified. (E.g., "a" and "b" are useful to use, but you can use anything here like numbers, ids, JPEG images, etc.) It follows that the arguments do not need to be updated when the hierarchy changes. However, adding a new instance that takes parameters does require updating the hierarchy to avoid the possibility of a collision.

Two alternatives for parameters are:

  1. Use a globally unique ID for every instance in the design.
  2. Use a globally unique ID for every instance of each module in the design.

Preservation Across Hierarchy Manipulations

Here, we add one extra level of hierarchy to make the circuit look just like Listing 1. Each instance of Foo instantiates two copies of Bar. Only one x in Top.b_Foo.d_Bar has metadata. What changes here is that the arguments passed to instances of Bar need to be given more information about where in the hierarchy they are. This can be achieved with simple concatenation of path arguments:

firrtl.circuit "Top" {
  firrtl.module @Bar(path: StringAttr) {
    %x = firrtl.wire attributes {annotations=[{condition = "b.d", "hello"}]} : !firrtl.uint<1>
  }
  firrtl.module @Foo(path: StringAttr) {
    %c = firrtl.instance @Bar(path + ".c"): (StringAttr) -> ()
    %d = firrtl.instance @Bar(path + ".d"): (StringAttr) -> ()
  }
  firrtl.module @Top() {
    %a = firrtl.instance @Foo("a"): (StringAttr) -> ()
    %b = firrtl.instance @Foo("b"): (StringAttr) -> ()
  }
}

Listing 6

Now consider inlining all instances of module Foo. There is no non-local effect. Arguments are just substituted:

firrtl.circuit "Top" {
  firrtl.module @Bar(path: StringAttr) {
    %x = firrtl.wire attributes {annotations=[{condition = "b.d", "hello"}]} : !firrtl.uint<1>
  }
  firrtl.module @Top() {
    // Original code: %a = firrtl.instance @Foo("a"): (StringAttr) -> ()
    %a_c = firrtl.instance @Bar("a" + ".c"): (StringAttr) -> ()
    %a_d = firrtl.instance @Bar("a" + ".d"): (StringAttr) -> ()
    // Original code: %b = firrtl.instance @Foo("b"): (StringAttr) -> ()
    %b_c = firrtl.instance @Bar("b" + ".c"): (StringAttr) -> ()
    %b_d = firrtl.instance @Bar("b" + ".d"): (StringAttr) -> ()
  }
}

Listing 7
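
The substitution from Listing 6 to Listing 7 can be checked with a small Python sketch. Everything here is a hypothetical model rather than a CIRCT pass: an argument is either `("lit", s)` for a literal or `("cat", s)` for `path + s` in the enclosing module, and unlike Listing 7 the sketch constant-folds `"a" + ".c"` into `"a.c"`.

```python
# Listing 6 as data: module -> list of (child module, argument expression).
LISTING6 = {
    "Top": [("Foo", ("lit", "a")), ("Foo", ("lit", "b"))],
    "Foo": [("Bar", ("cat", ".c")), ("Bar", ("cat", ".d"))],
    "Bar": [],
}


def evaluate(arg, path):
    """Evaluate an argument expression given the enclosing module's `path` value."""
    kind, text = arg
    return text if kind == "lit" else path + text


def paths_reaching(modules, top, target):
    """Collect every value bound to `target`'s path parameter in the elaborated design."""
    found = []

    def visit(module, path):
        for child, arg in modules[module]:
            value = evaluate(arg, path)
            if child == target:
                found.append(value)
            visit(child, value)

    visit(top, None)
    return sorted(found)


def substitute(inner, outer):
    """Rewrite `inner` (written against the victim's path) using the bound `outer` value."""
    inner_kind, inner_text = inner
    if inner_kind == "lit":
        return inner
    outer_kind, outer_text = outer
    return (outer_kind, outer_text + inner_text)


def inline(modules, victim):
    """Inline every instance of `victim`, touching only the modules that instantiate it."""
    result = {}
    for name, instances in modules.items():
        if name == victim:
            continue
        rewritten = []
        for child, arg in instances:
            if child != victim:
                rewritten.append((child, arg))
                continue
            for grandchild, inner in modules[victim]:
                rewritten.append((grandchild, substitute(inner, arg)))
        result[name] = rewritten
    return result


inlined = inline(LISTING6, "Foo")
print(paths_reaching(LISTING6, "Top", "Bar"))
print(paths_reaching(inlined, "Top", "Bar"))
```

Both lines print the same four values, so the condition "b.d" on the wire in Bar still selects exactly one instance after inlining, and Bar itself was never edited.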

The absolutely critical thing here is that there is zero hierarchy tracking that needs to happen. All modules can continue to be worked on individually. Note that if a new instance is added, the paths need to be canonicalized to avoid the possibility of a collision. Consider what happens if we add a module Baz that instantiates Bar:

firrtl.circuit "Top" {
  firrtl.module @Bar(path: StringAttr) {
    %ref = firrtl.wire attributes {annotations=[{condition = "b.d", "hello"}]} : !firrtl.uint<1>
  }
  firrtl.module @Baz(path: StringAttr) {
    %d = firrtl.instance @Bar(path + ".d") : (StringAttr) -> ()
  }
  firrtl.module @Top() {
    %a_c = firrtl.instance @Bar("a" + ".c") : (StringAttr) -> ()
    %a_d = firrtl.instance @Bar("a" + ".d") : (StringAttr) -> ()
    %b_c = firrtl.instance @Bar("b" + ".c") : (StringAttr) -> ()
    %b_d = firrtl.instance @Bar("b" + ".d") : (StringAttr) -> ()
    %b = firrtl.instance @Baz("b") : (StringAttr) -> ()
  }
}

Listing 8
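
The collision in Listing 8 can be detected mechanically. Below is a minimal Python sketch under my own modelling assumptions (the names `LISTING8`, `paths_reaching`, and `collisions` are hypothetical, and `"a" + ".c"` is written pre-folded as `"a.c"`): it evaluates the argument reaching every instance of Bar and reports values that occur more than once.

```python
from collections import Counter

# Listing 8 as data: module -> list of (child module, argument expression).
# ("lit", s) is the literal s; ("cat", s) means `path + s` in the enclosing module.
LISTING8 = {
    "Top": [
        ("Bar", ("lit", "a.c")),
        ("Bar", ("lit", "a.d")),
        ("Bar", ("lit", "b.c")),
        ("Bar", ("lit", "b.d")),
        ("Baz", ("lit", "b")),
    ],
    "Baz": [("Bar", ("cat", ".d"))],
    "Bar": [],
}


def paths_reaching(modules, top, target):
    """Collect every value bound to `target`'s path parameter in the elaborated design."""
    found = []

    def visit(module, path):
        for child, (kind, text) in modules[module]:
            value = text if kind == "lit" else path + text
            if child == target:
                found.append(value)
            visit(child, value)

    visit(top, None)
    return found


def collisions(modules, top, target):
    """Return the argument values that reach more than one instance of `target`."""
    counts = Counter(paths_reaching(modules, top, target))
    return sorted(value for value, count in counts.items() if count > 1)


print(collisions(LISTING8, "Top", "Bar"))
```

Two distinct instances of Bar receive "b.d", so the annotation conditioned on "b.d" would apply to both. Passing any unused value to Baz instead of "b" removes the collision.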

What’s going on here is that the paths are out of date with respect to the hierarchy so the parameter "b" shouldn’t be passed to the instance of Baz. This can be fixed by first canonicalizing paths to match the hierarchy:

firrtl.circuit "Top" {
  firrtl.module @Bar(path: StringAttr) {
    %ref = firrtl.wire attributes {annotations=[{condition = "b_d", "hello"}]} : !firrtl.uint<1>
  }
  firrtl.module @Top() {
    %a_c = firrtl.instance @Bar("a_c"): (StringAttr) -> ()
    %a_d = firrtl.instance @Bar("a_d"): (StringAttr) -> ()
    %b_c = firrtl.instance @Bar("b_c"): (StringAttr) -> ()
    %b_d = firrtl.instance @Bar("b_d"): (StringAttr) -> ()
  }
}

Listing 9

After this, there is no problem adding module Baz to the circuit.

Also note that grouping (the inverse of inlining) is trivial as long as you follow argument substitution. Below, a new module, Qux, is created. No hierarchy information needs to be updated:

firrtl.circuit "Top" {
  firrtl.module @Bar(path: StringAttr) {
    %x = firrtl.wire attributes {annotations=[{condition = "b.d", "hello"}]} : !firrtl.uint<1>
  }
  firrtl.module @Qux() {
    %a_c = firrtl.instance @Bar("a" + ".c"): (StringAttr) -> ()
    %b_c = firrtl.instance @Bar("b" + ".c"): (StringAttr) -> ()
  }
  firrtl.module @Top() {
    %a_d = firrtl.instance @Bar("a" + ".d"): (StringAttr) -> ()
    %b_d = firrtl.instance @Bar("b" + ".d"): (StringAttr) -> ()
    %baz = firrtl.instance @Qux() : () -> ()
  }
}

Listing 10

Alternative Proposals

There are two main alternative proposals to the above. The first involves use of cross module references to attach metadata information. The second involves maintaining a separate instance hierarchy just for the purpose of tracking metadata.

Cross Module References

A cross module reference (looking downwards) can encode the exact same instance-specific metadata. However, this requires some mechanism to “dot” into each instance. The expected MLIR way of doing this is with nested symbol tables. This is expected to look something like the following:

firrtl.circuit "Top" {
  firrtl.module @Bar() {
    %x = firrtl.wire : !firrtl.uint<1>
  }
  firrtl.module @Foo() {
    %c = firrtl.instance @Bar @c(): () -> ()
    %d = firrtl.instance @Bar @d(): () -> ()
  }
  firrtl.module @Top() {
    firrtl.instance @Foo @a(): () -> ()
    firrtl.instance @Foo @b(): () -> ()
    annotate @b::@d::@x "hello"
  }
}

Listing 11

This defines a new metadata attachment operation called annotate that includes a symbol path and some metadata. This operation can be placed at the lowest common ancestor, or higher if necessary, to uniquely identify the annotated instance.

This gets tricky for a number of reasons (that may just require more thought).

First, the symbol path may necessitate modules maintaining per-instantiation symbol tables. Something like @b::@d::@x is a cross module reference that is operating on some hypothetical instance symbol tables @b and @d. These overlap with @Foo and @Bar. However, they are different. It is possible that the same path could be represented as @Foo::@Bar::@x. This breaks how nested symbol tables work as @Bar is not inside @Foo. A further alternative could be to use a list of symbols, i.e., [@Foo, @Bar, @x].

Second, module hierarchy manipulations are no longer local. Consider the case of wanting to inline @Bar into @Foo. This action requires checking for any cross module references in the design and updating them. I.e., @b::@d::@x needs to change to @b::@x2. This is contrasted with the parameter approach where no global update is necessary.
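
The non-local update can be sketched in Python. This is only an illustration under my own assumptions: symbol paths are tuples of names, `GRAPH` and `rewrite_xmrs_for_inlining` are hypothetical, and the merged name is formed as `d_x` where the text above uses `x2`, since the actual name would be whatever the inliner picks.

```python
# Folded instance graph for Listing 11: module -> {instance symbol: module}.
GRAPH = {
    "Top": {"a": "Foo", "b": "Foo"},
    "Foo": {"c": "Bar", "d": "Bar"},
    "Bar": {},
}


def rewrite_xmrs_for_inlining(graph, top, xmrs, victim):
    """Return every cross module reference rewritten as if `victim` were inlined.

    Each path is a tuple of symbols rooted at `top`. A hop through an instance
    of `victim` disappears and its name is merged into the next symbol.
    """
    rewritten = []
    for path in xmrs:
        module = top
        new_path = []
        prefix = ""  # names of inlined instances waiting to be merged
        for symbol in path:
            child = graph[module].get(symbol)
            if child == victim:
                prefix += symbol + "_"
                module = child
                continue
            new_path.append(prefix + symbol)
            prefix = ""
            if child is not None:
                module = child
        rewritten.append(tuple(new_path))
    return rewritten


all_xmrs = [("b", "d", "x"), ("a", "c", "x"), ("a",)]
print(rewrite_xmrs_for_inlining(GRAPH, "Top", all_xmrs, "Bar"))
```

The loop has to visit every reference in the design, including ones like `("a",)` that turn out to be unaffected. That global scan is the cost being contrasted with the parameter approach.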

That said, we are expected to eventually need a cross module reference primitive as an operation that can represent SystemVerilog hierarchical paths (see: Cross Module Reference (XMR) Primitive · Issue #933 · llvm/circt · GitHub). We may need to solve this problem eventually.

Separate Data Structure

As another alternative, a completely separate datastructure or instance hierarchy can be maintained. Transforms then need to update this when they make modifications to the circuit. Similarly, they can query the datastructure to find out if there is any metadata associated with specific things (or extract all metadata of a specific type and then look at where it is applied).

This is essentially the approach that the Scala FIRRTL Compiler takes. The implementation of the methods necessary to keep the datastructure up to date is complicated (see: RenameMap.recursiveGet). Additionally, requiring users to provide a “rename map” describing how the datastructure needs to be updated is a non-trivial burden for a transform writer. It seems easier to require users to write transforms that handle operations which may have attributes.

Notes on the Scala FIRRTL Compiler Implementation

The Scala FIRRTL Compiler (SFC) does not support attributes on its IR. All metadata is stored in a separate datastructure that is a sequence of Annotations. Each annotation uses a Target to attach metadata to zero or more things in the circuit. Annotations are local if they are associated with a module (they have no instance path). Annotations are non-local if they include an instance path. This is specific enough to refer to any instance or part of a partition of instances. SFC infrastructure then has robust support for a 3-step pattern of transform writing:

  1. Unfold the circuit such that any non-local annotations you care about are now local
  2. Run your transform
  3. Fold the circuit back via structural deduplication

There is terse information about Annotations and Targets in llvm/circt/blob/main/docs/ and voluminous extra information about this problem and SFC’s solution to it in Chapter 4 of Adam Izraelevitz’s thesis.


Great write up!

Do you mean “de-duplicated”?

Isn’t the real problem that the “.d” identifier is used by multiple instances? And couldn’t it therefore be solved by requiring all instances to have globally unique identifiers instead of a canonicalization?

Couldn’t that problem be solved via the “per-Instance Parameters” approach? I haven’t thought too deeply about this, but something like each source (the XMR which is referring to a specific instance) would return a value which could be plumbed through the hierarchy as results (going up) and operands (going down) into the target module and have the target have a conditional on that value?

I like your “per-Instance Parameter” approach. At first blush, it feels like an abuse of the Value system. But upon some meditation, it actually matches my mental model of “running” the design (which I think of as a program) to elaborate the design. In my mental model, instances roughly correspond to function calls and the behavior of function calls depends on their parameters. The reason this feels like an abuse is that (again in my mental model) we actually have two different types of Values: those that actually get synthesized into the elaborated design (to customize the behavior at the user runtime) and values which get used at elaboration time (design program execution) to customize the hardware. We only support the former of those two currently. We could use the type system to differentiate the two different classes of Values.

On a not-completely-unrelated note: in my mental model, when we output to verilog we are actually doing a “transpile”: outputting another design “program” to be executed (elaborated) by another compiler. This is roughly analogous to “compiling” to C and handing that off to a C compiler.

For my placement problem, the output is Tcl which refers to specific instances in the fully elaborated design. In my mental model, my “export” would actually be compiling the design (elaborating it) rather than “transpiling” it, and could actually interpret (run) the program, filling in the “parameter” values during the compilation (elaboration).

All around, this idea is very much in line with my mental model. I think something like this would/could also apply to the more general IR parameterization problem as well.

Yes! Fixed above.

The issue is that the "b.d" identifier is aliasing to multiple instances. I think this is saying that you can use hierarchy information to construct globally unique identifiers if arguments are up-to-date (canonicalized may not be the right word). Otherwise, you need to fall back to some authority to assign them that has global scope.

I’m specifically highlighting this example because when I originally thought of this (after much prompting from Lattner that this problem smelled like parameterization) it initially seemed obvious to me that arguments had to always be kept up to date. (Others had the same reaction.) However, the parameters, derived from the hierarchy, are really just, I think, a way to locally get a globally unique ID. There’s no actual value in matching them to instance names after construction (or an alternative construction that provides the same guarantees may work).

This is interesting. I hadn’t thought about it this way…

I usually think about XMRs as possibly resolving to bored-through connections, not as something like the inverse. However, treating these as parameters could work. :thinking:

My worry here is the trash fire that is Verilog hierarchical references that look upwards to find a matching instance. Downwards facing XMRs (from some root downwards) seem like they could work with the per-instance idea.

It weirdly does seem to match the mental model. It also seems to indicate that the Annotation system in the Scala FIRRTL Compiler may just be a mechanism to encode parameterization in an unparametric IR.

Ignoring all that, I’d like to flesh out this idea with a prototype in the FIRRTL Dialect and see where this goes.

I know nothing about SV XMRs (I immediately lost interest after learning that they weren’t synthesizable years ago), but isn’t this a problem regardless of the IR representation?

I’ve had to actively stop myself multiple times today from writing up a prototype and trying to use it for my placement problem. It’s a good problem to use to procrastinate (but I can’t afford to procrastinate right now). It seems like a PoC would be pretty simple to hack together.

I still have the issue of associating an arbitrary attribute on an operation to that parameter value. (Your “conditional” field needs to know what Value to reference.) The ops to which I’m attaching the location attributes shouldn’t have to know about a parameter value. One interesting solution I’ve been noodling is having that parameter value feed a compile-time control op (e.g. an if or switch) – go full meta-programming. For my application, that could be wasteful simply due to IR bloat: one op per instance of the op I want to place gives me heartburn.

A few comments:

  1. I think your graphs in the beginning (primarily Figure 1) are fundamentally missing the separation between the op that defines a module and the ops that instantiate a module. To me, the ‘fundamental problem’ seems to be that until the graph is fully elaborated (as in Figure 2), there is no structure that is 1:1 with the structure of the final elaborated design (containing N instances), and hence nowhere to put the attributes for those instances.

  2. In Listing 6, it seems like the ‘annotations’ list of %x will eventually contain O(N) elements. In the case where one has a large design, the linear lookup may be expensive. The ‘solution’ seems obvious: the tree-structure of the design would be replicated in the annotation. But this means that the tree structure would then be duplicated in every annotation that requires it. This seems bad (and also degenerates into the “separate data structure” approach). This would also seem to be a problem with the ‘annotate’ approach.

  3. Your argument seems to fundamentally be that “inlining and grouping are common operations, hence we should work hard to make them easy to write and efficient.” It’s probably worth some investigation whether this is true or not. In my experience, these are operations much rarer than operations within a level of hierarchy (most optimizations) so I question whether they need to be efficient. They also seem to be so canonical that we can write them once and put them in the core system, so I don’t necessarily think that they need to be convenient to write. I think we should definitely consider: 1) simplicity of representation if this is something that’s going to be used everywhere, 2) compactness of representation, since this will likely need to be applied to very large designs, 3) cost to insert/delete/lookup annotations.

:thinking: I’m a bit confused by this. I’d clarify that the op that defines the module is a node and the op that instantiates is a labeled edge. I’m obviously missing something, though. Sorry.

I do admit that there is a structural difference between Figures 1 and 2 (and 4 and 5). What I’m searching for is what you state, “[a place] to put attributes for those instances.” Naively attaching them to modules won’t work. I think this demonstration hints that parameterization gets you there. It’s a question of, as you bring up next, efficiency, performance, scalability, etc. (In essence, is this a good IR architecture or not.)

This is a cool observation. I’d expect sparsity of per-instance annotations, but this performance angle matters if they are dense. The hierarchical path stuff is really just a mechanism to generate unique IDs which has some interesting properties involving grouping/inlining. Using a map for better sparse performance and actual IDs from a tree traversal might make sense:

firrtl.circuit "Top" {
  firrtl.module @Bar(id: Int) {
    %x = firrtl.wire attributes {annotations={6 =[{"hello"}]}} : !firrtl.uint<1>
  }
  firrtl.module @Foo(id: Int) {
    %c = firrtl.instance @Bar(id+1): (Int) -> ()
    %d = firrtl.instance @Bar(id+2): (Int) -> ()
  }
  firrtl.module @Top(id: Int) {
    %a = firrtl.instance @Foo(id+1): (Int) -> ()
    %b = firrtl.instance @Foo(id+4): (Int) -> ()
  }
}

Listing 6.1
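
The offsets in Listing 6.1 (+1, +4 in Top and +1, +2 in Foo) are what a pre-order traversal produces if each child's offset skips over the subtrees of its earlier siblings. A minimal Python sketch of that numbering follows. The names `GRAPH`, `subtree_size`, `child_offsets`, and `assign_ids` are hypothetical, and the root is assumed to receive id 0.

```python
# Folded instance graph for Listing 1: module -> list of (instance name, module).
GRAPH = {
    "Top": [("a", "Foo"), ("b", "Foo")],
    "Foo": [("c", "Bar"), ("d", "Bar")],
    "Bar": [],
}


def subtree_size(graph, module):
    """Number of instances in the unfolded subtree rooted at one instance of `module`."""
    return 1 + sum(subtree_size(graph, child) for _, child in graph[module])


def child_offsets(graph, module):
    """Per-module constants: the value added to `id` for each instance in `module`."""
    offsets = {}
    next_offset = 1
    for inst_name, child in graph[module]:
        offsets[inst_name] = next_offset
        next_offset += subtree_size(graph, child)
    return offsets


def assign_ids(graph, top, root_id=0):
    """Evaluate the `id + offset` arguments for every instance path below `top`."""
    ids = {}

    def visit(module, path, ident):
        ids[path] = ident
        offsets = child_offsets(graph, module)
        for inst_name, child in graph[module]:
            visit(child, path + (inst_name,), ident + offsets[inst_name])

    visit(top, (), root_id)
    return ids


print(child_offsets(GRAPH, "Top"), child_offsets(GRAPH, "Foo"))
print(assign_ids(GRAPH, "Top"))
```

With these offsets the instance b.d gets id 6, which is the key used in the annotation map on %x. The offsets are local constants per module, but they do depend on subtree sizes, so changing the hierarchy below a module can change them.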

I admit that the space of implementations should be examined. The hierarchical name approach is heavily biased by the Scala FIRRTL Compiler representation of these types of annotations (and Verilog hierarchical paths) where a “path” of instance names gives you something unique.

Alternatively, I expect we’ll need to tackle parameterization eventually and this hints that per-instance parameters is the same problem as parameterization.

I may have overemphasized the inlining and grouping aspects. :sweat_smile:

What I really want is to (1) unify information in one IR representation (though multiple dialects is fine) and (2) make changing the circuit (writing a pass) as easy as possible.

(2) requires some explanation. I’d like to make the following modifications easy:

  1. Deleting an operation
  2. Renaming an operation
  3. Splitting an operation
  4. Folding an operation

When the metadata is co-located with the operation, the transform writer can make a judgement call about what to do with it. The transform writer understands what their pass is doing and how metadata should be propagated. Putting that metadata anywhere else makes me very nervous.

The Scala FIRRTL Compiler approach, with its alternative data structure, requires that every transform provide a RenameMap that “explains” what it did. Examples of such “explanations” are:

  • Deleted wire x in Foo
  • Renamed module Bar to Baz
  • Split vector Qux to Qux_0 and Qux_1

There’s then an annoyingly complex algorithm for applying the RenameMap to the annotations data structure. (Annoying complexity comes from dealing with chained application of “explanations” and when the pass communicates that it updated one part of one partition of instances.)
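
To give a feel for the bookkeeping, here is a deliberately simplified Python sketch of applying such “explanations” to a separate annotation list. It is not the SFC RenameMap API: `apply_rename_map` and `apply_all` are hypothetical, targets are plain strings matched exactly, the payload strings are made up, and none of the instance-path handling that makes the real algorithm complicated is modelled.

```python
def apply_rename_map(annotations, rename_map):
    """Apply one transform's explanation to a list of (target, payload) annotations.

    `rename_map` maps an old target to the list of targets that replace it.
    An empty list means the target was deleted, so its annotations are dropped.
    """
    updated = []
    for target, payload in annotations:
        if target in rename_map:
            updated.extend((new_target, payload) for new_target in rename_map[target])
        else:
            updated.append((target, payload))
    return updated


def apply_all(annotations, rename_maps):
    """Chain the explanations of several transforms, in the order they ran."""
    for rename_map in rename_maps:
        annotations = apply_rename_map(annotations, rename_map)
    return annotations


annotations = [("Foo.x", "keep"), ("Bar", "floorplan"), ("Qux", "probe")]
explanations = [
    {"Foo.x": []},                 # Deleted wire x in Foo
    {"Bar": ["Baz"]},              # Renamed module Bar to Baz
    {"Qux": ["Qux_0", "Qux_1"]},   # Split vector Qux to Qux_0 and Qux_1
]
print(apply_all(annotations, explanations))
```

Even in this toy form, every transform has to produce a correct map and every annotation has to be walked after every transform. Exact string matching also misses cases such as a target inside a renamed module, which is part of what the real implementation has to handle.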

I admit that this is lumping all solutions that use a separate data structure together as inheriting the weirdness of the SFC design decisions. (Obviously, this isn’t the case. An improved design with a separate datastructure likely exists.) That said, I’ve posited that the separate datastructure architecture arose because of a lack of parameterization (and that these are really the same problem). Furthermore, the SFC complexities at least suggest that separate datastructure approaches warrant caution.

I’m gonna move on to placement stuff this week. I have two options: do it in a hacky way or prototype this proposal. @clattner I’m pretty sure I know the answer to this, but are we ready to go down the road of parameterized modules? I might be willing to prototype this all (in the msft dialect), but I suspect I’ll need some limited support in ExportVerilog – like ignoring stuff it doesn’t recognize (e.g. instance metadata) – so I don’t have to remove it before I run ExportVerilog (which could be useful for more than just this).

Not sure I’m wanting to do this vs. doing something kinda hacky.