Handling Cyclic Dependencies in Debug Info

We recently started with importing non-toy programs from LLVMIR into MLIR’s LLVM dialect and came to the realization that LLVM IR’s debug metadata can have cyclic dependencies.

An example of such a cyclic dependencies is a struct that has a pointer to itself, such as a linked list node:

struct node {
 int val;
 struct node *next;
};

This structure results in the following LLVM IR debug info metadata (simplified):

!16 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "node", elements: !17, identifier: "_ZTS4node")
!17 = !{!18, !19}
!19 = !DIDerivedType(tag: DW_TAG_member, name: "next", baseType: !20)
!20 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16)

which has two back references from the derived types !19 and !20, which represent the next member variable and its type, to the composite type !16, which represents the node. In particular, 20! describes the node pointer type resulting in the cycle !16 → !17 → !19 → !20 → !16.

At the moment, MLIR LLVM dialect represents metadata using attributes that reference each other. If we remove the next member from the struct, the import produces the following LLVM dialect attributes (simplified):

#val_type = #llvm.di_derived_type<
  tag = DW_TAG_member, name = "val", baseType = #di_basic_type
>
#node_type = #llvm.di_composite_type<
  tag = DW_TAG_structure_type, name = "node", elements = #val_type
>

Unfortunately, things stop working once we try to import the full struct due to the cyclic dependencies discussed above. Essentially, there is no topological sort / sequential construction order for the debug info attributes anymore.

We thus need a way to break the cyclic dependencies in the debug info representation. At the moment, our working assumption is these cycles mostly show up for composite types, but there may be more such instances (let us know if you are aware of more such constructs!). We see currently two ways forward:

  1. Use symbols to break cyclic dependencies. Other MLIR representations of LLVM IR metadata - such as access groups, alias scopes, and soon TBAA metadata - use operations that are referenced by symbols to represent metadata. We could introduce such operations to represent composite types or even more debug metadata constructs.

  2. A less intrusive approach could be to use the fact that composite types have an identifier that is supposed to be unique in the compilation unit. We could use this fact to represent back edges to the composite type using this identifier. While clearly a smaller change, this approach is also much more specific to the problem at hand and it may break if there are other kinds of cyclic dependencies.

Let me give you a bit more context on how these solutions could look like. A symbol based approach (solution 1) could use the existing llvm.metadata operation to specify the debug info related information:

llvm.metadata @__debug_info {
  llvm.di_derived_type @next_type {
    tag = DW_TAG_member, name = "next", baseType = @ptr_type
  }
  llvm.di_composite_type @node_type {
    tag = DW_TAG_structure_type, name = "node", elements = @next_type
  }
  llvm.di_derived_type @ptr_type {
    tag = DW_TAG_pointer_type, base_type = @node_type
  }
}

This solution breaks the cyclic dependencies using symbols. One disadvantage I see over the attribute based approach is that it may be harder to remove stale debug information.

An alternative (solution 2) may be to replace the back reference with a special terminal node that references the composite type using the unique identifier:

#next_type_ref = #llvm.di_composite_type_ref<
  identifier = "_ZTS4node"
>
#ptr_type = #llvm.di_derived_type<
  tag = DW_TAG_pointer_type, base_type = #next_type_ref
>
#node_type = #llvm.di_composite_type<
  tag = DW_TAG_structure_type, name = "node", elements = #next_type, identifier = "_ZTS4node"
>
#next_type = #llvm.di_derived_type<
  tag = DW_TAG_member, name = "next", baseType = #ptr_type
>

This solution breaks the cyclic dependencies using a string reference. It assumes the composite type has a unique identifier, which seemingly is the case for ODR source languages (https://llvm.org/docs/LangRef.html#dicompositetype). During the export the composite type reference would then be replaced by a real back edge pointing to the exported composite type metadata node.

Any thoughts and ideas on these two or a possible third solution are very welcome.

1 Like

There are two other ways to model cyclic dependencies in MLIR, not mentioned here.

  • Make #llvm.di_composite_type a mutable attribute, similar to how !llvm.struct type currently works, that is identified by its identifier. You could then instantiate it with just the identifier name and the later on set its elements, forming a cycle. This is relatively close to your solution 2, although arguably a better treaded path.
  • Similar to solution 1, have you considered a graph region? MLIRs regions/blocks don’t have to follow strict SSA dominance rules. Similar to llvm.metadata, there could be a top level llvm.debug_info op with a graph region. This could eg. then look like:
llvm.debug_info {
  %next_type = llvm.di_derived_type {
    tag = DW_TAG_member, name = "next", baseType = %ptr_type
  }
  %node_type = llvm.di_composite_type {
    tag = DW_TAG_structure_type, name = "node", elements = %next_type
  }
  %ptr_type = llvm.di_derived_type {
    tag = DW_TAG_pointer_type, base_type = %node_type
  }
}

That said, this solution would require the same kind of revamping.

1 Like

@zero9178 Thanks for these ideas!

I did not know that mutable attributes exist and it sounds like they are a strictly better alternative than encoding the back edge using a string. I think this insight makes solution 2 much more attractive!

The example I posted above does only show a subset of the debug metadata usage. One additional aspect, not shown previously, is that we need to reference the debug metadata from the actual program / ir:

// debug metadata definition
llvm.debug_info @ __debug_info {
 llvm.di_derived_type @var_type { ... }
}

// somewhere in the program
llvm.intr.dbg.value llvm.di_local_variable<name="my_var", type = @ __debug_info::var_type> = %arg : i64

The example shows how a debug intrinsic reference a type definition inside the metadata region using a symbol. While a graph region is a great solution for the local dependencies, it does not solve the task of relating debug info to the program / ir? We would thus presumably need a second mechanism to link metadata to the program, which makes me believe using symbols is a better solution.

I started looking into different solutions for the cyclic debug metadata handling and I am currently unsure how to move forward. Above we discussed two main approaches:

  1. Attach the type debug information to an operation embedded in a global metadata operation and reference these operations using symbols to break the cycles.
  2. Use mutable attributes to represent the cyclic dependencies.

I did prototype solution 2) since it is closer to the current debug metadata handling. AFAIK tablegen does not support mutable attributes. The change would therefore produce significant amounts of C++ code for every attribute that has to be mutable. Another issue is that the printer does not replace nested attributes with aliases, which means that nested attributes may be printed multiple times (supposedly due to llvm-project/AsmPrinter.cpp at d94b069a89ec6c54030540c031a1032845bdbac0 · llvm/llvm-project · GitHub).

Solution 1) works with module-level operations but seems more heavy-weight and less scalable than the current solution. For example, dropping unused module-level type information does not happen automatically anymore and has to be done by a module pass. I may be wrong though and the additional cost due to the operation / symbol handling is negligible.

I have a slight preference for solution 1) since it follows the design of the alias scope and access group metadata. I believe it is also more flexible if we have to model additional cyclic dependencies and it does not require C++ attribute definitions.

I also considered other possible solutions:

  1. AFAIK MLIR core currently does not have a way to represent structs and other composite types. It may thus be ok if we do not support these concepts in the debug information as well. We could then stick with the current implementation and simply make sure the import from LLVM IR drops cyclic debug info.
  2. We could change the representation of the debug info to avoid the cyclic dependencies without losing any information. Intuitively, the type debug info should be representable by a tree where a type is then fully defined by the leafs and the root of the tree (c.f., example below). Such a representation deviates quite a bit from the LLVM IR debug metadata representation and makes import and export more complex.

At the moment, LLVM IR represents a composite type using a root node that contains a list of elements that point to the fields of the struct/class composite type. The fields are modeled as derived types that have a back reference to the root node using them. All uses of a type, for example, all variables of that type, then have a reference to the root node only:

#root = di_composite_type<name = "A", elements = #leaf1, #leaf2>

#leaf1 = di_derived_type<name = "field1", scope = #root>

#leaf2 = di_derived_type<name = "field2", scope = #root>

#variable = di_variable<type = #root>

As we cannot model the back edges from the leaf nodes to the root node, we may simply drop the “elements” parameter of the composite type and reconstruct it when translating to LLVM IR. Dropping these edges does not cause an information loss since the leaf nodes still have a scope parameter that points to the composite type, which means the dropped elements parameter can be reconstructed. There is one caveat though, we need to make sure the export can find the leaf nodes to reconstruct the dropped edges. This could be achieved by introducing a new node kind that contains the root and the leaf nodes needed to reconstruct debug information of the type. The cycle free representation thus could look as follows:

#root = di_composite_type<name = "A">

#leaf1 = di_derived_type<name = "field1", scope = #root>

#leaf2 = di_derived_type<name = "field2", scope = #root>

#struct_type = di_type<root = #root, leaf_nodes = #leaf1, #leaf2>

#variable = di_variable<type = #struct_type>

The export to LLVM IR would then walk back from the leaf nodes to the root and reintroduce the dropped edges.

While solution 4) seems attractive for the simple example, it is more complex for real-world examples that have to deal with nested and dependent structs. For example, if two structs have pointers to each other (A->B and A<-B), then both of them need to know the union of A’s and B’s leaf nodes. The advantage of the solution is that it is cycle free (modulo overlooks from my side) and that it could be modeled with non-mutable attributes.

Any thoughts on a preferred or better solution? At the movement, I favor 1) or 4).

@River707 @ftynse

Mutable types were specifically introduced to support structs with cycling references in LLVM dialect struct types - [RFC] First-Party Support for LLVM Types.

We also have a mechanism to print without causing infinite recursion, https://github.com/llvm/llvm-project/blob/ae0c232e46c8d8b4fc3d0e007094db216d7a82f6/mlir/lib/Dialect/LLVMIR/IR/LLVMTypeSyntax.cpp#L54. It should be almost directly reusable for attributes. I don’t know how aliases are involved here.

Mutable attributes are not supported by ODS because there is one upstream client for them. The amount of code (and related testing) necessary for supporting these outweighs the amount of code in direct implementation of the only client (LLVMStructType) in C++. Depending on how many mutable attributes you will need to add, it may be worth supporting them in ODS.

As you might have guessed, my preference specifically for debug info related to self-referencing types is to follow the design of self-referencing types, that is, (2). This simplifies the conceptual mapping between the type and the related debug information.

That being said, I don’t have a complete understanding of how debug info is represented in LLVM IR, so symbols or graph region approach may also work I’m not in favor of a significantly diverging modeling (4) as we would like to keep the translation process simple. This modeling would require us to perform heavy graph manipulations in the translation that will be difficult to test and debug. I’d rather debug boilerplate op definitions.

Thanks for the pointers to the printer!

Mutable types were specifically introduced to support structs with cycling references in LLVM dialect struct types - [RFC] First-Party Support for LLVM Types.

I understand the argument of using identical solutions for similar problems. I was assuming LLVM structs with cyclic dependencies would disappear once there are only opaque pointers. I may be wrong though and cyclic structs are still a thing past the switch to opaque pointers only?

I’m not in favor of a significantly diverging modeling (4)

That is a good point. Then I will go for a solution that tries to import/export the debug metadata as is (most likely using mutable attributes). Luckily, it seems like it is sufficient to make the composite type attribute mutable.

I assume they will, but I’m waiting for those changes to definitively land before reflecting them in MLIR to avoid churn. That being said, I don’t know how this affects debug information. FWIW, debug information would be the place to store the underlying pointee type even if we make it opaque in the IR.

I assume they will, but I’m waiting for those changes to definitively land before reflecting them in MLIR to avoid churn. That being said, I don’t know how this affects debug information. FWIW, debug information would be the place to store the underlying pointee type even if we make it opaque in the IR.

Yes the opaque pointers do not solve the debug info problem. Front-end languages will stick to their own cyclic type representations. I was more thinking that there is less of a need to follow the LLVMStructType design given that it may not be super long lived. That said, I think mutable attributes are the better solution in theory. The scary part was my prototype implementation…

Sorry, I’ve been very out for the past month. Just now really getting back into work mode.

This is a great write up @gysit thanks for doing this!

One of the big factors I think we need to take into consideration here is what the intended use cases are for the metadata. If we need to mutate/update/etc. this metadata, different representations are significantly more annoying to deal with in certain scenarios (the symbol approach could potentially break any non-module level passes that want to change/update metadata).

@gysit Is your prototype using mutable attributes public? If it’s the disgustingness of having to rewrite in C++, we should honestly just update the Attr/Type gen to support that. It was always a TODO eventually, might be worth just adding that now. Happy to look into that if you need it.

– River

2 Likes

I uploaded a draft revision (⚙ D142189 WIP mutable composite type attribute.). The “prototype” is very hacky and misses some functionality (e.g., the attribute is incomplete and printing is suboptimal). Despite the incompleteness the amount of C++ is large compared to the existing tablegen implementation. Especially, since testing of the error handling during parsing will be necessary etc. I thus think it would be very nice to have support for mutable attributes. The question is if DICompositeTypeAttr and LLVMStructType justify the effort. Especially, since LLVMStructType is quite complex and probably hard to fully implement in tablegen.

When doing the prototype, I saw the following pain points:

  1. It is definitely quite a lot of C++ code compared to the few lines of tablegen.
  2. When printing any nested attribute is fully printed even if an alias would exist. I am worried that this bloats the size of the textual representation by quite a bit.
  3. The elements field of DICompositeType is not part of the uniquing anymore. That means I most likely need an extra string identifier (the prototype uses an integer identifier) to ensure composite types that differ only in terms of elements can be represented. This identifier may also be used to print a short form if the attribute references itself. Unfortunately, it seems like I cannot use the name of the DICompositeType since there may be different composite types with the same name.

Let me illustrate some of these points. As mentioned in 2), if I parse

#int0 = #llvm.di_basic_type<
  tag = DW_TAG_base_type, name = "int"
>
#comp1 = #llvm.di_composite_type_mut<
  id = 0, name = "test1", elements = #int0
>

it re-prints as

#int = #llvm.di_basic_type<
 tag = DW_TAG_base_type, name = "int"
>

#comp = #llvm.di_composite_type_mut<
  id = 0, name = "test1", 
  elements = #llvm.di_basic_type<tag = DW_TAG_base_type, name = "int0">
>

with the integer type being printed twice (since it is used in another place in the file). This may be a shortcoming of my implementation since I do not yet implement SubElementAttrInterface. Or, it could be a limitation of the AsmPrinter (llvm-project/AsmPrinter.cpp at d94b069a89ec6c54030540c031a1032845bdbac0 · llvm/llvm-project · GitHub)?

The printing gets more complex if the attribute has self references (point 3)).

#comp = #llvm.di_composite_type_mut<
  id = 0, name = "test0",
  elements = #llvm.di_composite_type_mut<id = 0, name = "test1">
>

The custom print method here needs to detect the type is printed recursively and drop the mutable parts when printing the inner instance of the type. I think it would be much nicer if the alias could be printed instead.

#comp = #llvm.di_composite_type_mut<
  id = 0, name = "test0",
  elements = #comp
>

It can also happen that the alias is unknown if two composite types mutually reference each other. I thus believe the solution of LLVMStructType with a unique identifer StringAttr is the best solution for the problem (c.f. llvm-project/llvmir-types.mlir at 87e345b1bdb76867cc6e9ae59b6dd2633a480d38 · llvm/llvm-project · GitHub) and working with aliases only cannot work.

Admittedly, points 2) and 3) are cosmetic problems and not actual implementation problems.

@gysit are you aware of any ongoing work on cyclic DI types for the LLVM dialect? We ran into a use case for recursive types too so I’m planning on picking this up.

I see that @Dinistro recently added a DistinctAttr to some DI types. One way I see is to do something similar, and add a recursion-ID field to DICompositeType, and then use the same distinct ID in the self-referencing “back edge” (seems more explicit than just the type name). This seems the least intrusive to the rest of the infrastructure.

#struct = di_composite_type<self_id = 0, name = "node", elements: #field>
#field = di_derived_type<baseType: #ptr>
#ptr = di_derived_type<baseType: #struct_self>
#struct_self = di_composite_type<self_id = 0, name = "node">

This of course handles composite type, but if other DI types might be recursive too (as long as it’s legal I think we should support it), we can go one step further with an abstracted DIRecursiveType attr:

#rec = di_recursive_type<self_id = 0, baseType: #struct>
#struct = di_composite_type<elements: #field>
...
#ptr = di_derived_type<baseType: #rec_self>
#rec_self = di_recursive_type<self_id = 0>

This way we don’t have to support recursion individually on DI types. They can just be handled generically. Happy to hear your thoughts.

Nice to see progress on this!

@gysit are you aware of any ongoing work on cyclic DI types for the LLVM dialect? We ran into a use case for recursive types too so I’m planning on picking this up.

The last attempt I am aware of is [mlir] Add support for recursive elements in DICompositeTypeAttr. by waj334 · Pull Request #74948 · llvm/llvm-project · GitHub. It may make sense to get in touch with the author.

Note that there is also work by @zero9178 to print attributes aliases with cyclic dependencies [mlir] Add support for parsing and printing cyclic aliases by zero9178 · Pull Request #66663 · llvm/llvm-project · GitHub. This is not a necessity especially for the suggested solution. However, it may be beneficial for other use case.

This seems the least intrusive to the rest of the infrastructure.

Independent of the problem at hand, I think it may make sense to extend the MLIR infrastructure at some point with proper support for mutable attributes that can have cyclic dependencies. There is not only debug info but also struct types defined by LLVM and SPIRV dialect that could benefit from such features.

I see that @Dinistro recently added a DistinctAttr to some DI types. One way I see is to do something similar, and add a recursion-ID field to DICompositeType, and then use the same distinct ID in the self-referencing “back edge” (seems more explicit than just the type name).

If I understand your solution correctly, you have two instances for a recursive type. One is some sort of forward declaration (#struct_self) that is declared first and then used in the actual type definition to store the back edge. The actual type declaration itself is used everywhere else. I believe I thought about this at the time I introduced the distinct attribute and had some issue in mind. I don’t remember though which probably means it is not an important issue.

I guess there are three use cases to consider. 1) lowerings that create such debug info attributes, 2) the export to LLVM IR, and 3) the import from LLVM IR. Both the import and the export should be able to detect such back edges during the recursive traversal and probably need some extra storage to handle the cases correctly. One thing that is unclear to me and that may make things a bit tricky is if we can give every composite type an ID. Usually it is a good idea to adhere to the LLVM IR implementation there to make sure there is no unexpected uniquing if we import/export LLVM IR. The lowerings probably need some extra logic / sequential execution to avoid the same type translates to different distinct attributes.

Another use case are mutually recursive types. I guess that should work as well by using such a back edge for one of them?

This of course handles composite type, but if other DI types might be recursive too (as long as it’s legal I think we should support it), we can go one step further with an abstracted DIRecursiveType attr:

Having a generic back edge / recursive type attribute sounds like a good idea.

My past experience with LLVM dialect is that staying as close as possible to LLVM semantics is a good idea because sooner or later things break otherwise. A mutable attribute solution would be closer I guess. Your solution definitely sounds much easier to implement though and I do not see any immediate issue with it. I think I would give it a shot!

Thanks for the catch-up! Let me reach out to the author and check his status.

I think given that MLIR has mutable attributes, it could make sense to one day move over to mutable attributes for cyclic attributes to better match llvm. On the other hand, I also see advantages to expressing recursion explicitly for the cases that need it, since it allows native data structures to be “finite” (and generic infra to be simple), and recursion becomes a dialect-specific semantic that is handled explicitly by the dialect (or, a shared utility dialect if common enough). Since this seems reasonable to you too, I’ll give it a shot and we can discuss details in the PR. Happy to extend this to other use cases or move to mutable attributes if we get to that point though.

1 Like

Got sidetracked for a bit but I have this draft ready here. I implemented the importer in this PR. If it looks good I’ll move on to the exporter.

Now that there’s something concrete we can go over, one caveat I noticed as I was going over tests is that there’s now this divide between attributes “inside” the cycle, and those outside. Even though they mean the same thing conceptually, they are no longer physically identical in this encoding, so there will be one version inside the cycle, and one version outside.

For example, if we had a linked list node struct, its entire attribute tree would look like this

#rec_struct = di_recursive_type<self_id = 0, baseType: #struct>
#struct = di_composite_type<self_id = 0, name = "node", elements: #field>
#field = di_derived_type<baseType: #ptr>
#ptr = di_derived_type<baseType: #rec_self>
#rec_self = di_recursive_type<self_id = 0>

where the cycle is rec_struct → struct → field → ptr → rec_self.

However, the inner attribute #ptr is not usable standalone. They must only be used “inside” #rec_struct. If we wanted to express a pointer to the struct type, we’ll have to use a separate attr

#ptr_outer = di_derived_type<baseType: #rec_struct>

I suppose this is to be expected with this kind of “lazy” representation of recursion, since mutually recursive types would naturally need two representations of the same cycle anyway: A → B → A, and B → A → B.
The helper function on recursive types getUnfoldedBaseType should help deal with this in code. It takes the inner type of a recursive type and unfolds it one level, so that we get back a self-contained “outside” version of the body of the cycle.

Looking forward to your thoughts on the PR! :pray:

#struct_self in the example would be #rec_self right?

I think I am a bit surprised there is #rec_self and #rec_struct but I guess looking at the PR will help. I try to get to today (will have a short vacation afterwards)!

ah sry yes you’re right. Let me revise that to not confuse others. Thanks for taking a look!

I did have a bit deeper look at your revision and I think I got a first impression. Is there a recursive type wrapper around every type? At the first glance it looked like it but I probably missed something…

I think compared to the mutable attribute solution this is definitely a better fit with respect to the MLIR core infra and the code complexity is also on the low end (except for the recursion detection / handling).

The downside is probably that it is somewhat unstructured/“lazy” and certain things cannot be verified. I.e., we probably cannot verify there is a rec_struct for a rec_self? A mutable attribute would be a more concise representation that provides some more guarantees. As most transformations will not touch this metadata, it maybe ok we cannot verify everything though. The import/export and the front-end that have to create/consume this metadata probably can verify the correctness during the traversal of the metadata.

I suppose this is to be expected with this kind of “lazy” representation of recursion, since mutually recursive types would naturally need two representations of the same cycle anyway: A → B → A, and B → A → B.
The helper function on recursive types getUnfoldedBaseType should help deal with this in code. It takes the inner type of a recursive type and unfolds it one level, so that we get back a self-contained “outside” version of the body of the cycle.

Yeah I think helper functions such as getUnfoldedBaseType are a helpful tool here. As written above, I would hope this metadata is mainly routed through and rarely transformed. Hence, the fact that manipulating it can go wrong is less of a concern.

I think it makes sense to continue and try to implement at least the export to understand if there are other hurdles/concerns.

Thanks for taking a look at this! I followed the second part of my original proposal and just created a generic recursive type so that any DIType-involved cycles can be represented, not just those that go through composite types.

Is there a recursive type wrapper around every type?

There will be a recursive wrapper at the first entrance into a cycle, which basically cuts off any cycles from reaching back to the entrance node (replaced with a “rec-self” indirection). Of course this depends on where you start (i.e. which node is referenced elsewhere in the IR), e.g. the mutually recursive case.

The downside is probably that it is somewhat unstructured/“lazy” and certain things cannot be verified.

Yeah… Unfortunately I haven’t reached a good solution for verifying the correctness of recursive types. I suppose we can invoke verification for dbg intrinsic ops to make sure the attrs they reference are valid, but I’m not sure we can run this check on the location of every op. Also there will be a lot of overlap in verification, so it might be best run as a verification pass utilizing a cache.

A mutable attribute would be a more concise representation that provides some more guarantees.

Yeah I think that’s the greatest advantage of using mutable attrs. Of course the flip side is we have to make sure everything traversing attributes is cycle-safe. Once we have more mature infra for mutable attrs (e.g. printing/parsing cycles like you mentioned), I’m happy to give it a try.

As most transformations will not touch this metadata, it maybe ok we cannot verify everything though.

Yes I imagine the cyclic types created by frontends/importer would be structurally fixed from then on. Replacers might be run on these attrs to lower some fields or types, but it doesn’t affect the structure. We can run a recursion verification pass immediately after creation.

I think it makes sense to continue and try to implement at least the export to understand if there are other hurdles/concerns.

OK let me go ahead with the export side and see what happens. Thanks!

@gysit Hope you had a nice break! I also went on a vacation for a while, but now I have the exporter implemented in the same PR (and more specifically, recursive composite types. Other llvm DI types that support mutation can be implemented similarly as the need arises). Please take a look when you get a chance. Interested to see what you think :pray:

Nice you are back working on this! I try to have a look at it tonight (European time)!