[DebugInfo] Bugs when emitting debug info with inlined functions

Hi all,

In recent weeks I’ve been looking into fixing a few bugs in the Dwarf emitted by LLVM related to when functions are inlined.

Here is a concrete example from https://bugs.llvm.org/show_bug.cgi?id=52159

namespace ns {
inline __attribute__((always_inline))
void foo() { int a = 4; }
}

void goo() {
using ns::foo;
foo();
}

This produces an imported declaration DIE that references an empty SP DIE even though there already exists one for foo.

// Abstract origin
0x0000002f: DW_TAG_subprogram
DW_AT_linkage_name ("_ZN2ns3fooEv")
DW_AT_name ("foo")
// Empty concrete
0x00000047: DW_TAG_subprogram
0x00000048: NULL
// Import declaration
0x00000069: DW_TAG_imported_declaration
DW_AT_import (0x00000047)

One fix is to reference the abstract origin DIE.

0x00000069: DW_TAG_imported_declaration
DW_AT_import (0x0000002f)

Another fix is to do what gcc does and fill out a specification DIE that the import references.

// Import declaration
0x0000004f: DW_TAG_imported_declaration
DW_AT_import (0x00000063)
// Specification
0x00000063: DW_TAG_subprogram
DW_AT_name ("foo")
DW_AT_declaration (true)
// Abstract origin
0x00000070: DW_TAG_subprogram
DW_AT_specification (0x00000063 "_ZN2ns3fooEv")

Since I’m relatively new to debug info, I thought I’d ask some questions on the mailing list.

  1. A simple solution to this class of bugs is to reference the abstract origin SP DIE where appropriate. Do we have any rules for when to reference the abstract origin if it exists?
  2. Would it be better to follow gcc’s solution and fill out specifications in these cases?
  3. Are they more known cases/bugs to consider?

Thanks, Ellis

Hi all,

In recent weeks I’ve been looking into fixing a few bugs in the Dwarf emitted by LLVM related to when functions are inlined.

Here is a concrete example from https://bugs.llvm.org/show_bug.cgi?id=52159

namespace ns {
inline __attribute__((always_inline))
void foo() { int a = 4; }
}

void goo() {
using ns::foo;
foo();
}

This produces an imported declaration DIE that references an empty SP DIE even though there already exists one for foo.

// Abstract origin
0x0000002f: DW_TAG_subprogram
DW_AT_linkage_name ("_ZN2ns3fooEv")
DW_AT_name ("foo")
// Empty concrete
0x00000047: DW_TAG_subprogram
0x00000048: NULL
// Import declaration
0x00000069: DW_TAG_imported_declaration
DW_AT_import (0x00000047)

One fix is to reference the abstract origin DIE.

0x00000069: DW_TAG_imported_declaration
DW_AT_import (0x0000002f)

Another fix is to do what gcc does and fill out a specification DIE that the import references.

// Import declaration
0x0000004f: DW_TAG_imported_declaration
DW_AT_import (0x00000063)
// Specification
0x00000063: DW_TAG_subprogram
DW_AT_name ("foo")
DW_AT_declaration (true)
// Abstract origin
0x00000070: DW_TAG_subprogram
DW_AT_specification (0x00000063 "_ZN2ns3fooEv")

Since I’m relatively new to debug info, I thought I’d ask some questions on the mailing list.

  1. A simple solution to this class of bugs is to reference the abstract origin SP DIE where appropriate. Do we have any rules for when to reference the abstract origin if it exists?

That’s the difficult problem here - the abstract origin only exists if there’s at least one instance of the function being inlined - and we don’t know that until we’ve processed all the functions. The way this was/is currently implemented is to create the concrete definition subprogram when we see the concrete function definition, but not fill out the attributes that would be inherited from an abstract origin if there was one - then wait until we get to the end of the module and if we’ve created an abstract origin (because an inlined instance was found/produced), then we add the abstract_origin attribute to the concrete definition subprogram, and if we haven’t created an abstract origin (because there were no inlined instances) then we put the attributes (name, decl file/line/etc) on the concrete definition without creating an abstract origin.

For these other cases (variables, imported declarations, maybe function local types too?) - I think, at least consistent with the current design, we have to defer producing them until we get to the end - the same as for the name/decl file/line attributes, etc.

Though deferring them presents other challenges - we can’t reference them if they’re deferred, so if some other debug info like the type of a function parameter has to be produced, how do we produce that if we have deferred the type production?

We can’t/shouldn’t/it’s difficult to create the DIEs but to defer attaching those DIEs to the DIE tree - because there’s some logic that uses the DIE parent chain/which CU a DIE is in to determine certain issues of encoding (whether CU-local references can be used or the like). Maybe we can get rid of that constraint (at which point we could create DIEs but defer adding them to the DIE tree) but if I recall correctly it’s pretty deeply embedded/important.

  1. Would it be better to follow gcc’s solution and fill out specifications in these cases?

I guess that doesn’t address the function-local static variable case, or the case of an imported declaration being inside a subprogram? (we could in theory put those in a declaration DIE, but it’d probably be extra confusing to consumers)

  1. Are they more known cases/bugs to consider?

Function local types /probably/ have a similar problem. Maybe call_site tags too? (since they reference function declarations to say which function the call_site is calling) Perhaps some other stuff - but those are ones that come to my mind at least.

Thanks for looking into this - sorry it’s all a bit thorny, though.

  • Dave

For these other cases (variables, imported declarations, maybe function local types too?) - I think, at least consistent with the current design, we have to defer producing them until we get to the end - the same as for the name/decl file/line attributes, etc.

Instead of deferring those DIEs, I wonder if we can do some kind of fixup. At the end of the module, we can search through the SPs with abstract origins to see if they have a concrete DIE with children (variables, types, imported declarations). Then move those DIEs to the abstract origin. This would be a simple solution because we can just assume the concrete function will exist and never be inlined, but at the end we fix the inlined cases. I suspect this solution won’t work for the same reason we don’t want to create DIEs without parents, but let me know what you think of this.

Function local types /probably/ have a similar problem. Maybe call_site tags too? (since they reference function declarations to say which function the call_site is calling) Perhaps some other stuff - but those are ones that come to my mind at least.

I just tested function local types and they do have the same problem.

attribute((always_inline)) inline
int foo() { struct S {int a;}; S s; return s.a; }
int bar() { return 1 + foo(); }

I don’t think call site tags have the problem because if they exist then the function was not inlined in that location so there must be a concrete definition.

Thanks, Ellis

For these other cases (variables, imported declarations, maybe function local types too?) - I think, at least consistent with the current design, we have to defer producing them until we get to the end - the same as for the name/decl file/line attributes, etc.

Instead of deferring those DIEs, I wonder if we can do some kind of fixup. At the end of the module, we can search through the SPs with abstract origins to see if they have a concrete DIE with children (variables, types, imported declarations). Then move those DIEs to the abstract origin. This would be a simple solution because we can just assume the concrete function will exist and never be inlined, but at the end we fix the inlined cases. I suspect this solution won’t work for the same reason we don’t want to create DIEs without parents, but let me know what you think of this.

/maybe/ though I’m not sure the DIE APIs support removing children, do they?

Function local types /probably/ have a similar problem. Maybe call_site tags too? (since they reference function declarations to say which function the call_site is calling) Perhaps some other stuff - but those are ones that come to my mind at least.

I just tested function local types and they do have the same problem.

attribute((always_inline)) inline
int foo() { struct S {int a;}; S s; return s.a; }
int bar() { return 1 + foo(); }

Thanks, good to confirm!

I don’t think call site tags have the problem because if they exist then the function was not inlined in that location so there must be a concrete definition.

Fair point!

  • Dave

/maybe/ though I’m not sure the DIE APIs support removing children, do they?

No the API doesn’t support removing children or changing the parent node. I don’t see anything obvious reason why we can’t add this API, but it might be too disruptive for what we are trying to accomplish.

Ellis

Could be worth a shot - I think the attribute removal would be more difficult due to abbreviations, etc (maybe the same is true of children, when setting the CHILDREN property of the abbrev, I’m not sure) - worth a go, at least. Can write down why it’s not feasible if you/we discover some reason that’s infeasible.

Sure, I’ll try it out!

I’m not sure if we will need to remove attributes? We just need to move children from the SP DIE and possibly remove the whole SP DIE if it’s empty.

Sure, I’ll try it out!

I’m not sure if we will need to remove attributes? We just need to move children from the SP DIE and possibly remove the whole SP DIE if it’s empty.

I don’t think we’ll need to remove attributes - we’ll keep doing the thing the code currently does, delaying adding the name/file/line attributes to the SP DIE until the end of the module then deciding where to put them based on whether there’s an abstract origin or not.

I’ve just uploaded https://reviews.llvm.org/D111916. I haven’t seen any problems so far and it seems to solve all the bugs I am aware of. Let me know what you think.

Ellis

I have another way to fix these bugs in https://reviews.llvm.org/D112337 and it is much simpler. Could someone please take a look?

I’ll make sure it’s in my queue - thanks!