Should it be legal for two functions to have the same !dbg attachment?

I just wrote an IR Verifier check that catches the following situation:

  ; RUN: not llvm-as %s -disable-output 2>&1 | FileCheck %s

  define void @f1() !dbg !4 {
    unreachable
  }

  ; CHECK: DISubprogram attached to more than one function
  define void @f2() !dbg !4 {
    unreachable
  }

  !llvm.dbg.cu = !{!1}
  !1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2)
  !2 = !DIFile(filename: "t.c", directory: "/path/to/dir")
  !3 = !{}
  !4 = distinct !DISubprogram(name: "foo", scope: !1, file: !2, unit: !1)

Why? When two functions share the same DISubprogram attachment, the DWARF backend gets very confused and adds all attributes from all functions to the same subprogram DIE, so we end up with something looking like:

  DW_TAG_subprogram
     DW_AT_low_pc // f1
     DW_AT_high_pc
     DW_AT_low_pc // f2
     DW_AT_high_pc
     ... etc.

Seeing this it seemed obvious to me we should disallow this by preventing two functions from sharing the same !dbg attachment. After implementing my new verifier check (and updating a couple of hand-crafted testcases that violated it) I noticed that the new check causes the cloning unit test to fail:

When CloneFunction clones a function into the same module, it will not create a deep copy of the function's debug info. Before I start fixing this (which is a bit of work, because we will need to re-parent all of the function's DILocations into the cloned DISubprogram) I wanted to get everyone's opinion on whether this is the right approach.

tl;dr: Basically, when invoking CloneFunctionInto on:

  define void @f() !dbg !3 {
    ret void, !dbg !4
  }

  define void @f_clone()

  !llvm.dbg.cu = !{!1}
  !1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2)
  !2 = !DIFile(filename: "t.c", directory: "/path/to/dir")
  !3 = distinct !DISubprogram(name: "foo", scope: !1, file: !2, unit: !1)
  !4 = !DILocation(line: 1, scope: !3)

I would like the result to be:

  define void @f() !dbg !3 {
    ret void, !dbg !4
  }

  define void @f_clone() !dbg !5 {
    ret void, !dbg !6
  }
  !llvm.dbg.cu = !{!1}
  !1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2)
  !2 = !DIFile(filename: "t.c", directory: "/path/to/dir")
  !3 = distinct !DISubprogram(name: "foo", scope: !1, file: !2, unit: !1)
  !4 = !DILocation(line: 1, scope: !3)
  !5 = distinct !DISubprogram(name: "foo", scope: !1, file: !2, unit: !1)
  !6 = !DILocation(line: 1, scope: !5)

Thoughts?

-- adrian

Should we rename it to match the renamed actual function name? So there’s some chance the debugger can identify each one separately, even if the user never wrote it that way?

I’m not really sure what a good debugging experience is going to look like if a function is cloned, but I do know (from the other thread about DWP/DWO/Fission/LTO) that having two subprograms with the same name in the same scope doesn’t do good things in GDB - not sure if it could be made to do so either…

Should we rename it to match the renamed actual function name? So there’s some chance the debugger can identify each one separately, even if the user never wrote it that way?

I’m not really sure what a good debugging experience is going to look like if a function is cloned, but I do know (from the other thread about DWP/DWO/Fission/LTO) that having two subprograms with the same name in the same scope doesn’t do good things in GDB - not sure if it could be made to do so either…

It might be reasonable to rename the linkage name to match the cloned function’s name. Renaming the human-readable name (to what?) could create an unpleasant experience if the debugger if the debugger can’t lookup the function by name any more (e.g., when setting a breakpoint by name, I would expect to add one in every cloned version of the function).

Currently the only user of CloneFunction in the regular LLVM pipeline appears to be PartialInlining, which creates a temporary clone of the function that will then be inlined. With this use-case i mind, we don’t really need to bother with renaming the function at all, since the clone will not survive as a standalone function.

Since I haven’t heard any objections to the general idea of my new Verifier check, I’ll go ahead and prepare patch to CLoneFunction to satisfy the new constraints. I’ll create phabricator review once I’m done, in case we want to do more bikeshedding of what to do to function names etc.

thanks for the feedback!
– adrian

tl;dr: Basically, when invoking CloneFunctionInto on:

define void @f() !dbg !3 {
ret void, !dbg !4
}

define void @f_clone()

= !{!1}
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2)
!2 = !DIFile(filename: “t.c”, directory: “/path/to/dir”)
!3 = distinct !DISubprogram(name: “foo”, scope: !1, file: !2, unit: !1)
!4 = !DILocation(line: 1, scope: !3)

I would like the result to be:

define void @f() !dbg !3 {
ret void, !dbg !4
}

define void @f_clone() !dbg !5 {
ret void, !dbg !6
}
= !{!1}
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2)
!2 = !DIFile(filename: “t.c”, directory: “/path/to/dir”)
!3 = distinct !DISubprogram(name: “foo”, scope: !1, file: !2, unit: !1)
!4 = !DILocation(line: 1, scope: !3)
!5 = distinct !DISubprogram(name: “foo”, scope: !1, file: !2, unit: !1)
!6 = !DILocation(line: 1, scope: !5)

Thoughts?

Seems reasonable to me.

-eric