Why does a DISubprogram need to be distinct?

I saw
https://reviews.llvm.org/rGe17f52d623cc146b7d9bf5a2e02965043508b4c4#949392
"Add a verifier check that rejects non-distinct DISubprogram function"
(which has currently be reverted)
today which made me wonder why a DISubprogram needs to be distinct?

David told me the following and encouraged me to ask in the upstream:

May be historical at this point, I'm not quite sure. (I remember there being some complicated issues around distinctness of nodes when it came to inlining that could cause two inlined functions to end up smooshed together and appear as one inlined function - but I think that was about the distinctness of the debugloc itself, not of the function).

Hmm, maybe it's from the days when DISubprogram ownership was inverted - DISubprograms used to be owned by the DICompileUnit and the DISubprogram would refer to the llvm::Function rather than the way it is now (llvm::Function refers to DISubprogram, DISubprogram refers to DICOmpileUnit). In that old way, deduplicating a DISubprogram when IR linking would have resulted in two DICompileUnits both having the DISubprogram in their subprogram list - which would break a bunch of other stuff.

So /maybe/ not needed anymore? (given that the DISubprogram refers to the DICompileUnit which is distinct I can't see how the DISubprogram could have problems being non-distinct - but I might be missing something)

I re-generated DISubprogram-v4.ll.bc with a llc 7.0.0 in 7d0556fc137aa07347741b7750e50ecbc2b4c6e2 to fix the built bots but I am not sure I did the right thing.

* If old clang did not produce non-distinct DISubprogram, then even with the verifier I think the bitcode compatibility is still retained.
* If old clang did produce distinct DISubprogram, this seems like a bitcode compatibility break due to a verifier?

First — thanks for fixing the test for me!

We have two kinds of DISubprograms:
- uniqued DISubprograms are part of the type system and describe function *types*. They don't have a unit: field, because they want to be uniqued across compile units.
- distinct DISubprograms describe one specific function definition. They belong to a compile unit and thus have a unit: field.

You are only supposed to attach distinct DISubprograms to function definitions. Several places in CodeGen will unconditionally dereference the unit: field and crash if it isn't there.

We actually had a weaker form of the check I added to Verifier already in place (https://github.com/llvm/llvm-project/blob/51cad041e0cb26597c7ccc0fbfaa349b8fffbcda/llvm/lib/IR/Verifier.cpp#L1186), but it didn't cover the specific testcase I added in https://reviews.llvm.org/rGe17f52d623cc146b7d9bf5a2e02965043508b4c4.

-- adrian

First — thanks for fixing the test for me!

I'm a bit curious about the test - any idea how it came to be, if it's
invalid? Did we produce such bitcode in the past and don't anymore -
what's the rule about backwards compatibility here, then? It seems
like any time we regenerate a bitcode compat test that's questionable
because it means we won't be compatible with that bitcode we said we
should be compatible with?

We have two kinds of DISubprograms:
- uniqued DISubprograms are part of the type system and describe function *types*. They don't have a unit: field, because they want to be uniqued across compile units.
- distinct DISubprograms describe one specific function definition. They belong to a compile unit and thus have a unit: field.

You are only supposed to attach distinct DISubprograms to function definitions. Several places in CodeGen will unconditionally dereference the unit: field and crash if it isn't there.

But do they have to be distinct? They have different fields (as you
say, declaration DISubprograms don't have a "unit:" field, definition
DISubprograms do have a unit: field that refers to a (itself distinct)
DICompileUnit)

First — thanks for fixing the test for me!

I’m a bit curious about the test - any idea how it came to be, if it’s
invalid? Did we produce such bitcode in the past and don’t anymore -
what’s the rule about backwards compatibility here, then? It seems
like any time we regenerate a bitcode compat test that’s questionable
because it means we won’t be compatible with that bitcode we said we
should be compatible with?

I think I mentioned in the commit message — we had a short period in clang with a bug

Clang right before https://reviews.llvm.org/D79967 would generate this kind of broken IR.

and the swift-5.3 branch happened to be cut in that unlucky window. To trigger it you need to markup a function with NoDebug (the bug was introduced maybe a month earlier).

We have two kinds of DISubprograms:

  • uniqued DISubprograms are part of the type system and describe function types. They don’t have a unit: field, because they want to be uniqued across compile units.
  • distinct DISubprograms describe one specific function definition. They belong to a compile unit and thus have a unit: field.

You are only supposed to attach distinct DISubprograms to function definitions. Several places in CodeGen will unconditionally dereference the unit: field and crash if it isn’t there.

But do they have to be distinct? They have different fields (as you
say, declaration DISubprograms don’t have a “unit:” field, definition
DISubprograms do have a unit: field that refers to a (itself distinct)
DICompileUnit)

IIRC, DIBuilder lets you either create a distinct one with a unit or a uniqued one without. I don’t know if the backend would survive the same uniqued DISubprogram node being attached to multiple llvm::Functions or if that would invalidate some assumptions… In any case distinct seems to be a reasonable proxy for what we mean.

– adrian

First — thanks for fixing the test for me!

I’m a bit curious about the test - any idea how it came to be, if it’s
invalid? Did we produce such bitcode in the past and don’t anymore -
what’s the rule about backwards compatibility here, then? It seems
like any time we regenerate a bitcode compat test that’s questionable
because it means we won’t be compatible with that bitcode we said we
should be compatible with?

I didn’t answer the bitcode part: Adding the Verifier check is the bitcode compatibility. If an AssertDI fires, the debug info is stripped from the CU with a warning about invalid debug info. We don’t need be implement a bitcode upgrade for a buggy version of clang, particularly not in this case, since I also cherry-picked the bugfix to that same branch.

– adrian

First — thanks for fixing the test for me!

I’m a bit curious about the test - any idea how it came to be, if it’s
invalid? Did we produce such bitcode in the past and don’t anymore -
what’s the rule about backwards compatibility here, then? It seems
like any time we regenerate a bitcode compat test that’s questionable
because it means we won’t be compatible with that bitcode we said we
should be compatible with?

I think I mentioned in the commit message — we had a short period in clang with a bug

Clang right before https://reviews.llvm.org/D79967 would generate this kind of broken IR.

and the swift-5.3 branch happened to be cut in that unlucky window. To trigger it you need to markup a function with NoDebug (the bug was introduced maybe a month earlier).

(snipped from other email):

I didn’t answer the bitcode part: Adding the Verifier check is the bitcode compatibility. If an AssertDI fires, the debug info is stripped from the CU with a warning about invalid debug info. We don’t need be implement a bitcode upgrade for a buggy version of clang, particularly not in this case, since I also cherry-picked the bugfix to that same branch.

Ah, OK, all makes sense. So invalid DWARF isn’t invalid IR/rejected, it’s just dropped (knew that on some level, but hadn’t put it all together properly - appreciate you covering it), so some quirky IR in the past we can accept is buggy rather than just old and drop rather than handle.

We have two kinds of DISubprograms:

  • uniqued DISubprograms are part of the type system and describe function types. They don’t have a unit: field, because they want to be uniqued across compile units.
  • distinct DISubprograms describe one specific function definition. They belong to a compile unit and thus have a unit: field.

You are only supposed to attach distinct DISubprograms to function definitions. Several places in CodeGen will unconditionally dereference the unit: field and crash if it isn’t there.

But do they have to be distinct? They have different fields (as you
say, declaration DISubprograms don’t have a “unit:” field, definition
DISubprograms do have a unit: field that refers to a (itself distinct)
DICompileUnit)

IIRC, DIBuilder lets you either create a distinct one with a unit or a uniqued one without.

I’m with you on the implementation, but trying to understand the justification.

I don’t know if the backend would survive the same uniqued DISubprogram node being attached to multiple llvm::Functions or if that would invalidate some assumptions… In any case distinct seems to be a reasonable proxy for what we mean.

Right, so maybe the clearer question would be: What would go wrong if DISubprogram wasn’t distinct.

I don’t think distinct stops a DISubprogram from being referred to from multiple llvm::Functions at the same time - it just stops the IR linker from deduplicating the DISubprograms when IR linking, I think? But since the DISubprograms refer to the DICompileUnit which is distinct, so the DISubprograms wouldn’t be the same (they’d be referring to different DICompileUnits) so they wouldn’t be deduplicated across IR modules when IR linking.

Perhaps marking them as distinct saves some link time by not requiring the IR Linker from /trying/ to merge them? So the generalization of that observation would be “any IR that refers to a distinct node can and should (but doesn’t have to be) distinct itself to simplify/speed up IR linking, because it can’t ever be deduplicated anyway”?