I’m looking at cases of disagreement between assertions in the backend and what the Verifier checks for (for http://reviews.llvm.org/D16083). One of these seems to be the question of whether a DISubprogram may exist without a DICompileUnit. Currently the Verifier doesn’t care, but there are a few assertions to this effect in the backend.
Possible options are:
- A DISubprogram may exist without a DICompileUnit (what would we have to change in the backend?)
- A DISubprogram’s scope may be null, but only if there’s a DICompileUnit that references it.
- A DISubprogram’s scope may not be null, but the referenced DICompileUnit need not necessarily contain it.
- A DISubprogram’s scope may not be null, and the referenced DICompileUnit must contain it.
As far as I can tell, all of these would need some amount of changes to either existing code or existing test cases.
Looking at this again, I think I was under the mistaken impression that the DISubprogram’s scope would have to be the compile unit, but it seems we also currently allow it to be a DIFile (or any other kind of scope). In that case, I suppose the behavior that the backend expects is that every DISubprogram is referenced from some DICompileUnit contained in llvm.dbg.cu. I’ll implement a Verifier check to that effect.
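To make the intended invariant concrete, here is a minimal sketch of the check in Python. It uses a toy model, not the real LLVM classes: `CompileUnit`, `Subprogram` and `unreferenced_subprograms` are hypothetical names made up for illustration, and `dbg_cu` stands in for the list of compile units rooted in llvm.dbg.cu.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality and hashing, like metadata nodes
class Subprogram:
    name: str


@dataclass(eq=False)
class CompileUnit:
    name: str
    # Toy stand-in for the compile unit's list of subprograms.
    subprograms: list = field(default_factory=list)


def unreferenced_subprograms(dbg_cu, all_subprograms):
    """Return the subprograms that no compile unit in dbg_cu lists."""
    referenced = {sp for cu in dbg_cu for sp in cu.subprograms}
    return [sp for sp in all_subprograms if sp not in referenced]


if __name__ == "__main__":
    foo, bar = Subprogram("foo"), Subprogram("bar")
    cu = CompileUnit("a.c", [foo])
    # bar is not listed by any compile unit, so the check would flag it.
    print([sp.name for sp in unreferenced_subprograms([cu], [foo, bar])])
```

In this model, a module whose llvm.dbg.cu is empty fails the check for every subprogram it contains.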
Two things I remember:
1. There's a difference between subprogram definitions and subprogram declarations. Declarations seem to just be part of the type hierarchy.
2. Subprogram definitions need to be referenced from the compile units.
IMO this link (2) should be reversed: we should reference compile units from subprogram definitions (through a new cu: field) and drop the reference in the other direction. This would be far more convenient for LTO (or llvm-link in general), but it causes a semantic change and I never worked through the implications of that.
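As a rough illustration of the reversed direction, the sketch below gives each subprogram definition its own reference to a compile unit and recovers the per-unit grouping by walking the subprograms. This is a toy model in Python, not the LLVM API: the class names and `subprograms_by_unit` are hypothetical, and the `cu` attribute stands in for the proposed cu: field.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class CompileUnit:
    name: str  # note: no list of subprograms in this direction


@dataclass(eq=False)
class Subprogram:
    name: str
    cu: CompileUnit  # hypothetical stand-in for the proposed "cu:" field


def subprograms_by_unit(subprograms):
    """Group subprogram names by the compile unit each one points at."""
    grouped = {}
    for sp in subprograms:
        grouped.setdefault(sp.cu.name, []).append(sp.name)
    return grouped


if __name__ == "__main__":
    a, b = CompileUnit("a.c"), CompileUnit("b.c")
    sps = [Subprogram("foo", a), Subprogram("bar", b), Subprogram("baz", a)]
    print(subprograms_by_unit(sps))
```

In this model, dropping a subprogram only means dropping it from the list being walked; no compile unit has to be edited. Whether that carries over to the real metadata is exactly the part of the implications that has not been worked through.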
I'd probably be OK with that.
Though I will note that I don't /think/ (2) is true right now, and I think
we actually violate (2) intentionally, perhaps.
Diego: To include line/col info for backend diagnostics, we produced debug
info but omitted the llvm.dbg.cu entry, right? So there are subprogram
debug info descriptions that are not referenced from a CU in llvm.dbg.cu,
yes?
Yes, that and for sample PGO. Omitting llvm.dbg.cu prevents codegen from
emitting all that debug info to the final binary.
In the patch comments I suggested adding a separate named metadata node to root compile units that you want in the IR, but not emitted into the binary. I assume that would work for you?
Duncan, do you like that approach?
Absolutely. It's, in fact, preferable. The current approach is fairly
hacky.
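The separate-root idea discussed above can be sketched with a toy model in Python. Everything here is an assumption made for illustration: named metadata is modelled as a plain dict, compile units as strings, and `SIDE_ROOT` is a made-up placeholder, since the thread does not say what the second named metadata node would be called.

```python
EMITTED_ROOT = "llvm.dbg.cu"
# Hypothetical placeholder name; the real name is not given in the thread.
SIDE_ROOT = "example.dbg.cu.noemit"


def units_to_emit(named_metadata):
    """Compile units that codegen would emit in this model: only those
    rooted in llvm.dbg.cu."""
    return list(named_metadata.get(EMITTED_ROOT, []))


def units_rooted_anywhere(named_metadata):
    """Compile units reachable from either root, i.e. everything a check
    could treat as properly rooted in this model."""
    return (list(named_metadata.get(EMITTED_ROOT, []))
            + list(named_metadata.get(SIDE_ROOT, [])))


if __name__ == "__main__":
    module = {EMITTED_ROOT: ["a.c"], SIDE_ROOT: ["b.c"]}
    print(units_to_emit(module))
    print(units_rooted_anywhere(module))
```

Under these assumptions, debug info kept only for diagnostics or sample PGO would hang off the second root: it stays reachable in the IR but is skipped when emitting the binary.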