Dropping debug info for base classes with pure virtual functions

(+llvmdev)

(+llvmdev)

Hi David,

I just bisected a debug info problem we were experiencing down to your legendary commit:
  Revert "Revert "Revert "Revert "DebugInfo: Omit debug info for dynamic classes in TUs that do not have the vtable for that class"”""

For the following testcase, we don’t emit any debug info whatsoever for the base class A (and only a forward decl for B, but that’s not relevant to my particular problem).

The "we only emit a forward declaration of B" is sort of relevant to this issue, because without a definition of "B" we have no reason to even emit any reference to "A" at all.

> class A
> {
> virtual bool f() = 0; // This line is the culprit.
> int lost;
> };
> class B : public A
> {
> B *g();
> };
> B *B::g() {}
>

My question is: given the commit message — is this intended behavior (based on the assumption that a full definition of A will be in a different module *) or would you consider this a bug, too?

This is intended behavior.

*) This assumption does actually not hold for our problem, but I will spare the gory details for now.

It might not hold because you haven't built every part of your program with debug info - yes, that is a known limitation.

If it is intended behavior we can try to fix it only in the -fno-limit-debug-info path.

Please refer to the previous thread where Manman brought up a similar concern and Eric, Manman, and myself discussed it - it might be worth picking up from that conversation rather than restarting from scratch.

Personally I believe it better not be intended (we still do emit debug info if the virtual function is commented out),

That we do emit debug info if the virtual function is commented out doesn't indicate that this is a bug.

The basic premise for the optimization is that if the type is used, its vtable must be emitted /somewhere/ so we only emit the debug info for the type where we emit the vtable information for the type. GCC already implements this optimization, so even if you compile half your code with Clang and half with GCC (or, say, use a GCC-built library such as the standard library) this would still work. It won't work if you compile half your program without debug info.

So that's the premise. This has a hard justification when we're dealing with a concrete type - in C++ all non-pure virtual functions must be defined somewhere in the program. Any program that doesn't meet this requirement has undefined behavior.

Thanks for the clarification!

We’re trading lookup time for DWARF size, and this works based on the assumption that all modules in the project are built with debug info.
So if, e.g., a user wants to inherit a class defined in a 3rd-party C++ library that comes prebuilt without debug info, they now can’t inspect the base class in the debugger any more?
This sounds like there should be the option to turn off this optimization for this scenario.

But this isn't true of pure virtual functions such as your example. So while it's not a C++ undefined behavior issue, it's still a practical one. Once the program actually has a use of these abstract base classes, their vtables will be emitted and so will the debug info. Until then you have no use of the type and thus no need for the debug info for it.

I see, that is consistent with the above premise.

-- adrian

> (+llvmdev)
>
>
> Hi David,
>
> I just bisected a debug info problem we were experiencing down to your
legendary commit:
> Revert "Revert "Revert "Revert "DebugInfo: Omit debug info for dynamic
classes in TUs that do not have the vtable for that class"”""
>
> For the following testcase, we don’t emit any debug info whatsoever for
the base class A (and only a forward decl for B, but that’s not relevant to
my particular problem).
>
> The "we only emit a forward declaration of B" is sort of relevant to
this issue, because without a definition of "B" we have no reason to even
emit any reference to "A" at all.
>
> > class A
> > {
> > virtual bool f() = 0; // This line is the culprit.
> > int lost;
> > };
> > class B : public A
> > {
> > B *g();
> > };
> > B *B::g() {}
> >
>
> My question is: given the commit message — is this intended behavior
(based on the assumption that a full definition of A will be in a different
module *) or would you consider this a bug, too?
>
> This is intended behavior.
>
> *) This assumption does actually not hold for our problem, but I will
spare the gory details for now.
>
> It might not hold because you haven't built every part of your program
with debug info - yes, that is a known limitation.
>
> If it is intended behavior we can try to fix it only in the
-fno-limit-debug-info path.
>
> Please refer to the previous thread where Manman brought up a similar
concern and Eric, Manman, and myself discussed it - it might be worth
picking up from that conversation rather than restarting from scratch.
>
> Personally I believe it better not be intended (we still do emit debug
info if the virtual function is commented out),
>
> That we do emit debug info if the virtual function is commented out
doesn't indicate that this is a bug.
>
> The basic premise for the optimization is that if the type is used, its
vtable must be emitted /somewhere/ so we only emit the debug info for the
type where we emit the vtable information for the type. GCC already
implements this optimization, so even if you compile half your code with
Clang and half with GCC (or, say, use a GCC-built library such as the
standard library) this would still work. It won't work if you compile half
your program without debug info.
>
> So that's the premise. This has a hard justification when we're dealing
with a concrete type - in C++ all non-pure virtual functions must be
defined somewhere in the program. Any program that doesn't meet this
requirement has undefined behavior.
>

Thanks for the clarification!

We’re trading lookup time for DWARF size, and this works based on the
assumption that all modules in the project are built with debug info.
So if, e.g., a user wants to inherit a class defined in a 3rd-party C++
library that comes prebuilt without debug info, they now can’t inspect the
base class in the debugger any more?

That's correct.

This sounds like there should be the option to turn off this optimization
for this scenario.

Possibly - I've bumped the original review thread for this commit and cc'd
you so you can see the backstory/previous discussion that was had with
Manman a few months ago. Let's continue the discussion there, if necessary.
(not sure if you have the whole thread in your email - you might need to
consult an archive for all the various tangets, etc, if you don't keep all
your old email)