I’ve been enjoying the use of constructor homing for a while, and it’s extremely effective at reducing the size of .debug_info in things like clang. I saw recently (rG517bbc64dbe4 CC @dblaikie ) that the “old” mode (pre-LLVM-13) of vtable homing is going to go away. I don’t feel this is desirable; I believe the “old” mode delivers a lot of value.
We (the Sony bunch making the PlayStation compiler) have worked through a few issues with partners whose workflows don’t work with constructor homing, causing variables to have incomplete types. The most common issue is serialised classes – an object is written to disk in some format, and is then automatically reconstructed later in memory without actually calling a constructor. This obviously breaks constructor homing, and is a reasonably common design pattern in games. Other issues have involved pointer casting, and constructor homing won’t describe objects returned from closed source libraries. The solution so far has been to turn constructor homing off, and use vtable homing.
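To make the serialisation case concrete, here is a minimal sketch. `PlayerState`, `save` and `load` are hypothetical names invented for illustration, and a byte buffer stands in for the file on disk. The point is that `load` hands back a usable object without ever naming a constructor, so a translation unit containing only the loading side gives constructor homing nothing to hang the full type description on.

```cpp
#include <cstring>

// Hypothetical game type. Because it has a user-provided constructor, it is a
// candidate for constructor homing: the full type description is expected
// only where that constructor is emitted.
struct PlayerState {
    PlayerState(int health, int score) : health(health), score(score) {}
    int health;
    int score;
};

// Copies the raw object bytes into `out`, standing in for a write to disk.
inline void save(const PlayerState& state, unsigned char* out) {
    std::memcpy(out, &state, sizeof(PlayerState));
}

// Rebuilds the object by copying raw bytes into caller-provided storage and
// casting the storage pointer. No PlayerState constructor runs here.
// `storage` is assumed to be suitably aligned and large enough.
inline const PlayerState* load(void* storage, const unsigned char* in) {
    std::memcpy(storage, in, sizeof(PlayerState));
    return static_cast<const PlayerState*>(storage);
}
```

This mirrors the pattern rather than endorsing it; whether a given codebase's variant is strictly well defined is a separate question from whether the debugger can show the type.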
I’m not going to argue that there’s some great reason for keeping vtable homing available. Instead I’d like to point out that the underlying principles of constructor homing, which I believe are:
All the translation units in a program are compiled with the same command line switches,
The lifetime of every C++ object is always completely well defined,
aren’t always true for everyone (closed source code; codebases that cast). Having the option to fall back to vtable homing when your code isn’t ideal provides significant value.
If those principles aren’t true of a code base, as far as I understand the only other option is to use -fstandalone-debug, or suffer incomplete types. Running a quick experiment on a game code base, -fstandalone-debug makes .debug_info eight times larger than with vtable homing. That’s an unacceptably large increase for most developers.
I’d much prefer to keep vtable homing available as a pragmatic trade-off for scenarios where those principles don’t hold, to avoid having to pay the price of -fstandalone-debug. It seems unlikely to me that we’re the only industry who compile with closed-source code or have codebases with a few casts in them. Having compiler switches to support common workflows is well worth the extra code IMHO when the general solution (standalone) is so costly.
This has the feel of C code being compiled as C++, which is a real thing in the wild. I’d expect all sorts of checkpoint-restart kinds of situations to run into this. Although one might argue that the code should be using placement new, or something.
I don’t see any way to make ctor homing work for these.
I appreciate that there are definitely valid use cases that don’t work with constructor type homing, and that standalone debug info is far too expensive for most medium-sized C++ codebases. Even though standalone debug is the default mode on Apple platforms, we have to opt Chrome out of that because it is too expensive. Before we decide to keep or bring back vtable homing, I wanted to explore some of the alternatives.
Remove the constructors: If nobody calls the constructor, it is not needed, and maybe it can be removed. Constructor type homing does not apply to such types.
Mark some constructors constexpr: I believe types with a constexpr constructor are excluded by the heuristic, since it means the type can be constructed in a way that doesn’t trigger the emission of a constructor.
[[clang::standalone_debug]]: As an escape hatch, developers can apply this attribute to any types with missing type information as a workaround.
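The three alternatives above could look roughly like this in source. All type names and the `DEMO_STANDALONE_DEBUG` macro are made up for the example; the macro is only there so the snippet still compiles on compilers that don't know the Clang attribute. Whether each variant actually changes the emitted debug info depends on the Clang version, so treat this as a sketch to check with llvm-dwarfdump rather than a guarantee.

```cpp
// Expands to the Clang attribute where it is recognised, and to nothing
// elsewhere (hypothetical helper macro, not part of any library).
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::standalone_debug)
#    define DEMO_STANDALONE_DEBUG [[clang::standalone_debug]]
#  endif
#endif
#ifndef DEMO_STANDALONE_DEBUG
#  define DEMO_STANDALONE_DEBUG
#endif

// Alternative 1: no user-declared constructor at all, so there is no
// constructor for the heuristic to home the type on.
struct TileCoord {
    int column;
    int row;
};

// Alternative 2: a constexpr constructor, which (per the discussion above)
// is believed to exclude the type from the heuristic.
struct Vec2 {
    constexpr Vec2(float x, float y) : x(x), y(y) {}
    float x;
    float y;
};

// Alternative 3: the attribute escape hatch, asking for the full type
// description wherever the type is used.
struct DEMO_STANDALONE_DEBUG SaveHeader {
    SaveHeader(unsigned version, unsigned payloadBytes)
        : version(version), payloadBytes(payloadBytes) {}
    unsigned version;
    unsigned payloadBytes;
};
```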
Regardless of whether we keep vtable homing around, I think there’s a lot of value to the size gains from constructor type homing, and I hope that we can retain it as the default mode. We will need to document these techniques better or we are going to have a lot of frustrated developers.
Speaking of documentation, I think we need to go back and document the assumptions that constructor type homing makes. We have the blog post, but this optimization is clearly not that transparent, and it relies on a common-sense understanding of how objects are constructed, rather than requirements from the standard. I remember discussing what the standard requires, and what types of objects can be constructed by type cast (standard layout? trivial? not sure), and there was some rationale which was not written down.
FWIW, I think the only way to disable ctor homing is currently a -cc1 flag, so it’s not a supported interface. Though I’m guessing the code hanging around has meant that Sony folks have kept non-ctor-homing as the default, or have users use the cc1 flag?
I will say there’s no particular rush to remove ctor homing; I don’t think it’s adding a ton of technical debt. In retrospect (& it was probably my own suggestion/mistake) it probably would’ve been easier to maintain as a totally separate flag that’s a tweak to how limited debug info behaves, which would be lower cost to add and remove.
But, yeah, maybe writing up some docs for your users summarizing the sort of things @rnk mentioned (we could write something up about -fno-standalone-debug in general discussing the various homing strategies, pitfalls, and workarounds) & rolling forward, pointing users to that documentation when they hit any cases where -fstandalone-debug fixes an issue.
Or invented our own downstream option for them to use. I believe we have ctor homing as the default, which works for most of our users, but some have run into these cases.
I’ll note that the suggested tweaks to sources in order to improve debug info imply that the devs can change whatever they need to, which in the case of third-party libraries isn’t necessarily straightforward. (Consider what would be required for tweaking a gcc stl header, for example. Maintaining a local modified copy, making sure updates to the original are propagated correctly, … it’s a headache. “Use this handy command-line option” is rather easier.)
Indeed, we’ve provided an “escape-hatch” for people encountering difficulties with constructor homing, as we weren’t completely certain it would work in all scenarios.
We’re working through a few of these now, cheers – there’s a reasonable chance they won’t be palatable to our developers though. As far as I understand it, memory allocation is a serious cost when hitting performance targets in game code, and as a result the lifetime of objects is heavily customised to avoid calling malloc/new. That customisation then means there’s a lot of code in many source files that needs updating or decorating with attributes due to constructor homing, and no automated way of identifying such classes.
Seeing how -fno-use-ctor-homing is for -cc1 and not part of the driver interface, I’ll take a look at whether the diff for vtable homing can be reduced further. That’s beneficial for anyone that continues to support vtable homing.
Actually - while the technology hasn’t been built/streamlined, it would be possible/not too expensive (in both implementation complexity and time to execute such a tool) to identify these cases. That’s the sort of work we did when evaluating the accuracy of ctor homing.
Basically if you can build the entire project with -fdebug-types-section then you can use llvm-dwarfdump’s --summarize-types flag, which prints a list of the types in type units (their DW_AT_name value, if any, their signature, and the length of the unit). Either take that from the final linked program or dwp (if using split DWARF), or merge the results from all object files. Take this, diff it from with/without ctor homing, and you’ll see which types got lost due to ctor homing.
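The final "diff" step can be sketched as a set difference. This assumes the type names have already been pulled out of the two `--summarize-types` listings into plain sets of strings; the parsing is left out because the exact output format isn't reproduced here, and `typesLostToCtorHoming` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Returns, in sorted order, every type name that appears in the build
// without ctor homing but is missing from the build with ctor homing.
std::vector<std::string> typesLostToCtorHoming(
    const std::set<std::string>& withoutCtorHoming,
    const std::set<std::string>& withCtorHoming) {
    std::vector<std::string> lost;
    // std::set iterates in sorted order, which std::set_difference requires.
    std::set_difference(withoutCtorHoming.begin(), withoutCtorHoming.end(),
                        withCtorHoming.begin(), withCtorHoming.end(),
                        std::back_inserter(lost));
    return lost;
}
```

Comparing by name alone is a simplification: unnamed types and distinct types that share a name would need the type signature as well.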
If you wanted a low-cost way to address the issue, you could then go add the [[clang::standalone_debug]] attribute to all the types on the list. Yes, third party libraries present some problems - though even MSVC’s STL has been willing to fix situations where their use of “benign” UB has led to missing type information due to type homing in clang. If it’s worth something more client-side, I guess we could consider an external feature: either a list of types in a file that should be treated as standalone, or something like (or exactly) the Apple feature I’m thinking of (possibly I’m imagining/misremembering). I think they built some way to add attributes to entities extrinsically from an external file, and I think it generalizes over any attributes, allowing a consumer to attribute third party code without modifying it.
Apologies if this is pedantic, but we can separate calling the constructor from calling malloc/new. The constructor heuristic should work if programmers use placement new, construct stack objects, or use custom allocators, as long as they call the constructor somewhere in the program. The heuristic won’t work if programmers use memory mapped I/O techniques or type casts to construct objects, and I understand that that’s exactly what they are doing.
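As an illustration of that distinction, here is a small fixed-capacity pool that never allocates on the heap but still runs the constructor through placement new. `Particle` and `ParticlePool` are hypothetical names for the example; the expectation, following the reasoning above, is that the constructor call inside `create` is what lets the heuristic find a home for the type.

```cpp
#include <cstddef>
#include <new>

// Hypothetical game object with a user-provided constructor.
struct Particle {
    Particle(float x, float y) : x(x), y(y) {}
    float x;
    float y;
};

// Storage is reserved inside the pool itself, so creating a particle never
// touches malloc or the global allocator.
class ParticlePool {
public:
    // Constructs a particle in the next free slot; returns nullptr when full.
    Particle* create(float x, float y) {
        if (count_ == kCapacity) return nullptr;
        void* slot = storage_ + count_ * sizeof(Particle);
        ++count_;
        // Placement new: a real constructor call without a heap allocation.
        return ::new (slot) Particle(x, y);
    }

    std::size_t size() const { return count_; }

private:
    static constexpr std::size_t kCapacity = 4;
    alignas(Particle) unsigned char storage_[kCapacity * sizeof(Particle)];
    std::size_t count_ = 0;
};
```

A loader that instead casts a mapped buffer to `Particle*` would skip the constructor entirely, which is the case the heuristic cannot see.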
I also recall that Apple engineers implemented this API notes feature. Maybe there is some way to use that to apply the standalone_debug attribute to types without modifying the source. I’ve never tried, though, and this is quite complex compared to using the old heuristic. I consider it an option for power users who care about build time and debug info size, and who want to use the new heuristic when possible.
Anyway, developers should have options, and they should be documented. I think it could be worth keeping the old vtable homing code if you still feel it’s useful, but if these various approaches work for people, it’d be nice to nudge them into the new system so we can avoid accumulating heuristics indefinitely.