Always running Regular LTO before ThinLTO?

While debugging a ThinLTO issue, I noticed that passes were running more often than I expected. The link is being invoked pretty normally, something like:
$ clang++ -fuse-ld=lld -flto=thin

Looks like at 1 we run regular LTO first, before running ThinLTO. Removing the first line there takes the link from 208s to 115s. Is this expected behavior?

Ah, I messed something up locally: removing the regular LTO run causes undefined hidden symbol errors for vtables.
The runRegularLTO() name is confusing though; it apparently sets up some things that are necessary for ThinLTO.

Looks like the secondary split LTO units (e.g. due to -fwhole-program-vtables) are being merged into one large module, which is then monolithically LTO’d. That is where the extra passes were being run.

splitAndWriteThinLTOBitcode():

// Mark the merged module as requiring full LTO. We still want an index for
// it though, so that it can participate in summary-based dead stripping.

Right. Note you can run WPD (whole program devirtualization) in ThinLTO-only mode by passing -fno-split-lto-unit. That mode is a little less powerful in terms of optimization (it can’t do virtual constant propagation, although I’m not sure how often this kicks in), and it doesn’t support CFI. Internally, for our ThinLTO builds when performing WPD for optimization (i.e. without CFI), we disable the splitting for compile time scalability reasons.
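For anyone following along, here is a minimal sketch of the kind of code the two WPD modes differ on. It is not taken from the code under discussion: the class names, function names, and the file name wpd_demo.cpp are all made up for illustration, and -fvisibility=hidden in the build lines is an assumption about what a small test like this needs in order for the classes to have hidden LTO visibility. Whether a given call is actually optimized depends on the whole program seen at link time.

```cpp
// Hypothetical example for illustration only (names are not from the thread).
//
// Possible build lines, assuming a file named wpd_demo.cpp that also has a main():
//   split LTO units (vtable data goes to the regular LTO part):
//     clang++ -O2 -fuse-ld=lld -flto=thin -fvisibility=hidden \
//             -fwhole-program-vtables wpd_demo.cpp
//   ThinLTO-only WPD (no split, no virtual constant propagation, no CFI):
//     clang++ -O2 -fuse-ld=lld -flto=thin -fvisibility=hidden \
//             -fwhole-program-vtables -fno-split-lto-unit wpd_demo.cpp
#include <cstddef>

struct Shape {
  virtual ~Shape() = default;

  // Overridden by every subclass with a function that just returns a
  // constant: the sort of call virtual constant propagation is aimed at.
  virtual int sides() const = 0;

  // Only ever implemented here: a candidate for single-implementation
  // devirtualization once the whole program is visible.
  virtual int scaled(int k) const { return k * sides(); }
};

struct Square final : Shape {
  int sides() const override { return 4; }
};

struct Triangle final : Shape {
  int sides() const override { return 3; }
};

// Calls through the base pointer, so without whole-program information the
// compiler has to emit an indirect call via the vtable for sides().
int totalSides(const Shape *const *shapes, std::size_t n) {
  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += shapes[i]->sides();
  return total;
}
```

Calling totalSides() on an array holding one Square and one Triangle gives 7 either way; the flags only change how the virtual calls are lowered, not what the program computes.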

Teresa