[DebugInfo] The current status of debug values using multiple machine locations

Following the previous discussion on the mailing list[0], I have been writing a series of patches that implement the proposed instructions[1], enabling multi-location debug values (debug value lists) in LLVM. Although the patches are still in review, the basic implementation is finished (except for the removal and replacement of the old DBG_VALUE instruction, as discussed on the mailing list[2]).

Given below is the change in debug info output for CTMark (from the LLVM test-suite), measured before and after the debug value list patch:

Project               Available variables           PC bytes covered        
                      Old     New  Change         Old       New  Change
7zip                40252   40501   0.62%     7112336   7142255   0.42%
bullet              32655   33296   1.96%     6272034   6323049   0.81%
ClamAV               8795    8842   0.53%     5090634   5099634   0.18%
consumer-typeset     4354    4356   0.05%     3171498   3171605   0.00%
kimwitu++           30006   30177   0.57%     1736826   1755152   1.06%
lencod              14176   14319   1.01%     6123957   6177106   0.87%
mafft                6854    6859   0.07%    12045196  12046744   0.01%
SPASS               38477   38492   0.04%     3396246   3399668   0.10%
sqlite3             29479   30301   2.79%     7964547   8024747   0.76%
tramp3d-v4          91732  105588  15.10%     7925131   8106167   2.28%
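For anyone reproducing these figures, the "Change" columns are simple relative percentage deltas. A minimal sketch (with values copied from the table above):

```python
def pct_change(old: int, new: int) -> float:
    """Relative change of `new` over `old`, as a percentage rounded to 2 places."""
    return round((new - old) / old * 100, 2)

# Spot-checks against the table above.
print(pct_change(40252, 40501))      # 7zip, available variables -> 0.62
print(pct_change(7112336, 7142255))  # 7zip, PC bytes covered    -> 0.42
```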

As most of the patches have been approved, I am hopeful that the full set will be merged into main in the near future. Part of the purpose of this email is to give notice of the upcoming merge, as the changes are significant and may conflict with any private downstream changes that touch debug info. In terms of output, the patch should not change any existing variable locations; it should only add new locations for some variables. This may break tests that expect certain variables to be missing or optimized out, but should not be disruptive otherwise. If you want to test this patch - whether to benchmark compiler performance, gather DWARF statistics, or check for conflicts with private changes - there is a single patch comprising the entirety of the current work on Phabricator[3].

The other purpose of this email is to request further reviews on the patches: all but 5 have been accepted, and most of the remaining patches have been well-reviewed by now. Due to the size of the patches, there will likely be conflicts between them and any in-development debug-info work, creating extra work for any developers who need to update their patches to handle the new instruction. Landing the set promptly will also allow current and future work to take advantage of the new functionality to preserve more debug information.

Hi Stephen,

Is it possible to quantify this coverage in absolute terms, at least the PC bytes portion? It would be helpful to understand how close this is bringing us to 100% coverage, for example.

—Owen

It’s hard to know the upper bound on what’s possible.

e.g., code like this:

int x = f1();
f2(x);   // last use of x - its register may be reused after this call
f2(4);   // so no location may be able to describe x here

With optimized code, there’s no way to recover the value of ‘x’ during the second f2 call. We can compute an absolute upper bound that’s certainly unreachable - by looking at the scope of variables (assuming our scope tracking is perfect - which, it’s not bad, but can get weird under optimizations) and comparing the total scope bytes of variables with the bytes for which a location is described. We do have those two stats in llvm-dwarfdump --statistics. But generally we don’t bother looking at that, because the bound is a fair way off and because of the limitations of any such measurement, as I’ve described here. (We also don’t currently track where a variable’s scope starts - so the upper bound for “x” in “{ f1(); int x = …; f1(); }” includes both calls to f1, even though the location shouldn’t ever extend to cover the first f1 call.)
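A quick sketch of combining those two stats into the (loose) upper-bound comparison; note that the JSON key names below are an assumption modeled on llvm-dwarfdump --statistics output and have varied between LLVM releases, so check them against your toolchain:

```python
import json

# Assumed key names, modeled on llvm-dwarfdump --statistics JSON output;
# the exact strings have changed between releases.
SCOPE_KEY = "sum_all_variables(#bytes in parent scope)"
COVERED_KEY = "sum_all_variables(#bytes in parent scope covered by DW_AT_location)"

def scope_coverage(stats_json: str) -> float:
    """Fraction of in-scope bytes for which some location is described -
    the (unreachable) upper bound treats this as 1.0."""
    stats = json.loads(stats_json)
    return stats[COVERED_KEY] / stats[SCOPE_KEY]

# Illustrative numbers only, not from a real run.
sample = json.dumps({SCOPE_KEY: 1000, COVERED_KEY: 550})
print(scope_coverage(sample))  # 0.55
```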

As David has said, the coverage % is not an especially meaningful number in general because we do not have a general method of determining the true upper bound of coverage for an optimised program. To answer your question as best I can though, here are the coverage numbers:

Project           Variable availability         PC ranges covered
                    Old     New   Delta         Old     New   Delta
7zip              79.48%  80.01%   0.53%      60.55%  60.73%   0.18%
bullet            44.57%  45.21%   0.65%      55.55%  56.00%   0.46%
ClamAV            88.89%  89.37%   0.48%      53.48%  53.57%   0.09%
consumer-typeset  91.62%  91.44%  -0.19%      32.48%  32.48%   0.00%
kimwitu++         68.34%  68.75%   0.41%      69.12%  69.82%   0.70%
lencod            89.86%  90.77%   0.91%      48.41%  48.83%   0.42%
mafft             89.26%  89.14%  -0.12%      57.89%  57.89%   0.00%
SPASS             83.23%  83.26%   0.03%      52.61%  52.66%   0.05%
sqlite3           73.55%  75.62%   2.07%      51.59%  52.01%   0.43%
tramp3d-v4        54.77%  63.04%   8.27%      66.67%  68.16%   1.49%

These numbers are not high resolution - the change is simply the difference of the rounded “old” and “new” numbers. Notably, the variable availability for some of the projects has actually gone down, as we have more variables being emitted to DWARF with 0% coverage (the DWARF emission of variables with 0% coverage is an issue in itself, but not one introduced or fixed by this patch). The PC bytes numbers are also slightly misleading, as the % is calculated as “the sum of PC bytes covered for each variable” divided by “the sum of PC bytes in the parent scope for each variable”. This means that if, for example, we doubled the number of variables covered by the program but all of the new variables had slightly lower average coverage than the old variables, we would see this number decrease despite the clear increase in actual coverage.
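To make that pitfall concrete, here is a toy sketch of the "sum of covered bytes / sum of scope bytes" calculation (the byte counts are made up for illustration, not taken from any real project):

```python
def aggregate_coverage(variables):
    """variables: list of (covered_bytes, scope_bytes) pairs.
    Returns total covered bytes divided by total scope bytes."""
    covered = sum(c for c, _ in variables)
    scope = sum(s for _, s in variables)
    return covered / scope

old = [(80, 100)] * 10         # 10 variables, each 80% covered
new = old + [(60, 100)] * 10   # add 10 new variables at 60% coverage

print(aggregate_coverage(old))  # 0.8
print(aggregate_coverage(new))  # 0.7 - a lower %, despite 600 more bytes covered
```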

As you can see, these numbers aren’t as helpful as we’d like - for example, we could easily hit 100% coverage by choosing not to emit any variables that don’t have a location for their entire scope, but this would not translate to a better debug experience. We could compare the number of available variables with the program at O0, but this also does not work out as it might first seem, because optimizations can increase the number of variables by inlining functions; for all of these projects, the number of variables at O2 is several times larger than the number at O0.

Hopefully this explains why comparing the raw variable counts and PC bytes covered is, as far as I can tell, the best way of measuring the actual change in debug quality with and without the patch.


Hmm, that seems like a somewhat unhelpful statistic - when you say “more variables being emitted to DWARF with 0% coverage” - what do you mean by that? Are we counting a variable with no location attribute as being 100% covered, because it isn’t partially covered? Could we instead count such variables as 0% covered?


Not exactly, it’s that we aren’t considering them at all. If we have 2 variables a and b with 100% and 50% coverage respectively, we’d have 75% total coverage (assuming they both occupy identically sized scopes). If a new optimization causes us to drop b’s coverage to 0%, then we would expect the overall coverage to drop to 50%. However, we may end up dropping b entirely from the DWARF, in which case the dwarfdump statistics see that we only have 1 variable and it has 100% coverage, giving us 100% overall coverage. While it would be accurate to count these missing variables as 0% covered, if there’s no mention of them in the DWARF then there is no way for dwarfdump to know they exist.
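A toy version of that calculation (equal-sized scopes assumed, as above, and the per-variable coverage fractions are made up for illustration):

```python
def dwarfdump_style_coverage(variables):
    """Average coverage over only the variables present in the DWARF,
    assuming every variable's scope is the same size."""
    return sum(variables.values()) / len(variables)

before = {"a": 1.0, "b": 0.5}
print(dwarfdump_style_coverage(before))  # 0.75

# If b is dropped from the DWARF entirely, the statistic *improves*,
# because dwarfdump has no way to know b ever existed.
after = {"a": 1.0}
print(dwarfdump_style_coverage(after))   # 1.0
```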

I’m not actually sure what causes variables to be dropped from the DWARF entirely, as opposed to them existing but having an unknown location for their entire scope; however, setting aside our desire to use dwarfdump to analyze our debug info, it’s simply more efficient to omit variables with no location, since they inflate the debug info size and I don’t believe there’s any practical value in having them.


Not exactly, it’s that we aren’t considering them at all. If we have 2 variables a and b with 100% and 50% coverage respectively, we’d have 75% total coverage (assuming they both occupy identically sized scopes). If a new optimization causes us to drop b's coverage to 0%, then we would expect the overall coverage to drop to 50%. However, we may end up dropping b entirely from the DWARF,

When does this ^ happen? In optimized builds we include all local variables in a “variables” attachment to the DISubprogram, so we shouldn’t be losing variables entirely.

in which case the dwarfdump statistics see that we only have 1 variable and it has 100% coverage, giving us 100% overall coverage. While it would be accurate to count these missing variables as 0% covered, if there’s no mention of them in the DWARF then there is no way for dwarfdump to know they exist.

I’m not actually sure what causes variables to be dropped from the DWARF entirely, as opposed to them existing but having an unknown location for their entire scope; however, outside of our desire to use dwarfdump to analyze our debug info it’s simply more efficient to omit variables with no location, since they inflate the debug info size and I don’t believe there’s any practical value in having them.

I think it’s pretty important that we keep them. It helps a user understand that they’ve not mistyped the name of a variable, etc - that a variable is there but has no location is distinct from it not being there at all. A more marginal motivation is that omitting variables can also produce even more surprising behavior in the case of shadowing: if a shadowing variable is omitted entirely, when the user tries to print it they might get the shadowed variable and mistake it for the shadowing one. Admittedly, DWARF doesn’t make much of an effort to reproduce shadowing perfectly - functions and other entities that are never referenced are generally omitted and can produce this kind of shadowing/incorrect lookup behavior - so it’s a marginal motivation.

- Dave