Awesome to read how it's coming along - I've mostly been on the sidelines of
the debug location work, but had just one or two clarifying questions.
Hi debug-info folks,
Time for another update on the variable location "instruction referencing"
implementation I've been doing, see this RFC [0, 1] for background. It's now at
the point where I'd call it "done" (as far as software ever is), and so it's a
good time to look at what results it produces. And here are the
scores-on-the-doors using llvm-locstats, on clang-3.4 RelWithDebInfo first in
"normal" mode and then with -Xclang -fexperimental-debug-variable-locations.
"normal":
=================================================
  cov%             samples percentage(~)
-------------------------------------------------
  0%                765406           22%
  (0%,10%)           45179            1%
  [10%,20%)          51699            1%
  [20%,30%)          52044            1%
  [30%,40%)          46905            1%
  [40%,50%)          48292            1%
  [50%,60%)          61342            1%
  [60%,70%)          58315            1%
  [70%,80%)          69848            2%
  [80%,90%)          81937            2%
  [90%,100%)        101384            2%
  100%             2032034           59%
-the number of debug variables processed: 3414385
-PC ranges covered: 61%
-------------------------------------------------
-total availability: 64%
With instruction referencing:
=================================================
  cov%             samples percentage(~)
-------------------------------------------------
  0%                751201           21%
  (0%,10%)           40708            1%
  [10%,20%)          44909            1%
  [20%,30%)          47544            1%
  [30%,40%)          41630            1%
  [40%,50%)          42742            1%
  [50%,60%)          56692            1%
  [60%,70%)          53796            1%
  [70%,80%)          64476            1%
  [80%,90%)          73836            2%
  [90%,100%)         74423            2%
  100%             2123749           62%
-the number of debug variables processed: 3415706
-PC ranges covered: 68%
-------------------------------------------------
-total availability: 64%
The first observation: a significant increase in the byte-coverage statistic,
meaning that we're able to track variable locations for longer and across more
code. This was one of the main aims of this work, having better tracking of
the locations that we know. The increase of seven percentage points includes an
additional two percentage points of entry-value locations. If we disable entry
value production then the scope-bytes-covered statistic moves from 59% to 64%,
Was this meant to be "from 64% to 59%"?
How does that compare to the baseline no-entry-value number?
Could you give a quick summary of the distinction between "PC ranges
covered" and "total availability"?
which is still a decent improvement.
The next observation is that the "total availability" of variables hasn't
changed. This isn't the full story -- if you give an absolute name to every
variable with a location in the clang binary, there are 6949 dropped locations
and 22564 completely new locations, meaning roughly 1% of all variables in the
program have changed; it's just hidden by the statistics rounding. More detail
on the nature of the changes is below. I was hoping for more false locations
to be dropped; it's quite likely that there are many more false locations
dropped within variables that have more than one value, which aren't readily
reflected in these statistics.
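
For anyone who wants to check the "roughly 1%" figure, here is a quick sketch
of the arithmetic in Python. The counts are the ones quoted above; using the
instruction-referencing "debug variables processed" total as the denominator is
an assumption on my part, and the variable names are purely illustrative.

```python
# Counts quoted in the paragraph above.
dropped_locations = 6949
new_locations = 22564

# Assumption: use the "debug variables processed" figure from the
# instruction-referencing run as the denominator.
total_variables = 3415706

# Variables whose location either disappeared or newly appeared.
changed = dropped_locations + new_locations
changed_share = 100.0 * changed / total_variables

# Net gain in variables that have a location at all.
net_new = new_locations - dropped_locations

print(f"changed: {changed} ({changed_share:.2f}% of variables)")
print(f"net new locations: {net_new}")
```

The changed share comes out a little under 1%, which is small enough to
disappear in whole-percent rounding.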
A natural question is: are all these new locations wrong, and the dropped
locations only dropped because of bugs? To address that, I picked 20 new
locations and 20 dropped locations at random and analysed why they happened.
The input samples can be found here [2], along with an llvm-reduce'd version of
each IR file. I confirmed the reason for the new/dropped location in the
reduced and original file, as llvm-reducing them can alter the reason why
something is dropped or not. Of the new locations, we previously could not
track the location because:
* 14 DBG_VALUEs come after the vreg operand is out of liveness and are dropped
by LiveDebugVariables.
* 2 DBG_VALUEs are out of liveness and dropped by RegisterCoalescing
out of conservativeness.
* 2 DBG_VALUEs appear before their operand is defined. These are out of
liveness; instruction referencing saves them by preserving debug
use-before-defs.
* 2 DBG_VALUEs that are out of liveness after a branch, but the value is live
down the other branch path.
All of these locations can be tracked with instruction referencing because
liveness is not a consideration, only availability in physical registers. 19 of
the new locations were correct, while one tracked the right value but picked
the wrong location for it, which I've now got a patch for.
For the dropped locations:
* 8 false locations are dropped: they used to refer to the wrong value because
of a failure in register coalescing, see the body of [3].
Would these issues ^ show up/be testable with Dexter?