Good day for everyone .
We benchmarked the code coverage algorithms like
a)Optimal Edge Profiling
(ftp://ftp.cs.wisc.edu/pub/techreports/1991/TR1031.pdf .) that are
adopted by GCC and LLVM
and i don't see the practical implementation of this.
Goal is to reduce the instrumentation point and when we benchmarked
both algorithms .The number of blocks that are required to instrument
is less in option-b over option-a.
Need some expert insights on this like why the option-b is not widely
used over option-a.
Clang uses a different approach with its -fprofile-instr-generate option. The instrumentation is based on source-level control flow constructs. Intuitively, for well structured code, it should get similar efficiency to techniques like the option (b) cited above, but I’m not aware of any quantitative comparisons. If you can provide any data on that comparison for real-world code, it would be very interesting to see how we’re doing.
My apologies if I shouldn't have sent this answer to all on this list, but since this is the first time I ever replied on any GCC/LLVM thread, I'm taking a guess. I'm finishing a Ph.D. in static analysis and I have lectured compiler courses in the last few years as well as implemented some instrumenters and can share some insights on your problem. However, I would first suggest you deeply read both references you gave because the devil clearly lies in the fine prints of those papers.
In my understanding, both techniques rely on the very intuitive fact that only certain nodes in a CFG are necessary to deduce the complete trace of a program (i.e., you can compress the information because of the natural redundancy of the CFG if you restrict desired information. For instance, if you want to know the value of every variable at every point in a program, chances are you will need more than if you just want to know the concrete flow). On the one hand, you have a technique that uses a maximum spanning-tree and on the other you have one that uses dominators (possibly extended to use post-dominators). If you are familiar with dominators, then it should be clear that both techniques directly or indirectly infers the interesting nodes from the predicates nodes (dominator trees actually split levels at nodes that are predicates most of the times). You would expect both to give approximately the same performance, but your benchmark says otherwise. Is it a contradiction ?
Careful reading of the ISSTA paper reveals that coverage of this algorithm is likely limited to block coverage, and cannot deduce edge coverage (not 100% sure on this one) and path coverage (much more sure about this one). When they do compare their work to that of Ball (reference 15 in the paper), they do not seem to say they do better, but simply that they do faster on block coverage. Highly likely that this algorithm is limited to statement coverage and frequencies. I didn't do the maths nor checked what specifically is missing from the dominator tree that makes path reconstruction impossible, but intuitively if I do not preserve any ordering in the increment of the counters, I will not be able to do path reconstruction. You should note that your solution-a in the provided reference is actually extended in that afore-mentioned reference 15.
TL;DR: Both techniques do not share the same coverage analysis power, so they perform differently. Also, Ball paper was published 10 years earlier, so time precedence may also play a role in that preference.
Feel free to discuss it more if you want.
P.S. "For each basic block to be instrumented we create a Boolean
variable which is initialized to false indicating that the block has
not yet executed. " Read section 4 of the ISSTA paper. You should convince yourself that if I only give you which nodes in a boolean fashion, path reconstruction is impossible, even in easy cases. The paper from Ball clearly solves the "tracing" problem, and not a subset of the "coverage" problem.