[RFC] Region/Branch coverage by Bitmap

chapuni · June 17, 2024, 4:09am

This is under design.

Abstract

All data use bitmaps. Counters are not used.
Each conditional branch has at most two bits.
Each region has at least one bit and incoming branches.
- At most one bit is used for unconditional branch. Zero bit if incomings don’t have any unconditional branches.
- The result of a branch (either true or fail) is allocated into corresponding outcome.
- The aggregate (OR) of incomings (and an unconditional branch) represents the result of the region.
Branches may be unified to MC/DC coverage.
- MC/DC test vectors may cover branches under its umbrella for reduction.
- In “Strict branch mode”, Identical (width is 1) test vectors may be replaced to corresponding final condition for reduction. Branches are not omitted.

Problems and prerequisites

Some users are not interested in counters in branch/region. (In fact, MC/DC test vectors don’t have counters). They are interested in whether regions/branches are executed or not.
Incrementing counters will cost too much.
- Atomic increment is not cheap. It cannot be omitted. Even if increments would be hoisted out of the loop to reduce atomic increments, it wouldn’t resolve the root problem.
- On the multiprocessor system, updating the cache line will pay too much. It triggers broadcasting of update or flush to other processors. Counters always write.
  - The thing would go worse on NUMA architecture.
Unconditional write (implemented in “Single Byte Counters”) wouldn’t improve cache line problem.
- The cost would be reduced very much to introduce “Test or write”. The counter (actually Boolean flag) is irreversible to “not executed”.
  - “Test or write” will be better than “unconditional write”. This means it wouldn’t be the matter even if the operation is bitwise OR.
    - Higher cost of “Atomic OR to memory” will be hidden since its execution may be considered rare.

Design basics

Again, this is under design.

Each condition (True/False) on the branch has the destination region and the index in the region. E.g. “Region #X, subindex #Y”. The addressing can be reduced to one dimension with post processing.

Each region has at most one bit for unconditional branches. It shall be at most one bit if the number of incomings is greater than 1. This may be zero bit if the region doesn’t have any unconditional incomings. This is set to true at the top of the region.

Each region has at least zero bits as incoming conditional branches. They are set to true on the edge from the branch to the region, conceptually.

The length of each region is:

1 bit if the region has any unconditional incomings.
N bit for the number of incoming conditional branches.

Branch Coverage and MC/DC Test Vectors

Each test vector can determine a condition (True/False/DontCare) under the test vector. This allows branch coverage under MC/DC expressions to be omitted. The backend must reconstruct the branch coverage from the test vectors.

MC/DC coverage does not write results back to memory until all conditions are determined. This can result in unwritten results if a crash occurs midway through the evaluation of the expression. If users want to avoid it, we need to provide an option not to omit branch coverage on the MC/DC expression.

MC/DC test vectors cannot be reconstructed from corresponding branch coverages. A few test vectors can be reconstructed if it has “Width is 1”. So, "tests with w=1 can be omitted from the set of test vectors in this case.

Afterword

The motivation for doing this is to improve the efficiency of “continuous coverage mode”. I want to make it available on Linux by default.

Thank you.

evodius96 · June 17, 2024, 3:55pm

Thanks, @chapuni

Representing condition evaluations using bits and eliminating counters is definitely a plus. If it includes region coverage as well, you could use this for function and line coverage too (basically all coverage criteria), correct?

By “region” are you referring to coverage regions? i.e. a BranchRegion, part of CoverageMappingRegion that allows us to map the branch back to the source code? You’re proposing extending this to include special indices for a bitmap? When you can provide an example, that would be useful.

Each test vector can determine a condition (True/False/DontCare) under the test vector. This allows branch coverage under MC/DC expressions to be omitted. The backend must reconstruct the branch coverage from the test vectors… If users want to avoid it, we need to provide an option not to omit branch coverage on the MC/DC expression.

I thought about this as well when implementing MC/DC – we can use the bitmaps to generate branch coverage in a fairly straightforward manner, though with the granularity you suggest above, that will involve some addition instrumentation to ‘write back’ the condition evaluation. Without the overhead of counters, I would think that trade-off is acceptable.

chapuni · June 20, 2024, 8:29am

Thanks @evodius96 for the comment.

If it includes region coverage as well, you could use this for function and line coverage too (basically all coverage criteria), correct?

Definitely yes. I suppose that function coverage and line coverage can be induced by region coverage.

By “region” are you referring to coverage regions?

I suppose so. I am not sure my design would fit.

… that will involve some addition instrumentation to ‘write back’ the condition evaluation.

I suppose writing back a few bits to the bitmap (as branch coverage) would be cheaper than increasing the counter atomically.

Topic		Replies	Views
[RFC][Coverage] New algorithm and file format for MC/DC LLVM Project	6	819	June 13, 2024
Adding support for LLVM Branch Condition Coverage LLVM Dev List Archives	7	200	January 24, 2020
[RFC] Integrating SingleByteCoverage with Branch Coverage LLVM Project	6	280	November 21, 2024
[RFC] Single Byte Counters for Source-based Code Coverage CompilerRT	14	988	June 14, 2024
How to get branch coverage by using 'source-based code coverage' LLVM Dev List Archives	6	257	May 6, 2020