RFC: Source-based MC/DC Code Coverage

Last January, I upstreamed support for Source-based Branch Coverage (Revision https://reviews.llvm.org/D84467). I also spoke about this work at the 2020 LLVM Developers’ Meeting. It was a lot of fun to work on!

As I mentioned then, I have had an interest in extending this support to enable Modified Condition/Decision Coverage (MC/DC), and I’d like to start working on that very soon. Based on what I learned from others, there is a lot of interest in supporting MC/DC with LLVM. MC/DC is a requirement for functional safety critical development, in particular, in the embedded space. More about MC/DC: https://en.wikipedia.org/wiki/Modified_condition/decision_coverage

Background

The key thing added by MC/DC (from the DO-178C standard) is to show that, “Each condition in a decision is shown to independently affect the outcome of a decision.” A condition is shown to affect a decision’s outcome independently “by varying just that condition while holding fixed all other possible conditions”. An addendum to the DO-178C standard definition clarifies, “or varying just that condition while holding fixed all other possible conditions that could affect the outcome”, which allows us to ignore unevaluatable conditions in languages that have short-circuit behavior on logical operations && and || (like C/C++). This is what I plan to implement.

For a Boolean logical expression consisting of one or more logical operators, the individual condition True/False path of execution constitutes a ‘test vector’ yielding a decision outcome. For example, for “a && b”, there are three possible test vectors (‘-‘ signifies an unevaluatable ‘don’t-care’ condition due to short-circuit semantics).

a b Result

1: F, -, F {F,-,F}

2: T, F, F {T,F,F}

3: T, T, T {T,T,T}

MC/DC effectively requires N+1 test vectors to verify (where N is the number of conditions). In the example above, all of the test vectors must be executed in order to show that both conditions, a and b, independently affect the outcome. More precisely, two test vectors are required for each condition to show MC/DC, forming an “independence pair” for each condition (in the above example, ‘a’ can be shown via test vectors 1 and 3, and ‘b’ can be shown via test vectors 2 and 3). The goal for LLVM is to enable llvm-cov to determine the executed test vectors and analyze them for each condition/decision outcome to prove that each condition independently affects the outcome. The overall metric is simply the number of executed “independence pairs” vs. total expected for each condition.

I’m just sketching out the design. Similar to Branch Coverage, there are three primary areas that I am proposing to extend:

FYI for the llvm-dev list, I have decided to change my originally proposed design after some email exchanges with Vedant Kumar that will prevent the exponential counter explosion I identified in #2.b. of the original proposal. Instead, I’d like to implement a log-based solution that tracks executed test vectors using an instrumented, variable-length bitmap, one per Boolean expression. The log-based approach scales much better, seems like a cleaner design, and is closer to what other vendors do to implement MC/DC.

Here is an outline of the new design proposal overall:

To revisit this topic of enabling MC/DC for LLVM Source-based Code Coverage, I’ve just now posted a review to upstream this work:

https://reviews.llvm.org/D136385

I’ve added @vedantk and @petrhosek as reviewers based on interest, but as everyone is very busy, it would be useful to have some more folks involved.

  • @mwe You asked about supporting the upstream effort – are you interested in helping with the review?

  • @ellishg you recently made changes to InstrProf/InstrProfiling infrastructure for Debug Correlation – given that I’ve refactored parts of those files to abstract some of the functionality, would you be interested in reviewing that? I’d also be interested if there could be a future patch enabling MC/DC support with the correlator (if that makes sense).

  • @smithp35 I’ve subscribed you as you said you could make general comments but couldn’t approve.

This will also be the subject of my talk at the LLVM meeting next month.

Anyone else interested, please let me know.

I’m happy to review the Debug Info Correlation bits!
Right now Lightweight Instrumentation doesn’t support clang instrumentation or llvm-cov. I don’t know how much work needs to be done to support this, but it would be a really great addition.

As for ⚙ D136385 Add support for MC/DC in LLVM Source-Based Code Coverage it might be easier to review if it was broken up into chunks. Is it reasonable to turn it into a stack of diffs?

I look forward to hearing your talk!

Thank you!

Yes, I will see if I can do this. I explored it a bit earlier, but I wasn’t sure there was an effective way to stack them given the module interdependencies introduced by the format changes, but perhaps I wasn’t thinking granular enough.

I attended your Dev Meeting talk this week, and it sounds like an improvement in validating the thoroughness of testing.

I remember you took a question about applicability in other areas besides simply showing better code coverage. It occurred to me that it might be handy if, when someone posts a patch, it would be easy to show that the test exercises all the paths through a condition. I guess that could be done without any additional tooling, although it would be very manual (look at the coverage report for each compound condition in the modified code). Or maybe that kind of tooling would be too heavyweight for the benefit, I don’t know.

Hi!

It occurred to me that it might be handy if, when someone posts a patch, it would be easy to show that the test exercises all the paths through a condition. I guess that could be done without any additional tooling, although it would be very manual (look at the coverage report for each compound condition in the modified code). Or maybe that kind of tooling would be too heavyweight for the benefit, I don’t know.

I like the ideas about how coverage can be better leveraged as part of the patch process.

Is the idea that this would run as part of the buildbot testing process for LLVM? Or is the question related to more generic testing of downstream patches? I typically do a 2-stage build with the second stage binaries also instrumented so that I can track coverage, and it would be great of there was a buildbot process to do this (or maybe there already is?). However, if that is the case, my suspicion is that MC/DC may be too heavy-handed for a lot of cases (particularly since we limit the number of conditions in a boolean expression to 6, although this could easily be made configurable).

Whether it is run as part of the buildbots or part of a downstream process, the results would need to be checked, and this could probably be easily done via the JSON exporting from llvm-cov. Through that, it’s pretty easy to see the level of coverage (whether MC/DC or branch coverage).

-Alan

Hi,

The source based coverage is quite interesting. I have done some work in this area recently and I have written a patch for gcc that implements MC/DC support in gcc by doing control flow graph analysis. A major benefit is that targeting the CFG makes coverage available to more frontends, but there are other (more controversial, I suppose) benefits, too, like coverage of implicit bounds checks or maybe stricter checks through unfolding of booleans.

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605699.html

I’m not too familiar with the LLVM internals, but I assume the algorithm should work fine on the LLVM CFG too. It would be interesting to compare notes, or at least test suites.

Hi jkv! Very interesting, thanks for sharing that.

LLVM has the benefit of allowing us to tie directly to the abstract syntax tree representation and instrument during LLVM IR lowering, meaning that optimizations won’t impact the accuracy of the coverage.

I haven’t dug deeply into your GCC patch. I also previously read SQLite’s description of MC/DC and had concerns about what they were doing with branch coverage to achieve that. It sounds like you are doing something different. They claim that MC/DC and branch coverage are “are very nearly the same thing”, but there are cases in which it is possible to have 100% branch condition coverage and yet not achieve MC/DC. I added branch condition coverage to LLVM in 2021 (Source-based Code Coverage — Clang 16.0.0git documentation), but I opted not to follow SQLite when implementing MC/DC.

For background (for the record), measuring a condition’s ability to independently impact a decision’s outcome requires the following to be shown:

This means that the MC/DC analysis really needs to track executed test vectors and analyze them to find pairs of test vectors in order to ascertain whether the MC/DC constraints are satisfied for each condition. This is the approach supported by other vendors, including LDRA and VectorCAST, and is what I have implemented for LLVM. The source-view visualization looks like this:

    1|       |#include "mcdc-demo.h"
    2|       |
    3|      3|bool test(bool A, bool B, bool C) {
    4|      3|  return ((A && B) || C);
  ------------------
  |  Branch (4:12): [True: 2, False: 1]
  |  Branch (4:17): [True: 1, False: 1]
  |  Branch (4:23): [True: 0, False: 2]
  ------------------
  |---> MC/DC Decision Region (4:11) to (4:24)
  |
  |  Number of Conditions: 3
  |     Condition C1 --> (4:12)
  |     Condition C2 --> (4:17)
  |     Condition C3 --> (4:23)
  |
  |  Executed MC/DC Test Vectors:
  |
  |     C1, C2, C3    Result
  |  1 { F,  -,  F  = F      }
  |  2 { T,  F,  F  = F      }
  |  3 { T,  T,  -  = T      }
  |
  |  C1-Pair: covered: (1,3)
  |  C2-Pair: covered: (2,3)
  |  C3-Pair: not covered
  |  MC/DC Coverage for Decision: 66.67%
  |
  ------------------

-Alan

For short-circuiting languages where complex expressions are broken down (at CFG level) into a branch-per-condition, branch-coverage and MC/DC are very close indeed, but branch coverage does not handle masked conditions, i.e. branches that were exercised but didn’t actually matter for the outcome. My algorithm, however, handles this well, as you can see in the test suite.

Indeed. In the CFG approach I determine what set of conditions (possibly empty) are masked whenever a condition is exercised. Granted, the patch does not have very good source visualisation yet, that’s a problem for the future.