RFC: Source-based MC/DC Code Coverage

Last January, I upstreamed support for Source-based Branch Coverage (Revision https://reviews.llvm.org/D84467). I also spoke about this work at the 2020 LLVM Developers’ Meeting. It was a lot of fun to work on!

As I mentioned then, I have had an interest in extending this support to enable Modified Condition/Decision Coverage (MC/DC), and I’d like to start working on that very soon. Based on what I learned from others, there is a lot of interest in supporting MC/DC with LLVM. MC/DC is a requirement for functional safety critical development, in particular, in the embedded space. More about MC/DC: https://en.wikipedia.org/wiki/Modified_condition/decision_coverage

Background

The key thing added by MC/DC (from the DO-178C standard) is to show that, “Each condition in a decision is shown to independently affect the outcome of a decision.” A condition is shown to affect a decision’s outcome independently “by varying just that condition while holding fixed all other possible conditions”. An addendum to the DO-178C standard definition clarifies, “or varying just that condition while holding fixed all other possible conditions that could affect the outcome”, which allows us to ignore unevaluatable conditions in languages that have short-circuit behavior on logical operations && and || (like C/C++). This is what I plan to implement.

For a Boolean logical expression consisting of one or more logical operators, the individual condition True/False path of execution constitutes a ‘test vector’ yielding a decision outcome. For example, for “a && b”, there are three possible test vectors (‘-‘ signifies an unevaluatable ‘don’t-care’ condition due to short-circuit semantics).

a b Result

1: F, -, F {F,-,F}

2: T, F, F {T,F,F}

3: T, T, T {T,T,T}

MC/DC effectively requires N+1 test vectors to verify (where N is the number of conditions). In the example above, all of the test vectors must be executed in order to show that both conditions, a and b, independently affect the outcome. More precisely, two test vectors are required for each condition to show MC/DC, forming an “independence pair” for each condition (in the above example, ‘a’ can be shown via test vectors 1 and 3, and ‘b’ can be shown via test vectors 2 and 3). The goal for LLVM is to enable llvm-cov to determine the executed test vectors and analyze them for each condition/decision outcome to prove that each condition independently affects the outcome. The overall metric is simply the number of executed “independence pairs” vs. total expected for each condition.

I’m just sketching out the design. Similar to Branch Coverage, there are three primary areas that I am proposing to extend:

FYI for the llvm-dev list, I have decided to change my originally proposed design after some email exchanges with Vedant Kumar that will prevent the exponential counter explosion I identified in #2.b. of the original proposal. Instead, I’d like to implement a log-based solution that tracks executed test vectors using an instrumented, variable-length bitmap, one per Boolean expression. The log-based approach scales much better, seems like a cleaner design, and is closer to what other vendors do to implement MC/DC.

Here is an outline of the new design proposal overall:

To revisit this topic of enabling MC/DC for LLVM Source-based Code Coverage, I’ve just now posted a review to upstream this work:

https://reviews.llvm.org/D136385

I’ve added @vedantk and @petrhosek as reviewers based on interest, but as everyone is very busy, it would be useful to have some more folks involved.

  • @mwe You asked about supporting the upstream effort – are you interested in helping with the review?

  • @ellishg you recently made changes to InstrProf/InstrProfiling infrastructure for Debug Correlation – given that I’ve refactored parts of those files to abstract some of the functionality, would you be interested in reviewing that? I’d also be interested if there could be a future patch enabling MC/DC support with the correlator (if that makes sense).

  • @smithp35 I’ve subscribed you as you said you could make general comments but couldn’t approve.

This will also be the subject of my talk at the LLVM meeting next month.

Anyone else interested, please let me know.

I’m happy to review the Debug Info Correlation bits!
Right now Lightweight Instrumentation doesn’t support clang instrumentation or llvm-cov. I don’t know how much work needs to be done to support this, but it would be a really great addition.

As for ⚙ D136385 Add support for MC/DC in LLVM Source-Based Code Coverage it might be easier to review if it was broken up into chunks. Is it reasonable to turn it into a stack of diffs?

I look forward to hearing your talk!

Thank you!

Yes, I will see if I can do this. I explored it a bit earlier, but I wasn’t sure there was an effective way to stack them given the module interdependencies introduced by the format changes, but perhaps I wasn’t thinking granular enough.

I attended your Dev Meeting talk this week, and it sounds like an improvement in validating the thoroughness of testing.

I remember you took a question about applicability in other areas besides simply showing better code coverage. It occurred to me that it might be handy if, when someone posts a patch, it would be easy to show that the test exercises all the paths through a condition. I guess that could be done without any additional tooling, although it would be very manual (look at the coverage report for each compound condition in the modified code). Or maybe that kind of tooling would be too heavyweight for the benefit, I don’t know.

Hi!

It occurred to me that it might be handy if, when someone posts a patch, it would be easy to show that the test exercises all the paths through a condition. I guess that could be done without any additional tooling, although it would be very manual (look at the coverage report for each compound condition in the modified code). Or maybe that kind of tooling would be too heavyweight for the benefit, I don’t know.

I like the ideas about how coverage can be better leveraged as part of the patch process.

Is the idea that this would run as part of the buildbot testing process for LLVM? Or is the question related to more generic testing of downstream patches? I typically do a 2-stage build with the second stage binaries also instrumented so that I can track coverage, and it would be great of there was a buildbot process to do this (or maybe there already is?). However, if that is the case, my suspicion is that MC/DC may be too heavy-handed for a lot of cases (particularly since we limit the number of conditions in a boolean expression to 6, although this could easily be made configurable).

Whether it is run as part of the buildbots or part of a downstream process, the results would need to be checked, and this could probably be easily done via the JSON exporting from llvm-cov. Through that, it’s pretty easy to see the level of coverage (whether MC/DC or branch coverage).

-Alan

Hi,

The source based coverage is quite interesting. I have done some work in this area recently and I have written a patch for gcc that implements MC/DC support in gcc by doing control flow graph analysis. A major benefit is that targeting the CFG makes coverage available to more frontends, but there are other (more controversial, I suppose) benefits, too, like coverage of implicit bounds checks or maybe stricter checks through unfolding of booleans.

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605699.html

I’m not too familiar with the LLVM internals, but I assume the algorithm should work fine on the LLVM CFG too. It would be interesting to compare notes, or at least test suites.

Hi jkv! Very interesting, thanks for sharing that.

LLVM has the benefit of allowing us to tie directly to the abstract syntax tree representation and instrument during LLVM IR lowering, meaning that optimizations won’t impact the accuracy of the coverage.

I haven’t dug deeply into your GCC patch. I also previously read SQLite’s description of MC/DC and had concerns about what they were doing with branch coverage to achieve that. It sounds like you are doing something different. They claim that MC/DC and branch coverage are “are very nearly the same thing”, but there are cases in which it is possible to have 100% branch condition coverage and yet not achieve MC/DC. I added branch condition coverage to LLVM in 2021 (Source-based Code Coverage — Clang 18.0.0git documentation), but I opted not to follow SQLite when implementing MC/DC.

For background (for the record), measuring a condition’s ability to independently impact a decision’s outcome requires the following to be shown:

This means that the MC/DC analysis really needs to track executed test vectors and analyze them to find pairs of test vectors in order to ascertain whether the MC/DC constraints are satisfied for each condition. This is the approach supported by other vendors, including LDRA and VectorCAST, and is what I have implemented for LLVM. The source-view visualization looks like this:

    1|       |#include "mcdc-demo.h"
    2|       |
    3|      3|bool test(bool A, bool B, bool C) {
    4|      3|  return ((A && B) || C);
  ------------------
  |  Branch (4:12): [True: 2, False: 1]
  |  Branch (4:17): [True: 1, False: 1]
  |  Branch (4:23): [True: 0, False: 2]
  ------------------
  |---> MC/DC Decision Region (4:11) to (4:24)
  |
  |  Number of Conditions: 3
  |     Condition C1 --> (4:12)
  |     Condition C2 --> (4:17)
  |     Condition C3 --> (4:23)
  |
  |  Executed MC/DC Test Vectors:
  |
  |     C1, C2, C3    Result
  |  1 { F,  -,  F  = F      }
  |  2 { T,  F,  F  = F      }
  |  3 { T,  T,  -  = T      }
  |
  |  C1-Pair: covered: (1,3)
  |  C2-Pair: covered: (2,3)
  |  C3-Pair: not covered
  |  MC/DC Coverage for Decision: 66.67%
  |
  ------------------

-Alan

For short-circuiting languages where complex expressions are broken down (at CFG level) into a branch-per-condition, branch-coverage and MC/DC are very close indeed, but branch coverage does not handle masked conditions, i.e. branches that were exercised but didn’t actually matter for the outcome. My algorithm, however, handles this well, as you can see in the test suite.

Indeed. In the CFG approach I determine what set of conditions (possibly empty) are masked whenever a condition is exercised. Granted, the patch does not have very good source visualisation yet, that’s a problem for the future.

hi @jkv

I also have a recent need to use MC/DC functionality in GCC. Could you please let me know in which version of GCC this patch has been included? I have been searching for relevant information, but I haven’t found any mention of it in the release notes of any version.

gcov’s MC/DC features have been merged in April in 08a5233. The original patches in the mailing list can be found here 1/2 and 2/2 (v10).

Hi @evodius96. First, thanks for all your great work!

I have a few follow-up questions related to MC/DC. It would be much appreciated if you could share some of your advice.

Unique-cause MC/DC vs. masking MC/DC

To my understanding, the current LLVM implementation (and all what is quoted below) adopts the more rigorous definition – unique-cause MC/DC.

On the other hand, there’s another variant called masking MC/DC, which is adopted by @jkv’s gcov implementation. FAA for example allows this variant in aviation software certification.

Essentially, in your example of

bool test(bool A, bool B, bool C) {
    return ((A && B) || C);
}

to consider C being covered, masking MC/DC only required the whole (A && B) to be false.

More accurate definition for each variant and FAA recommendations can be found here.

My questions are as follows:

  • Q: Can you please confirm LLVM is currently following the unique-cause criteria?
  • Q: Are you aware of plans or ongoing effort of implementing masking MC/DC in LLVM?

Decision coverage

  • Q: Are you aware of plans or ongoing effort of implementing decision coverage in LLVM?

Decision coverage is required by some industrial standards in lower assurance levels than MC/DC. However, current “branch coverage” (both in gcov and LLVM) indeed tells program execution in terms of conditions.

To support decision coverage in LLVM, I can think of two approaches: one building on top of MC/DC, and the other building on top of branch/condition coverage.

For the first one, the advantage is that your code in LLVM already keeps track of the “Result” column of test vectors. But right now it has a limit on the number of conditions (2~6). (Fwiw, in my use case, 1-condition decisions are somewhat more important than the >6-condition decisions)

For the second one, we don’t have that limit, but each condition is tracked independently, and we have to evaluate the results of decisions afterwards.

Do you have suggestions?

Thanks,
Wentao

Btw, with @evodius96’s fantastic work, my team has been able to measure MC/DC of Linux kernel. If anyone feels interested, please check GitHub - xlab-uiuc/linux-mcdc: Measure Linux kernel's modified condition/decision coverage (MC/DC) and share your thoughts!

1 Like

Hi @whentojump!

  • Q: Can you please confirm LLVM is currently following the unique-cause criteria?

I believe this is correct, as far as I understand; what I implemented in clang follows what other vendors have supported, especially LDRA. There aren’t any improvements to mask out conditions using boolean logic evaluation of whole subexpressions. However, in the design doc, I inaccurately described it as masking, but this only applies to the fact that due to the short-circuit behavior of logical operators, unevaluatable conditions are masked-out and counted as dont-cares.

  • Q: Are you aware of plans or ongoing effort of implementing masking MC/DC in LLVM?

I’m not aware of any. I don’t think it would be too difficult to do, but I haven’t thought about it in depth.

  • Q: Are you aware of plans or ongoing effort of implementing decision coverage in LLVM?

I’m not aware of anything – due to the short-circuited nature of C/C++, 100% branch coverage will also give you 100% decision coverage, and many standards (certainly in embedded space) already require branch coverage. I can see how having a separate metric for decisions is desirable, requiring fewer test cases to achieve. I think this could be easily extended from branch coverage with a branch region representing the decision. I don’t think you need the implementation complexity MC/DC requires in order to track decision coverage.

2 Likes