Getting basic block information using -fsanitize-coverage=inline-bool-flag

Hello!

I have a project to which I am adding code coverage. Unfortunately, the default --coverage option is not a good fit, as it outputs a lot of information per run (and slows down execution of the binary due to atomic operations), so I decided to switch to SanitizerCoverage.

This sanitizer has an inline-bool-flag option, which builds a bool array of visited entities (I’m currently using basic block coverage).
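For reference, this is roughly how I capture the array at the moment. It is only a minimal sketch: __sanitizer_cov_bool_flag_init is the callback described in the SanitizerCoverage documentation, while FlagRange, flag_ranges and visited_indices are hypothetical helper names of my own, and I am assuming the callback may be invoked more than once with the same range.

```cpp
#include <cstddef>
#include <vector>

// One flag range as reported by an instrumented module (hypothetical helper type).
struct FlagRange {
  const bool *begin;
  const bool *end;
};

// Function-local static so the registry is usable even if the callback
// runs before other global constructors (hypothetical helper).
static std::vector<FlagRange> &flag_ranges() {
  static std::vector<FlagRange> ranges;
  return ranges;
}

// Called by code built with -fsanitize-coverage=inline-bool-flag.
// Assumption: it may be called repeatedly with the same range, so duplicates are skipped.
extern "C" void __sanitizer_cov_bool_flag_init(bool *start, bool *end) {
  if (start == end) return;
  for (const FlagRange &r : flag_ranges())
    if (r.begin == start) return;  // this range was already registered
  flag_ranges().push_back({start, end});
}

// Returns the indices (relative to the start of the range) of all flags that are set.
std::vector<std::size_t> visited_indices(const FlagRange &r) {
  std::vector<std::size_t> out;
  for (const bool *p = r.begin; p != r.end; ++p)
    if (*p) out.push_back(static_cast<std::size_t>(p - r.begin));
  return out;
}
```

At exit I can call visited_indices on each registered range, but all I get back is a list of bare indices, which is where the problem below starts.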

However, I did not find a good way to use this information.

If I process the .gcno files generated with -ftest-coverage, I can’t get the index of a basic block within the binary section that is passed to the sanitizer callback. Thus I’m unable to determine which basic block (or which function/translation unit) a given cell corresponds to.

Assuming the basic blocks in the .gcno files are present in the same order as they are in the function, I would only need the counters section offset per function, but this information is also missing.

If I read the code correctly, there is no information about the offsets of individual functions within that section.

I came up with the following algorithm:

  1. Use the pc-table in addition to inline-bool-flag.
  2. For each PC address, use an in-process symbolizer to determine the function and the translation unit it belongs to.
  3. Form an array of the basic blocks belonging to each function.
  4. Merge this information with .gcno files (again, assuming the order is preserved).
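Steps 1 and 3 could look roughly like the sketch below. It relies on several assumptions I have not verified beyond the documentation: that the i-th pc-table entry corresponds to the i-th bool flag of the same module, that bit 0 of the flags word marks a function entry block, and that all blocks of a function are contiguous in the table. PCTableEntry, FunctionBlocks and group_by_function are hypothetical names. If these assumptions hold, only the entry PC of each function would need to be symbolized, rather than every PC.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout of one -fsanitize-coverage=pc-table entry.
struct PCTableEntry {
  std::uintptr_t pc;
  std::uintptr_t flags;  // assumption: bit 0 set means "function entry block"
};

// Hypothetical result: flags [first_index, first_index + block_count)
// are taken to belong to the function whose entry block is at entry_pc.
struct FunctionBlocks {
  std::uintptr_t entry_pc;
  std::size_t first_index;
  std::size_t block_count;
};

// Splits the table into per-function runs, starting a new run at every entry block.
std::vector<FunctionBlocks> group_by_function(const PCTableEntry *begin,
                                              const PCTableEntry *end) {
  std::vector<FunctionBlocks> out;
  for (const PCTableEntry *e = begin; e != end; ++e) {
    const std::size_t index = static_cast<std::size_t>(e - begin);
    const bool is_entry = (e->flags & 1) != 0;
    if (is_entry || out.empty()) {
      out.push_back({e->pc, index, 1});  // start a new function run
    } else {
      ++out.back().block_count;          // extend the current function run
    }
  }
  return out;
}
```

The block ordinal within a function would then be the flag index minus first_index, which is what step 4 needs when matching against the .gcno data.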

I believe this algorithm is highly suboptimal and does a lot of work at runtime, so I’m asking for advice: is there an easier way to get source coverage information from the inline-bool-flag array?

The only other idea I have come up with is to write another transform pass based on the current SanitizerCoverage pass.
In this pass, the compiler would emit a tuple (translation unit, function name, line set, index of the basic block in the resulting inline-bool-flag section) for each basic block into some file.
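To make the tuple concrete, here is a sketch of the record such a pass might write, one line per basic block. Nothing here is an existing LLVM format: BlockRecord, its field names, to_tsv and the tab-separated layout are all hypothetical. The flag index is taken as given, although I am not sure a per-translation-unit pass can know the final index before linking.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record, one per instrumented basic block.
struct BlockRecord {
  std::string translation_unit;
  std::string function_name;
  std::vector<unsigned> lines;  // source lines covered by the block
  std::size_t flag_index;       // index into the inline-bool-flag array
};

// Serializes a record as tab-separated fields; the line set is comma-joined.
std::string to_tsv(const BlockRecord &r) {
  std::ostringstream os;
  os << r.translation_unit << '\t' << r.function_name << '\t';
  for (std::size_t i = 0; i < r.lines.size(); ++i) {
    if (i != 0) os << ',';
    os << r.lines[i];
  }
  os << '\t' << r.flag_index;
  return os.str();
}
```

With such a file, the runtime would only have to dump the raw flag array, and the mapping to source lines could be done offline.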