[llvm-cov] Hash mismatches originating from class methods implemented in header files

Aleksa_Markovic · June 26, 2024, 12:36pm

Hello everyone,

I’m migrating a large C++ project from gcov-compatible coverage to source-based code coverage with clang 13.0.1, and we’ve hit a problem: llvm-cov reports hash mismatches for methods implemented in header files, which leads to incomplete coverage reports. The counter groups for these methods are skipped entirely across all instantiations. Related issues: #72786 and #32849.
I’ve compiled a minimal example and my understanding of it, and I would like to ask for advice on how to mitigate this.

We’re running code coverage builds to get coverage reports for libraries dynamically linked to test executables. The test executables themselves do not get instrumented, because we are not interested in numbers for the tests and the instrumentation slows down their build. The libraries are instrumented and only they are used when invoking llvm-cov.

Consider a mock library that consists of a class with most methods implemented inside a header file:
test_class.h

#ifndef __TEST_CLASS_H_
#define __TEST_CLASS_H_

class TestClass {
    public:
        TestClass(bool x) { if (x) a = 1; else a = 0; }

        int TestMethod(int x) { if (a == 1)  return x;  else  return 0; }

        int NotInHeader(int a, int b, int c);

        static int TestStaticMethod(int x) { return x + 2; }

    private:
        int a;
};

#endif

test_lib.cc

#include "test_class.h"

int TestClass::NotInHeader(int a, int b, int c) {
    this->a = a + b + c;
    return c;
}

We build this library with instrumentation:

$ clang -fPIC -shared -fprofile-instr-generate -fcoverage-mapping -o libtest_lib.so test_lib.cc

The only symbol from the class is the NotInHeader method:

$ readelf -s libtest_lib.so  | grep TestClass
    55: 00000000000015a0    55 FUNC    GLOBAL DEFAULT   10 _ZN9TestClass11NotInHeade
   241: 00000000000015a0    55 FUNC    GLOBAL DEFAULT   10 _ZN9TestClass11NotInHeade

Consider another library which instantiates some of the methods defined in the header:
another_lib.cc

#include "test_class.h"

void utility(void) {
    TestClass t(1);
    t.TestMethod(0);
    TestClass::TestStaticMethod(1);
}

It also gets built instrumented:

$ clang -fPIC -shared -fprofile-instr-generate -fcoverage-mapping -o libanother_lib.so  another_lib.cc
$ readelf -sW libanother_lib.so | grep TestClass
    54: 0000000000001750    97 FUNC    WEAK   DEFAULT   10 _ZN9TestClassC2Eb
    55: 0000000000001820    33 FUNC    WEAK   DEFAULT   10 _ZN9TestClass16TestStaticMethodEi
    60: 00000000000017c0    83 FUNC    WEAK   DEFAULT   10 _ZN9TestClass10TestMethodEi
   109: 0000000000209248    16 OBJECT  LOCAL  DEFAULT   24 __profc__ZN9TestClass10TestMethodEi
   125: 0000000000209258     8 OBJECT  LOCAL  DEFAULT   24 __profc__ZN9TestClass16TestStaticMethodEi
   142: 0000000000209238    16 OBJECT  LOCAL  DEFAULT   24 __profc__ZN9TestClassC2Eb
   215: 0000000000001750    97 FUNC    WEAK   DEFAULT   10 _ZN9TestClassC2Eb
   217: 0000000000001820    33 FUNC    WEAK   DEFAULT   10 _ZN9TestClass16TestStaticMethodEi
   239: 00000000000017c0    83 FUNC    WEAK   DEFAULT   10 _ZN9TestClass10TestMethodEi

Here is our mock test executable that links both of these libraries and instantiates header-defined methods but doesn’t get instrumented:
executable.cc

#include "test_class.h"
#include <stdio.h>

extern void utility(void);

int main() {
    TestClass t(1);
    printf("%d\n",t.TestMethod(1));
    t.NotInHeader(1,2,3);
    utility();
}

$ clang -L./ -ltest_lib -lanother_lib -o exe executable_1.cc

Running it we get a profraw file with all counters of interest:

$ LD_LIBRARY_PATH=./ ./exe
$ llvm-profdata show default.profraw --all-functions --counts
Counters:
  _ZN9TestClass11NotInHeaderEiii:
    Hash: 0x0000000000000018
    Counters: 1
    Function count: 1
    Block counts: []
  _Z7utilityv:
    Hash: 0x0000000000000000
    Counters: 1
    Function count: 1
    Block counts: []
  _ZN9TestClassC2Eb:
    Hash: 0x00000000002924d1
    Counters: 2
    Function count: 0
    Block counts: [0]
  _ZN9TestClass10TestMethodEi:
    Hash: 0x000000a7d2613611
    Counters: 2
    Function count: 0
    Block counts: [0]
  _ZN9TestClass16TestStaticMethodEi:
    Hash: 0x0000000000000018
    Counters: 1
    Function count: 1
    Block counts: []
Instrumentation level: Front-end
Functions shown: 5
Total functions: 5
Maximum function count: 1
Maximum internal block count: 0
$ llvm-profdata merge -o test.profdata default.profraw

Oddly, there are no hits on the constructor _ZN9TestClassC2Eb or on _ZN9TestClass10TestMethodEi.
Here is where the problem appears: we want to export this data from the two libraries:

$ llvm-cov export -format lcov -instr-profile test.profdata libtest_lib.so  libanother_lib.so 
warning: 3 functions have mismatched data
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/test_lib.cc
FN:3,_ZN9TestClass11NotInHeaderEiii
FNDA:1,_ZN9TestClass11NotInHeaderEiii
FNF:1
FNH:1
DA:3,1
DA:4,1
DA:5,1
DA:6,1
BRF:0
BRH:0
LF:4
LH:4
end_of_record

There is a mismatch for the header-defined methods, and their counters are not exported. When this is scaled to dozens of libraries and instantiations, thousands of symbols end up mismatched, and whether they appear in the report is not deterministic.
Now we swap the libraries, and no mismatches are reported:

$ llvm-cov export -format lcov -instr-profile test.profdata libanother_lib.so libtest_lib.so 
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/another_lib.cc
FN:3,_Z7utilityv
FNDA:1,_Z7utilityv
FNF:1
FNH:1
DA:3,1
DA:4,1
DA:5,1
DA:6,1
DA:7,1
BRF:0
BRH:0
LF:5
LH:5
end_of_record
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/test_class.h
FN:6,_ZN9TestClassC2Eb
FN:8,_ZN9TestClass10TestMethodEi
FN:12,_ZN9TestClass16TestStaticMethodEi
FNDA:0,_ZN9TestClassC2Eb
FNDA:0,_ZN9TestClass10TestMethodEi
FNDA:1,_ZN9TestClass16TestStaticMethodEi
FNF:3
FNH:1
DA:6,0
DA:8,0
DA:12,1
BRDA:6,0,0,-
BRDA:6,0,1,-
BRDA:8,0,0,-
BRDA:8,0,1,-
BRF:4
BRH:0
LF:3
LH:1
end_of_record

The behavior is different when the libraries are specified with --object:

$ llvm-cov export -format lcov -instr-profile test.profdata --object libanother_lib.so  --object libtest_lib.so 
warning: 3 functions have mismatched data
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/another_lib.cc
FN:3,_Z7utilityv
FNDA:1,_Z7utilityv
FNF:1
FNH:1
DA:3,1
DA:4,1
DA:5,1
DA:6,1
DA:7,1
BRF:0
BRH:0
LF:5
LH:5
end_of_record
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/test_class.h
FN:6,_ZN9TestClassC2Eb
FN:8,_ZN9TestClass10TestMethodEi
FN:12,_ZN9TestClass16TestStaticMethodEi
FNDA:0,_ZN9TestClassC2Eb
FNDA:0,_ZN9TestClass10TestMethodEi
FNDA:1,_ZN9TestClass16TestStaticMethodEi
FNF:3
FNH:1
DA:6,0
DA:8,0
DA:12,1
BRDA:6,0,0,-
BRDA:6,0,1,-
BRDA:8,0,0,-
BRDA:8,0,1,-
BRF:4
BRH:0
LF:3
LH:1
end_of_record
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/test_lib.cc
FN:3,_ZN9TestClass11NotInHeaderEiii
FNDA:1,_ZN9TestClass11NotInHeaderEiii
FNF:1
FNH:1
DA:3,1
DA:4,1
DA:5,1
DA:6,1
BRF:0
BRH:0
LF:4
LH:4
end_of_record

Mismatches are reported, but the counters seem to be output correctly. However, in our real use case with dozens of libraries, the number of lines output still varies.
In these situations, what is the best way to invoke llvm-cov to get the most complete and consistent results?

Thank you.

Aleksa_Markovic · June 27, 2024, 11:04am

I think these hash mismatches arise because profile counters and coverage mappings are generated for methods that are defined inside the class but not used in the library itself. In my previous example, libtest_lib.so is built with profile counters and mappings for all the methods in the class; the ones that are not instantiated have a hash of 0:

$ clang -fPIC -shared -fprofile-instr-generate -fcoverage-mapping -mllvm -enable-name-compression=false -o libtest_lib.so test_lib.cc
$ objdump -s -j __llvm_prf_names ./libtest_lib.so 

./libtest_lib.so:     file format elf64-x86-64

Contents of section __llvm_prf_names:
 6d70 6e005f5a 4e395465 7374436c 61737331  n._ZN9TestClass1
 6d80 314e6f74 496e4865 61646572 45696969  1NotInHeaderEiii
 6d90 015f5a4e 39546573 74436c61 73734332  ._ZN9TestClassC2
 6da0 4562015f 5a4e3954 65737443 6c617373  Eb._ZN9TestClass
 6db0 31305465 73744d65 74686f64 4569015f  10TestMethodEi._
 6dc0 5a4e3954 65737443 6c617373 31365465  ZN9TestClass16Te
 6dd0 73745374 61746963 4d657468 6f644569  stStaticMethodEi
$ objdump -s -j __llvm_prf_data ./libtest_lib.so 

./libtest_lib.so:     file format elf64-x86-64

Contents of section __llvm_prf_data:
 209218 cdc38f8a 0a23631a 18000000 00000000  .....#c.........
 209228 10922000 00000000 00000000 00000000  .. .............
 209238 00000000 00000000 01000000 00000000  ................
$ objdump -s -j __llvm_covmap ./libtest_lib.so 

./libtest_lib.so:     file format elf64-x86-64

Contents of section __llvm_covmap:
 0000 00000000 4f000000 00000000 05000000  ....O...........
 0010 034c0032 2f686f6d 652f616c 656b7361  .L.2/home/aleksa
 0020 2e6d6172 6b6f7669 632f7072 6f6a6563  .markovic/projec
 0030 74732f63 632f6c6c 766d5f72 65616465  ts/cc/llvm_reade
 0040 725f7465 73740b74 6573745f 6c69622e  r_test.test_lib.
 0050 63630c74 6573745f 636c6173 732e6800  cc.test_class.h.

I believe these phantom entries cause mismatches when exported together with actual instantiations.

Continuing further, we can see that the profraw file originating from libtest_lib.so contains only the instantiated method NotInHeader:

$ clang -fPIC -shared -fprofile-instr-generate -fcoverage-mapping -o libanother_lib.so  another_lib.cc
$ clang -L./ -ltest_lib -lanother_lib -o exe executable_1.cc
$ LD_LIBRARY_PATH=./ LLVM_PROFILE_FILE=%1m.profraw ./exe
$ llvm-profdata show 1901525142239953876_0.profraw  --all-functions --counts
Counters:
  _ZN9TestClass11NotInHeaderEiii:
    Hash: 0x0000000000000018
    Counters: 1
    Function count: 1
    Block counts: []
Instrumentation level: Front-end
Functions shown: 1
Total functions: 1
Maximum function count: 1
Maximum internal block count: 0

I think a possible mitigation is for clang to mark these entries in some way, similar to a ‘weak symbol’, so that llvm-cov knows which hash to treat as authoritative and does not skip them.

Aleksa_Markovic · June 28, 2024, 12:42pm

Adding __attribute__((used)) to each method forces clang to emit the code and resolves the mismatches. The command line argument -femit-all-decls achieves the same effect. However, this is not an ideal solution, because unused code is emitted.
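A minimal sketch of the attribute-based workaround, applied to the TestClass header from the first post. The COV_USED macro name is hypothetical (introduced here only for illustration); it expands to __attribute__((used)) on GCC-compatible compilers and to nothing elsewhere.

```cpp
// COV_USED is a hypothetical helper macro, not part of the original code.
// On GCC-compatible compilers it asks for the function to be emitted even
// when nothing in the translation unit references it.
#if defined(__GNUC__) || defined(__clang__)
#define COV_USED __attribute__((used))
#else
#define COV_USED
#endif

class TestClass {
    public:
        COV_USED TestClass(bool x) { if (x) a = 1; else a = 0; }

        COV_USED int TestMethod(int x) { if (a == 1)  return x;  else  return 0; }

        int NotInHeader(int a, int b, int c);

        COV_USED static int TestStaticMethod(int x) { return x + 2; }

    private:
        int a;
};

// Out-of-line definition, as in test_lib.cc.
int TestClass::NotInHeader(int a, int b, int c) {
    this->a = a + b + c;
    return c;
}
```

With this header, the three header-defined methods should be emitted into every translation unit that includes it, which is exactly the extra code size mentioned above.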

Aleksa_Markovic · July 3, 2024, 1:32pm

Hi folks,

I’ve confirmed that this issue still exists in master and in Ubuntu clang 18. I’m pretty sure the problem is that the hashes for the instrumented symbols are generated from LLVM IR, so when no code is emitted, the hash is 0. The same source code can be emitted elsewhere with a correct hash, and this is what causes the mismatches.
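To make the hypothesis concrete, here is a toy model of the matching step. It is not llvm-cov’s implementation; it only assumes that a coverage-mapping record and a profile record are paired by function name and rejected when their hashes differ. The profile hashes are the ones printed by llvm-profdata earlier in this thread; the zero hashes in the libtest_lib.so mapping are my assumption about what the non-emitted methods carry.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model only: names and structure here are illustrative assumptions.
struct MappingRecord {
    std::string name;    // mangled function name
    std::uint64_t hash;  // hash stored alongside the coverage mapping
};

// Return the names whose mapping hash disagrees with the hash recorded in
// the profile for the same function name.
std::vector<std::string> mismatchedFunctions(
        const std::vector<MappingRecord>& mapping,
        const std::map<std::string, std::uint64_t>& profile) {
    std::vector<std::string> result;
    for (const MappingRecord& rec : mapping) {
        auto it = profile.find(rec.name);
        if (it != profile.end() && it->second != rec.hash)
            result.push_back(rec.name);
    }
    return result;
}

// Hashes as shown by `llvm-profdata show` in the first post.
std::map<std::string, std::uint64_t> profileFromThread() {
    return {
        {"_ZN9TestClass11NotInHeaderEiii", 0x18},
        {"_Z7utilityv", 0x0},
        {"_ZN9TestClassC2Eb", 0x2924d1},
        {"_ZN9TestClass10TestMethodEi", 0xa7d2613611},
        {"_ZN9TestClass16TestStaticMethodEi", 0x18},
    };
}

// Assumed mapping records of libtest_lib.so: only NotInHeader was emitted,
// so the header-defined methods are modelled with hash 0.
std::vector<MappingRecord> testLibMapping() {
    return {
        {"_ZN9TestClass11NotInHeaderEiii", 0x18},
        {"_ZN9TestClassC2Eb", 0x0},
        {"_ZN9TestClass10TestMethodEi", 0x0},
        {"_ZN9TestClass16TestStaticMethodEi", 0x0},
    };
}
```

Under these assumptions, mismatchedFunctions(testLibMapping(), profileFromThread()) yields the three header-defined methods, which is consistent with the "3 functions have mismatched data" warning.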

The solution with -femit-all-decls is not viable for my use case, due to various headers not being able to cross compile all their symbols for various reasons.

I’ve submitted a PR as a workaround for this. I would like to discuss a proper solution for this, possibly introducing another hash that can be used for matching symbols between object files and profile files. A solution could be to hash the source code instead of IR.

Thanks to anyone interested in this issue.

chapuni · July 8, 2024, 10:50pm

There are some caveats.

W/o optimizations, methods in the header will be instantiated to both another_lib (instrumented) and exe (non-instrumented). LDD will resolve weak symbols as exe’s. So, non-instrumented methods will be executed. -emit-llvm -S will help you.
In contrast, w/optimizations, methods will be inlined out and another_lib’s profdata will count up methods called from utility().

utility()'s hash value is actually zero due to Clang, since it doesn’t have branches.

I think we might modify Clang.

- Don’t handle hash zero as special.
  - Or don’t emit hash zero for function bodies.
- Calculate hash values if functions are referenced, even if they are not instantiated.
  - Avoid emitting hash-zero stubs from decls.

Investigating.

FYI, to see hashes and counters in profraw:

llvm-profdata merge --text default.profraw

(I don’t know how to dump the entire covmap with hashes. -Xclang -dump-coverage-mapping dumps only region records)

gvanmourik · July 18, 2024, 5:42pm

We’re working on a similar project, and are also running into this issue. A couple of follow-up questions for @Aleksa_Markovic:

1. Does this affect the cover points in the function as well as the function declaration?
2. Have you already tried comparing GCOV and LLVM coverage reports to check the severity of this issue? If so, did you happen to notice any other inconsistencies between the reports?

Thanks,
Garrett

Aleksa_Markovic · July 19, 2024, 12:37pm

Hello Garrett,

1. I’m not sure I understood your question. If by cover points you mean the mapping regions of the code, here’s the difference when a function is not emitted (non-emitted methods get a fixed counter value of 0):

$ clang -fPIC -shared -Xclang -dump-coverage-mapping  -fprofile-instr-generate -fcoverage-mapping -o libtest_lib.so test_lib.cc
_ZN9TestClass11NotInHeaderEiii:
  File 0, 3:49 -> 6:2 = #0
_ZN9TestClassC2Eb:
  File 0, 6:27 -> 6:56 = 0
_ZN9TestClass10TestMethodEi:
  File 0, 8:31 -> 8:74 = 0
_ZN9TestClass16TestStaticMethodEi:
  File 0, 12:44 -> 12:61 = 0

and if it’s emitted:

$ clang -fPIC -shared -Xclang -dump-coverage-mapping -femit-all-decls -fprofile-instr-generate -fcoverage-mapping -o libtest_lib.so test_lib.cc
_ZN9TestClassC2Eb:
  File 0, 6:27 -> 6:56 = #0
  File 0, 6:33 -> 6:34 = #0
  Branch,File 0, 6:33 -> 6:34 = #1, (#0 - #1)
  Gap,File 0, 6:35 -> 6:36 = #1
  File 0, 6:36 -> 6:41 = #1
  Gap,File 0, 6:42 -> 6:48 = (#0 - #1)
  File 0, 6:48 -> 6:53 = (#0 - #1)
_ZN9TestClass10TestMethodEi:
  File 0, 8:31 -> 8:74 = #0
  File 0, 8:37 -> 8:43 = #0
  Branch,File 0, 8:37 -> 8:43 = #1, (#0 - #1)
  Gap,File 0, 8:44 -> 8:46 = #1
  File 0, 8:46 -> 8:54 = #1
  Gap,File 0, 8:55 -> 8:63 = (#0 - #1)
  File 0, 8:63 -> 8:71 = (#0 - #1)
_ZN9TestClass16TestStaticMethodEi:
  File 0, 12:44 -> 12:61 = #0
_ZN9TestClass11NotInHeaderEiii:
  File 0, 3:49 -> 6:2 = #0

Libraries that use the methods, such as another_lib in my example, emit the functions and the mapping regions (and counters) are correct.
2. I have not researched GCOV-compatible coverage in Clang, but from a quick test I see that GCOV does not exhibit this issue of hash mismatches. We’ve usually seen no difference in counter values; however, there are significant differences in the coverage data formats and in how the mechanisms handle templates, preprocessor macros and header-defined methods.

BR,
Aleksa

Aleksa_Markovic · July 19, 2024, 12:41pm

“W/o optimizations, methods in the header will be instantiated to both another_lib (instrumented) and exe (non-instrumented). LDD will resolve weak symbols as exe’s. So, non-instrumented methods will be executed. -emit-llvm -S will help you.
In contrast, w/optimizations, methods will be inlined out and another_lib’s profdata will count up methods called from utility().”

This is very interesting, is there any way to make the compiler/linker always prefer instrumented code?

henry2cox · July 19, 2024, 5:45pm

Garrett was asking whether line and branch coverpoints within the functions which mismatch are reported or if those coverpoints are missing (along with the function which mismatched).

With respect to -fcoverage-mapping (profile based) vs --coverage (gcov based): we see quite a few differences in instrumentation - but we don’t typically check the actual hit counts - only whether the count is zero or not.
There are typically a LOT of artifacts in the output data from both paths.
Yet another interesting experiment is to compare GCC vs. LLVM coverage results - for the same code base and testsuite.
Again: there are more differences than one might like to see - some which are moderately easy to filter out, some which are not.

We haven’t tried your -femit-all-decls workaround - but it is pretty close to the top of the list of experiments to try.

Interesting (and very useful) discussion!

Henry

Aleksa_Markovic · July 22, 2024, 12:29pm

Hello Henry,

If mismatches are detected, none of the coverpoints belonging to the mismatched function are exported. In lcov tracefile terms, the FN, FNDA and associated DA and BRDA records are missing for the mismatched function. In the case of libtest_lib.so from my example, everything from the header file is mismatched, and so the SF record is missing too:

$ llvm-cov export -format lcov -instr-profile default.profdata  --object libtest_lib.so 
warning: 3 functions have mismatched data
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/test_lib.cc
FN:3,_ZN9TestClass11NotInHeaderEiii
FNDA:1,_ZN9TestClass11NotInHeaderEiii
FNF:1
FNH:1
DA:3,1
DA:4,1
DA:5,1
DA:6,1
BRF:0
BRH:0
LF:4
LH:4
end_of_record

Compare this with the output when llvm-cov is supplied with an empty .profdata file (containing no symbols, to emulate lcov’s --initial mode), which has no matches with libtest_lib.so symbols:

$ llvm-cov export -format lcov -instr-profile _dummy.profdata  --object libtest_lib.so 
warning: libtest_lib.so: profile data may be out of date - object is newer
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/test_class.h
FN:6,_ZN9TestClassC2Eb
FN:8,_ZN9TestClass10TestMethodEi
FN:12,_ZN9TestClass16TestStaticMethodEi
FNDA:0,_ZN9TestClassC2Eb
FNDA:0,_ZN9TestClass10TestMethodEi
FNDA:0,_ZN9TestClass16TestStaticMethodEi
FNF:3
FNH:0
DA:6,0
DA:8,0
DA:12,0
BRF:0
BRH:0
LF:3
LH:0
end_of_record
SF:/home/aleksa.markovic/projects/cc/llvm_reader_test/test_lib.cc
FN:3,_ZN9TestClass11NotInHeaderEiii
FNDA:0,_ZN9TestClass11NotInHeaderEiii
FNF:1
FNH:0
DA:3,0
DA:4,0
DA:5,0
DA:6,0
BRF:0
BRH:0
LF:4
LH:0
end_of_record

kolrami · September 3, 2024, 9:32am

Hi @Aleksa_Markovic , I also asked a question in Coverage from multiple Test Executables - #4 by kolrami which seems to be quite similar (I just saw your post too late, maybe you can have a look since you are more deeply involved here).

Is there any update on how we want to proceed here? I saw your workaround PR, but there seems to have been no further progress since then.

Adding -femit-all-decls is not an option for me, because it exceeds all resources of my PC on a bigger codebase.
