unexpected coverage results for inline function with conditional #ifdef

Hi all,

I have faced unexpected coverage results for inline function with conditional #ifdef

// – header.h

#pragma once

inline int bar(int x) {
int y = 10;
#ifdef IE
if (x)
y *= 2;
#else
y -= 5;
#endif
return x * y;
}

// – lib1.cpp
#include “header.h”

int lib1_foo(int x) {
return bar(x) + x;
}

// – main.cpp
#include
#include “header.h”

extern int lib1_foo(int);

int main_foo(int x) {
return bar(x) + x;
}

int main(int c, char**) {
std::cout << lib1_foo(c) << " " << main_foo(c) << std::endl;
return 0;
}

clang++ -fprofile-instr-generate -fcoverage-mapping -O2 -c -g lib1.cpp -o lib1.o -DIE
clang++ -fprofile-instr-generate -fcoverage-mapping -O2 -c -g main.cpp -o main.o
clang++ -fprofile-instr-generate -fcoverage-mapping -O2 -g main.o lib1.o -o bin
./bin
llvm-profdata merge -o merged.prof default.profraw
llvm-cov export -instr-profile merged.prof -object bin

Output of the program will be “21 6”
However, segments for header.h will be:
[3,23,2,1,1],[5,2,0,0,1],[8,2,2,1,0],[12,2,0,0,0] (“y -= 5;” was performed twice)

Looks like coverage wasn’t able to connect source code with counters properly.

-Sam

Hi,

Thanks for the report, I’ll take a look.

vedant

Hello Vedant,

I tried to look into this with Sam and hope my initial analysys will help.

It seems to me that we face some kind of profile counter aliasing:

On slightly different example I see the following in .ll files:
** In inlined code same counters are incremented in 2 differently inlined codes (dpending on define like in the code Sam provided). They look like:

%pgocount.i = load i64, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 0), align 8, !dbg !23
%1 = add i64 %pgocount.i, 1, !dbg !23
store i64 %1, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 0), align 8, !dbg !23 <<<<< Counter #0

if.then.i: ; preds = %entry
%pgocount8.i = load i64, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 1), align 8, !dbg !27
%2 = add i64 %pgocount8.i, 1, !dbg !27
store i64 %2, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 1), align 8, !dbg !27 <<<<< Counter #1

The 2nd counter naturally belongs to different code paths in different inlining destination files.

** The counters structure looks like following in both files:
@__profd__Z22inline_foo_from_headeri = linkonce_odr hidden global { i64, i64, i64*, i8*, i8*, i32, [2 x i16] } { i64 -3479956896993626287, i64 10, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i32 0, i32 0), i8* bitcast (i32 (i32)* @_Z22inline_foo_from_headeri to i8*), i8* null, i32 2, [2 x i16] zeroinitializer }, section “__llvm_prf_data”, comdat($__profv__Z22inline_foo_from_headeri), align 8

So the counters are definitely shared between inlined instances, This looks reasonable.

** The __llvm_coverage_mapping structures are indeed different, but refere to same counters:

@__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x <{ i64, i32, i64 }>], [192 x i8] } { { i32, i32, i32, i32 } { i32 2, i32 148, i32 44, i32 1 }, [2 x <{ i64, i32, i64 }>] [<{ i64, i32, i64 }> <{ i64 -2070989330109684161, i32 9, i64 0 }>, <{ i64, i32, i64 }> <{ i64 -3479956896993626287, i32 31, i64 10 }>], [192 x i8] c"\02D/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/lib1/lib1.cppM/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
\01\00\00\01
\01\03\15\02\02\

01\01\ << This is about /xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
01\01\05\ << One expression(counter #0, #1) — There was an else branch in one of my cases

05
01\03\2B\0E\02\ << This is for counter #0
01\03\09\00\0A\ << This is for counter #0
05\01\09\00\14\ << This is for counter #1
02\02\09\00\14\ << This is subtraction for counters #0-#1 according to expression above
10\01\02\04\02\ << This is skipped region because of #ifdef
00\00\00\00" }, section “__llvm_covmap”, align 8

And

@__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x <{ i64, i32, i64 }>], [184 x i8] } { { i32, i32, i32, i32 } { i32 2, i32 148, i32 36, i32 1 }, [2 x <{ i64, i32, i64 }>] [<{ i64, i32, i64 }> <{ i64 7422656158144208762, i32 9, i64 0 }>, <{ i64, i32, i64 }> <{ i64 -3479956896993626287, i32 24, i64 10 }>], [184 x i8] c"\02D/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/lib2/lib2.cppM/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
\01\00\00\01
01\03\15\02\02\

01\01\ << This is about /xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
00\ << no expressions in this case
04
01\03\2B\0E\02\ << This is for counter #0
10\02\02\05\02\ << This is skipped region because of #ifdef
01\06\09\00\0A\ << This is for counter #0
05\01\09\00\14\ << This is for counter #1
00\00\00" }, section “__llvm_covmap”, align 8

The given example has undefined behavior; C++ ODR says that inline functions are required to have the same definition in every translation unit. (You can avoid this problem by making the function in question "static inline".) You're lucky that the only consequence you're seeing is that the coverage output is wrong; it could very easily lead to a crash or memory corruption instead.

-Eli

Hi,

Thank you Serge and Eli for taking a look! Some comments inline --

The given example has undefined behavior; C++ ODR says that inline functions are required to have the same definition in every translation unit. (You can avoid this problem by making the function in question "static inline".) You're lucky that the only consequence you're seeing is that the coverage output is wrong; it could very easily lead to a crash or memory corruption instead.

-Eli

Hello Vedant,

I tried to look into this with Sam and hope my initial analysys will help.

It seems to me that we face some kind of profile counter aliasing:

On slightly different example I see the following in .ll files:
** In inlined code same counters are incremented in 2 differently inlined codes (dpending on define like in the code Sam provided). They look like:

  %pgocount.i = load i64, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 0), align 8, !dbg !23
  %1 = add i64 %pgocount.i, 1, !dbg !23
  store i64 %1, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 0), align 8, !dbg !23 <<<<< Counter #0
...
if.then.i: ; preds = %entry
  %pgocount8.i = load i64, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 1), align 8, !dbg !27
  %2 = add i64 %pgocount8.i, 1, !dbg !27
  store i64 %2, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i64 0, i64 1), align 8, !dbg !27 <<<<< Counter #1

The 2nd counter naturally belongs to different code paths in different inlining destination files.

As Eli pointed out, this sort of result is not expected from well-formed programs. There shouldn't be any ambiguity about which code region a counter is associated with.

** The counters structure looks like following in both files:
@__profd__Z22inline_foo_from_headeri = linkonce_odr hidden global { i64, i64, i64*, i8*, i8*, i32, [2 x i16] } { i64 -3479956896993626287, i64 10, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @__profc__Z22inline_foo_from_headeri, i32 0, i32 0), i8* bitcast (i32 (i32)* @_Z22inline_foo_from_headeri to i8*), i8* null, i32 2, [2 x i16] zeroinitializer }, section "__llvm_prf_data", comdat($__profv__Z22inline_foo_from_headeri), align 8

So the counters are definitely shared between inlined instances, This looks reasonable.

** The __llvm_coverage_mapping structures are indeed different, but refere to same counters:

@__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x <{ i64, i32, i64 }>], [192 x i8] } { { i32, i32, i32, i32 } { i32 2, i32 148, i32 44, i32 1 }, [2 x <{ i64, i32, i64 }>] [<{ i64, i32, i64 }> <{ i64 -2070989330109684161, i32 9, i64 0 }>, <{ i64, i32, i64 }> <{ i64 -3479956896993626287, i32 31, i64 10 }>], [192 x i8] c"\02D/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/lib1/lib1.cppM/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
\01\00\00\01
\01\03\15\02\02\

01\01\ << This is about /xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
01\01\05\ << One expression(counter #0, #1) --- There was an else branch in one of my cases

05\
01\03\2B\0E\02\ << This is for counter #0
01\03\09\00\0A\ << This is for counter #0
05\01\09\00\14\ << This is for counter #1
02\02\09\00\14\ << This is subtraction for counters #0-#1 according to expression above
10\01\02\04\02\ << This is skipped region because of #ifdef
00\00\00\00" }, section "__llvm_covmap", align 8

And

@__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x <{ i64, i32, i64 }>], [184 x i8] } { { i32, i32, i32, i32 } { i32 2, i32 148, i32 36, i32 1 }, [2 x <{ i64, i32, i64 }>] [<{ i64, i32, i64 }> <{ i64 7422656158144208762, i32 9, i64 0 }>, <{ i64, i32, i64 }> <{ i64 -3479956896993626287, i32 24, i64 10 }>], [184 x i8] c"\02D/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/lib2/lib2.cppM/xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
\01\00\00\01\
01\03\15\02\02\

01\01\ << This is about /xxxx/xxxxxx/xxx/xxxx/xxxxxxxxx/test/compilation_flags/header/single_header.h
00\ << no expressions in this case
04\
01\03\2B\0E\02\ << This is for counter #0
10\02\02\05\02\ << This is skipped region because of #ifdef
01\06\09\00\0A\ << This is for counter #0
05\01\09\00\14\ << This is for counter #1
00\00\00" }, section "__llvm_covmap", align 8

***
So now we have 2 different coverage mappings referring same counters, but attributing them to different paths in the code. This doesn't seem right to me and I don't see any possibility for this case to work properly after linking. I am not even sure that this situation is ever fixable: in order to assign different counters instrumentation pass should know a lot about #ifdef-ed out code, while it is not even a part of AST. Or alternatively counters for inlined code shouldn't be shared among inlining sites, but this would have created a lot of troubles for accounting and extremely infalted counter structures.

What worries me here is why some mapping regions are ignored in this case: they all are in the final binary, but excluded from accounting by some reason. Even if they stayed coverage would be incorrect but in some better way. It would mistakely cover #ifdef-ed out code as visited, but this is definitely better than additionally reporting some visited line as unvisited.

The current behavior is accounted for in VersionedCovMapFuncRecordReader::insertFunctionRecordIfNeeded:

  // Add the record to the collection if we don't already have a record that
  // points to the same function name. This is useful to ignore the redundant
  // records for the functions with ODR linkage.

I think it might open up a can of worms if we tried to handle types of ODR violations here.

best,
vedant

Thank you, Vedant and Eli.

I agree that since this is undefined behavior this shouldn’t be supported and should be fixed in source code (if such codes arise in our code base).

Reagards,