Dear Duncan,
I am generating branch-weights annotated IR files as described in the
documentation of LLVM, using profiling with instrumentation.
http://clang.llvm.org/docs/UsersManual.html#profiling-with-instrumentation
e.g.
llvm-profdata merge -output=$(BENCH).profdata default.profraw
clang -S -emit-llvm -O3 -fprofile-instr-use=$(BENCH).profdata -o bench.prof.ll bench.c
The issue is that in some benchmarks I get crazy numbers in the annotated
metadata inside the generated *.ll files.
e.g.
!16 = !{!"branch_weights", i32 -2147483648, i32 0}
!155 = !{!"branch_weights", i32 1075807200, i32 -1501637297}
!181 = !{!"branch_weights", i32 -965299388, i32 218980800}
This looks like a counter overflow.
It is not counter overflow. Branch weights are not the same as branch
profile counts. Branch weights are intended to represent branch probability,
and the absolute value of a weight does not mean anything on its own. Branch
weights that come from real profile data may look like real profile counts
if they have not been scaled. The negative values are a problem in dumping:
the weights should be printed as uint32.
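
As an illustration of the point above, here is a minimal Python sketch
(not LLVM code) that reinterprets the signed values from the dump as
uint32 and normalizes each pair into branch probabilities. The helper
names as_uint32 and branch_probabilities are hypothetical, and the
uniform fallback for all-zero weights is an assumption of this sketch,
not necessarily what LLVM does.

```python
def as_uint32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned 32-bit one."""
    return value & 0xFFFFFFFF


def branch_probabilities(weights):
    """Normalize branch weights into probabilities that sum to 1.

    Only the ratio between the weights matters, not their magnitude.
    """
    unsigned = [as_uint32(w) for w in weights]
    total = sum(unsigned)
    if total == 0:
        # Assumption of this sketch: treat all-zero weights as uniform.
        return [1.0 / len(unsigned)] * len(unsigned)
    return [u / total for u in unsigned]


# Weight pairs copied from the metadata dump quoted earlier.
for pair in [(-2147483648, 0),
             (1075807200, -1501637297),
             (-965299388, 218980800)]:
    print([as_uint32(w) for w in pair], branch_probabilities(pair))
```

Read this way, i32 -2147483648 is simply the weight 2147483648, and the
first pair describes a branch whose first successor is always taken.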
In fact, BPI and MBPI no longer have weight-based interfaces (since the
concept of weight is confusing). However, 'weight' remains in the metadata
representation.
Now the interesting thing is that when I use these annotated files as input
for the BasicBlockFrequency analysis pass, the output seems to give correct
numbers for the execution frequency of each basic block, even though a
few of the counters have overflowed.
Correct frequency information is expected, except for a couple of known
cases where block frequency propagation does not work well, for instance
irreducible loops, infinite loops (in general, branches with zero
weights), etc.
To get the real block and edge/branch profile counts, you should look at the
computed frequency data and combine it with the function's
'function_entry_count' metadata. The latter is the real profile count of
the entry block.
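
A rough sketch of that combination in Python, assuming the block count
is obtained by scaling the entry count by the ratio of the block's
frequency to the entry block's frequency, and an edge count by
multiplying the block count by the branch probability. The function
names and the numbers in the usage lines are made up for illustration;
they are not taken from any real profile.

```python
def block_profile_count(block_freq: int, entry_freq: int,
                        function_entry_count: int) -> int:
    """Estimate a block's profile count from relative frequencies.

    block_freq and entry_freq are frequencies reported by the block
    frequency analysis; function_entry_count is the real count of the
    entry block taken from the function's metadata.
    """
    if entry_freq <= 0:
        raise ValueError("entry frequency must be positive")
    # Integer arithmetic, rounded to the nearest whole count.
    return (function_entry_count * block_freq + entry_freq // 2) // entry_freq


def edge_profile_count(block_count: int, probability: float) -> int:
    """Estimate an edge count as block count times branch probability."""
    return round(block_count * probability)


# Hypothetical values: entry frequency 8, loop body frequency 800,
# and a function entered 1000 times.
body_count = block_profile_count(800, 8, 1000)
print(body_count)
print(edge_profile_count(body_count, 0.25))
```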
This seems like a bug, unless I need some specific configuration when
running the profiling step before the analysis.
From your experience, would you say that the BasicBlockFrequency analysis
pass output is to be trusted? Is it known to be stable or do I need to be
really cautious and always inspect the output? Are there any common cases
of not having accurate profiling?
For common cases, it should be trusted. If you see problems, please file
bugs.
thanks,
David