We have reworked the implementation of this technique. The new implementation avoids introducing new compilation parameters wherever possible and stays compatible with the current usage of Propeller. Rather than introducing a new profile format, it reuses the existing Propeller profile format: we enable the CFG profile option for the cluster profile and add a new line that lists the hash of each basic block. The format of the new line is:
h <bb_id>:<bb_hash> <bb_id>:<bb_hash> <bb_id>:<bb_hash> ...
A simple complete profile example is:
V1
f foo
c0 1 3
w 0:100,1:100,2:0 1:100,3:100 2:0,3:0 3:100
h 0:10838971302264569856 1:10005270435075325958 2:10005130895882846216 3:17705462075698380809
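To make the `h` line concrete, here is a minimal Python sketch that reads the hashes back out of a profile like the one above. It is only an illustration, not code from the patch or from the profile tool: the function names `parse_hash_line` and `parse_profile` are hypothetical, and it assumes that hashes are written in decimal (as in the example) and that the token after `f` is the function name.

```python
from typing import Dict


def parse_hash_line(line: str) -> Dict[int, int]:
    """Parse one `h <bb_id>:<bb_hash> ...` line into {bb_id: bb_hash}."""
    fields = line.split()
    if not fields or fields[0] != "h":
        raise ValueError(f"not a hash line: {line!r}")
    hashes: Dict[int, int] = {}
    for field in fields[1:]:
        bb_id, sep, bb_hash = field.partition(":")
        if not sep:
            raise ValueError(f"malformed entry: {field!r}")
        # Assumption: both values are decimal, as in the example profile.
        hashes[int(bb_id)] = int(bb_hash)
    return hashes


def parse_profile(text: str) -> Dict[str, Dict[int, int]]:
    """Collect the basic block hashes of every function in a profile."""
    result: Dict[str, Dict[int, int]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("f "):
            # Assumption: the first token after `f` names the function.
            current = line.split()[1]
        elif line.startswith("h ") and current is not None:
            result.setdefault(current, {}).update(parse_hash_line(line))
    return result


EXAMPLE = """V1
f foo
c0 1 3
w 0:100,1:100,2:0 1:100,3:100 2:0,3:0 3:100
h 0:10838971302264569856 1:10005270435075325958 2:10005130895882846216 3:17705462075698380809
"""

foo_hashes = parse_profile(EXAMPLE)["foo"]
```

All other line types (`V1`, `c`, `w`) are skipped, so the sketch only depends on the new line and the function header.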
With this rework, the code changes are significantly smaller, and the workflow stays largely the same as with the original Propeller. The steps are as follows:
- Build the base binary: Use the -emit-bb-hash parameter to add basic block hashes to the basic block address map.
clang++ -O2 -gline-tables-only -fdebug-info-for-profiling -funique-internal-linkage-names -fbasic-block-address-map -mllvm -emit-bb-hash code.cc -o base_code
- Generate the profile: Use --write_cfg_profile to enable Propeller’s existing CFG profile, and use --write-bb-hash to generate hash values for each basic block.
perf record -b ./base_code
./generate_propeller_profiles \
--write_cfg_profile \
--write-bb-hash \
--binary=./base_code \
--profile=perf.data \
--cc_profile=cluster.txt \
--ld_profile=symorder.txt
- Recompile with the profile: The only difference from the original Propeller workflow is the -propeller-match-infer parameter, which enables matching and inference.
clang++ -O2 -gline-tables-only -fdebug-info-for-profiling -funique-internal-linkage-names \
-fbasic-block-sections=list=cluster.txt -mllvm -propeller-match-infer \
-Wl,--no-warn-symbol-ordering -Wl,--symbol-ordering-file=symorder.txt -fuse-ld=lld code.cc -o base_code
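As a rough intuition for what the hashes make possible in the last step, the sketch below maps basic blocks recorded in a profile onto the blocks of a changed build by exact hash equality. This is not the algorithm in the patch, whose matching and inference may work quite differently; the function name `match_blocks_by_hash` and the hash value `123` for the newly inserted block are made up for illustration.

```python
from collections import Counter
from typing import Dict


def match_blocks_by_hash(
    profile_hashes: Dict[int, int], binary_hashes: Dict[int, int]
) -> Dict[int, int]:
    """Map profile bb_id -> binary bb_id for hashes that are unique on both sides."""
    profile_counts = Counter(profile_hashes.values())
    binary_counts = Counter(binary_hashes.values())
    # Index the new binary's blocks by hash, keeping unambiguous hashes only.
    binary_by_hash = {
        h: bb_id for bb_id, h in binary_hashes.items() if binary_counts[h] == 1
    }
    matched: Dict[int, int] = {}
    for old_id, h in profile_hashes.items():
        if profile_counts[h] == 1 and h in binary_by_hash:
            matched[old_id] = binary_by_hash[h]
    return matched


# Hashes taken from the example profile for function foo.
profile = {
    0: 10838971302264569856,
    1: 10005270435075325958,
    2: 10005130895882846216,
    3: 17705462075698380809,
}
# Hypothetical new build: one block (hash 123) was inserted after block 0,
# so the remaining blocks were renumbered.
binary = {
    0: 10838971302264569856,
    1: 123,
    2: 10005270435075325958,
    3: 10005130895882846216,
    4: 17705462075698380809,
}

mapping = match_blocks_by_hash(profile, binary)
```

In this toy case every profiled block is recovered under its new ID, and the inserted block is left without a count, which is the kind of gap an inference step would then have to fill.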
We have submitted a patch for the new implementation (https://github.com/llvm/llvm-project/pull/160706) and forked the llvm-propeller repository to add the corresponding changes to the Propeller profile tool (https://github.com/wdx727/llvm-propeller/tree/propeller-match-infer). We plan to submit the profile tool changes once the LLVM patch is merged.
We welcome everyone to review our new implementation. @rlavaee @tmsriram