Hi guys, I looked at the branch weight document (LLVM Branch Weight Metadata, LLVM 16.0.0git documentation) and the block frequency document (LLVM Block Frequency Terminology, LLVM 16.0.0git documentation), and I have a question: can we only use BlockFrequencyInfo (BFI) when we have profile data?
I know that branch weights can be assigned from both profile data and the expect intrinsic, so they can be used with just __builtin_expect. Is it also okay to use BFI without profile data? If so, are there any side effects?
Also, do we need to manually maintain BFI whenever the CFG changes?
Thank you very much!
Yes, it is OK to use BFI without profile data. In this case, the static branch probability analysis uses static heuristics to ‘guess’ branch weights (combined with the user directive __builtin_expect). The resulting BPI is then used to compute BFI. In practice, many heuristics work reasonably well (e.g. loop related, EH related, comparison related, etc.).
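To make the weights -> BPI -> BFI chain concrete, here is a minimal standalone sketch. It does not use the LLVM APIs; the names Edge, computeBlockFrequencies and diamondExample are made up for illustration, the weights are arbitrary, and it only handles an acyclic CFG whose blocks are already numbered in topological order (the real analysis also has to deal with loops).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One outgoing CFG edge: the target block index and its branch weight.
// (Hypothetical type for this sketch, not an LLVM class.)
struct Edge {
  int target;
  std::uint32_t weight;
};

// Computes relative block frequencies for an acyclic CFG. Blocks must be
// numbered in topological order; block 0 is the entry with frequency 1.0.
std::vector<double> computeBlockFrequencies(
    const std::vector<std::vector<Edge>> &succs) {
  std::vector<double> freq(succs.size(), 0.0);
  if (freq.empty())
    return freq;
  freq[0] = 1.0;
  for (std::size_t b = 0; b < succs.size(); ++b) {
    // Sum the weights so each one can be normalized into a probability.
    std::uint64_t total = 0;
    for (const Edge &e : succs[b])
      total += e.weight;
    if (total == 0)
      continue; // no successors, or no usable weights
    for (const Edge &e : succs[b]) {
      assert(e.target > static_cast<int>(b) && "expects topological order");
      double prob = static_cast<double>(e.weight) / static_cast<double>(total);
      // Each successor receives its share of this block's frequency.
      freq[e.target] += freq[b] * prob;
    }
  }
  return freq;
}

// Diamond CFG: entry(0) -> then(1) / else(2) -> exit(3), with made-up
// weights 3:1 on the entry branch.
std::vector<double> diamondExample() {
  std::vector<std::vector<Edge>> succs = {
      {{1, 3}, {2, 1}}, // entry: 75% to "then", 25% to "else"
      {{3, 1}},         // then -> exit
      {{3, 1}},         // else -> exit
      {}                // exit has no successors
  };
  return computeBlockFrequencies(succs);
}
```

With the 3:1 weights on the entry branch, the "then" block ends up with frequency 0.75, the "else" block with 0.25, and the exit block with 1.0 again, whether those weights came from a profile or from a heuristic guess.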
Thank you for the information! May I ask whether BFI needs to be manually calculated and maintained whenever the CFG changes? Or can I just change some BPI and then, for example, run a function to recalculate all of BFI?
Profile metadata is expected to be maintained in the IR across all transformations (cloning etc.), but this is not guaranteed for BPI or BFI (they will usually be invalidated after a CFG transformation).
Besides, some optimization passes need to maintain BFI during transformation (e.g. iterative algorithms) and will need to use the BFI APIs to incrementally update the frequency information.
Got it, thank you very much!
I bring this issue up because I recently found that a sub-pass in SimplifyCFG (FoldCondBranchOnValueKnownInPredecessor, which is a simple case of jump threading) does not update branch weights correctly, which makes the final block frequencies incorrect.
I saw that JumpThreading has an updateBlockFreqAndEdgeWeight function that updates probabilities using BFI and BPI. But SimplifyCFG does not have BFI now, so we are currently thinking of supporting it in SimplifyCFG. Do you have any suggestions on that?
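For readers following along, here is a rough standalone sketch of the kind of bookkeeping such an update involves when an edge Pred -> BB is threaded directly to one of BB's successors. This is not the LLVM implementation; ThreadResult and threadEdge are hypothetical names, frequencies are plain doubles, and the uniform fallback when no frequency is left is just a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of threading one incoming edge through BB (hypothetical type).
struct ThreadResult {
  double newBlockFreq;           // frequency of the duplicated block
  double remainingBlockFreq;     // frequency left on the original BB
  std::vector<double> succProbs; // renormalized probabilities of BB's successors
};

// predFreq:     frequency of the predecessor whose edge is threaded
// predToBBProb: probability of the Pred -> BB edge
// bbFreq:       frequency of BB before threading
// succProbs:    BB's outgoing probabilities before threading
// threadedSucc: index of the successor the threaded path always takes
ThreadResult threadEdge(double predFreq, double predToBBProb, double bbFreq,
                        const std::vector<double> &succProbs,
                        std::size_t threadedSucc) {
  assert(threadedSucc < succProbs.size());
  // Frequency that now bypasses BB via the duplicated block.
  double threadedFreq = predFreq * predToBBProb;
  assert(threadedFreq <= bbFreq);
  double remaining = bbFreq - threadedFreq;

  // Recompute each outgoing edge frequency of BB; only the threaded
  // successor loses the frequency that moved to the duplicated block.
  std::vector<double> edgeFreqs;
  double total = 0.0;
  for (std::size_t i = 0; i < succProbs.size(); ++i) {
    double f = bbFreq * succProbs[i];
    if (i == threadedSucc)
      f = f > threadedFreq ? f - threadedFreq : 0.0;
    edgeFreqs.push_back(f);
    total += f;
  }

  // Turn the remaining edge frequencies back into probabilities.
  std::vector<double> newProbs(succProbs.size());
  for (std::size_t i = 0; i < edgeFreqs.size(); ++i)
    newProbs[i] = total > 0.0 ? edgeFreqs[i] / total
                              : 1.0 / static_cast<double>(edgeFreqs.size());
  return {threadedFreq, remaining, newProbs};
}
```

For example, if BB has frequency 1.0 with a 50/50 branch, and a predecessor with frequency 0.5 is known to always take successor 0, then after threading BB keeps frequency 0.5 and should branch to successor 1 with probability 1. If the old 50/50 weights are left on BB, the successor frequencies come out wrong, which is the kind of problem described above.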
If SimplifyCFG can benefit from using BPI/BFI, it is reasonable to add that support (subject to the benefit and compile-time tradeoff).
I see, then I would like to try it first. Thank you for the help!
Hi David, sorry, I forgot one thing. Paul and I discussed this issue in this patch: D131287 Fix branch weight in FoldCondBranchOnValueKnownInPredecessor pass in SimplifyCFG (which may not be the best way). One concern is that BFI may not be compatible with SimplifyCFG, since BFI seems to assume that loops are simplified (see llvm-project/BlockFrequencyInfo.h at main · llvm/llvm-project · GitHub), while this is not guaranteed to have been done and may also happen inside this pass.
Could this prevent us from using BFI?
That is an implementation limitation and should probably be fixable if you encounter issues, especially in BPI.