Extend MD_prof branch weights metadata from 32 to 64 bits

Dinistro · January 4, 2023, 8:24am

All the PGO counters are using 64-bit values to store their respective counts, but the corresponding materializations in the IR in the form of branch weights are scaled to fit 32-bits.

The scaling factors are computed per branch operation, which minimizes the precision loss when computing branch probabilities. The disadvantage of the approach is that the absolute branch weights cannot be compared globally. For example, it is hard to detect a hot loop in a function. While this scaling approach preserves probabilities, it seems overly restrictive for other kinds of usages of the counter values.

While I could not find any recent discussions, I found two posts from 2015 (Loss of precision with very large branch weights - #15 by davidxl, RFC - Improvements to PGO profile support - #3 by dnovillo) that mentioned such a change might be possible.

The linked posts only raise memory concerns, and it’s unclear if these are still relevant today. Are there counter-arguments for changing the MD_prof metadata to hold i64 values?

Dinistro · January 10, 2023, 5:00pm

To simplify such a transition, I opened ⚙ D141393 [llvm][ir] Purge MD_prof custom accessors, which aims to unify the way the branch weights are accessed. By reducing the number of places that access the branch weights, changing their representation should become less error-prone.

aeubanks · January 11, 2023, 12:11am

I’d attempted to use 64 bit branch weights in ⚙ D88609 Use uint64_t for branch weights instead of uint32_t. IIRC I ran into issues where we were doing math on 32 bit weights but with 64 bit ints to avoid overflow issues.

Dinistro · January 11, 2023, 7:45am

Thank you very much for pointing me to this revision. Indeed there are a few places where branch weights are multiplied and thus can overflow.

Such examples are most prominent in SimplifyCFG; in almost all places where the weights are re-scaled, a multiplication happened before.

Given that the branch weights are either synthesized or produced by PGO counters, it should be reasonable to enforce that the sum of all weights is less than UINT64_MAX. Note that even PGOInstrumentation.cpp has code that assumes this. To model absolute branch weights, 64 bits are enough, assuming they are updated correctly by transformations.

I do see two possible ways of pushing for such a change:

Manually checking and changing all usages of the weights to ensure that they do not rely on uint32s to perform computations.
Deprecate the accessor that spits out scaled weights and gradually move the usage away to work with the 64-bit version. Note that this still requires a manual check of some cases that directly load the metadata.

Topic		Replies	Views
RFC: Update `branch_weights` metadata to allow tracking branch weight origins LLVM Project pgo , llvm-ir	6	381	January 19, 2024
Profiling with LLVM. LLVM Dev List Archives	4	67	February 7, 2016
[PGO] Details of IR representation of profile data IR & Optimizations pgo , llvm	5	179	October 12, 2023
Branch probabilities for invokes LLVM Dev List Archives	1	85	July 2, 2018
Loss of precision with very large branch weights LLVM Dev List Archives	15	183	April 25, 2015

Extend MD_prof branch weights metadata from 32 to 64 bits

Related topics