RFC - Improvements to PGO profile support

Hi David,
The plan makes sense. A question on:

4) implement frequency/profile update interfaces for Inline
transformations -- i.e., allowing importing callee's profile info into
caller with scaling factor computed from callsite count and callee
entry count

Do you envision the following implementation for it? Say we've decided to
inline callee B at callsite C in caller A. In addition to the current
inlining update work, we need to:

1. get block frequency BF of the basic block containing the callsite C by
invoking BlockFrequency on function A;
2. scale BF with the function entry count of A, to get scaledBF;
3. decrease the function entry count for B by scaledBF;
4. compute and scale block frequencies of the basic blocks of B inlined
into A, or re-invoke BlockFrequency on the modified function A.

Thanks,
Ivan

The update should work both with and without PGO. Without profile
feedback, the callee's BF still needs to be rescaled and merged into
the caller's BF, since we cannot afford to recompute BF for the caller
after every inline.

The assumption is that, once the new pass manager is ready, BF info for
caller A and callee B can co-exist.

Assume BB is a basic block in callee B, and BB' is the clone of BB in
the inlined instance of B. Freq_A denotes a block frequency in caller
A, and Freq_B denotes a block frequency in callee B.

Without PGO, the incremental update is:

Freq_A(BB') = Freq_B(BB) * Freq_A(Callsite_C) / Freq_B(Entry_B)
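
As a rough illustration of the frequency formula, here is a minimal
sketch. It is not an LLVM API: the function name scaleInlinedFreqs is
hypothetical, and frequencies are modeled as plain doubles rather than
the fixed-point values a real BF implementation would use.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of LLVM). Given Freq_B(BB) for every
// block BB of callee B, returns Freq_A(BB') for the cloned blocks.
// Frequencies are doubles here purely to keep the sketch simple.
std::vector<double> scaleInlinedFreqs(const std::vector<double> &CalleeFreqs,
                                      double CalleeEntryFreq,
                                      double CallsiteFreq) {
  assert(CalleeEntryFreq > 0.0 && "callee entry frequency must be positive");
  std::vector<double> Inlined;
  Inlined.reserve(CalleeFreqs.size());
  for (double FreqB : CalleeFreqs) {
    // Freq_A(BB') = Freq_B(BB) * Freq_A(Callsite_C) / Freq_B(Entry_B)
    Inlined.push_back(FreqB * CallsiteFreq / CalleeEntryFreq);
  }
  return Inlined;
}
```

For example, with callee frequencies {8, 4, 16}, a callee entry
frequency of 8, and a callsite frequency of 2 in the caller, the cloned
blocks would get {2, 1, 4}.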

With PGO, the frequency update is the same, but the callee's entry
count must also be updated:

Count(Entry_B) = Count(Entry_B) - Count(Callsite_C)
               = Count(Entry_B) -
                 Count(Entry_A) * Freq_A(Callsite_C) / Freq_A(Entry_A)
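
The entry count update could be sketched along the following lines.
Both function names are hypothetical, not LLVM APIs. Clamping the
result at zero is my own assumption for the case where the profile is
inconsistent and the derived callsite count exceeds the callee's entry
count; the formula above does not say what to do in that case.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM):
// Count(Callsite_C) = Count(Entry_A) * Freq_A(Callsite_C) / Freq_A(Entry_A)
// A double intermediate is used to sidestep 64-bit overflow in the
// product; a real implementation would likely use fixed-point scaling.
uint64_t callsiteCount(uint64_t CallerEntryCount, uint64_t CallsiteFreq,
                       uint64_t CallerEntryFreq) {
  assert(CallerEntryFreq != 0 && "caller entry frequency must be non-zero");
  double Scaled = static_cast<double>(CallerEntryCount) *
                  static_cast<double>(CallsiteFreq) /
                  static_cast<double>(CallerEntryFreq);
  return static_cast<uint64_t>(Scaled);
}

// Hypothetical helper (not part of LLVM):
// Count(Entry_B) = Count(Entry_B) - Count(Callsite_C)
// Assumption: clamp at zero instead of wrapping around.
uint64_t updatedCalleeEntryCount(uint64_t CalleeEntryCount,
                                 uint64_t CallsiteCnt) {
  return CalleeEntryCount > CallsiteCnt ? CalleeEntryCount - CallsiteCnt : 0;
}
```

For example, if A has an entry count of 1000, Freq_A(Entry_A) is 16 and
Freq_A(Callsite_C) is 4, the callsite count is 250, and a callee entry
count of 600 drops to 350 after inlining.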

David