Profile-based inlining status

Hello,

I’m learning how LLVM performs PGO (profile-guided optimization) using the instrumentation-based profile build (-fprofile-instr-generate and -fprofile-instr-use).
However, for a few SPEC benchmarks I found no difference in inlining behavior with and without PGO when checking the emitted optimization reports (-Rpass=inline -Rpass-missed=inline -Rpass-analysis=inline). Also, the collected profile information contains only block counters, not call counters. This seems to indicate that profile-based inlining is not supported yet (my LLVM/Clang is 3.9.0, mid-February trunk). Is this the case?

The talk at the 2013 LLVM Developers' Meeting (http://llvm.org/devmtg/2013-11/slides/Carruth-PGO.pdf) actually states that “the inliner doesn’t even know profile information exists today” (page 89), but it has been more than two years since that statement. Could you let me know the latest status of, or any ongoing work on, profile-based inlining (both same-module and cross-module)?

Thank you,
–Toshio

Hello,

I’m learning how LLVM performs PGO (profile-guided optimization) using the instrumentation-based profile build (-fprofile-instr-generate and -fprofile-instr-use).
However, for a few SPEC benchmarks I found no difference in inlining behavior with and without PGO when checking the emitted optimization reports (-Rpass=inline -Rpass-missed=inline -Rpass-analysis=inline). Also, the collected profile information contains only block counters, not call counters.

Since a call is necessarily in a block, I’m not sure what extra information you would get with a call counter?
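To make the point about block counters concrete, here is a minimal C++ sketch. The `Block` type and `callCount` function are hypothetical illustrations, not LLVM data structures or APIs, and the sketch assumes a block runs straight through once entered (no exception or non-returning call earlier in the block).

```cpp
#include <cstdint>

// Hypothetical model, not an LLVM API: one profile counter per basic block.
struct Block {
    std::uint64_t count;  // times this block ran during the training run
};

// Assumption: every instruction in a block executes each time the block
// is entered, so a call inside the block runs exactly as often as the
// block itself and needs no counter of its own.
constexpr std::uint64_t callCount(Block enclosing) {
    return enclosing.count;
}
```

Under that assumption, a call placed in a loop body that ran 1000 times is credited with 1000 executions, which is the information a dedicated call counter would have recorded.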

This seems to indicate that profile-based inlining is not supported yet (my LLVM/Clang is 3.9.0, mid-February trunk). Is this the case?

The talk at the 2013 LLVM Developers' Meeting (http://llvm.org/devmtg/2013-11/slides/Carruth-PGO.pdf) actually states that “the inliner doesn’t even know profile information exists today” (page 89), but it has been more than two years since that statement. Could you let me know the latest status of, or any ongoing work on, profile-based inlining (both same-module and cross-module)?

I believe it was added recently (last December, see http://reviews.llvm.org/D15245 ), so a trunk version should definitely have it.

You are right that the inliner does not yet know profile information at the callsite level (it only knows whether the callee is hot, and even the latter is not correct, as it is not updated at all).
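The difference between callee-level and callsite-level hotness can be sketched in C++ as follows. Everything here is hypothetical: `CallSiteProfile`, the threshold parameter, and the three functions are illustrations of the idea and do not correspond to the inliner's actual code. The last function shows one plausible form of the missing update, as an assumption about what "updated" would mean.

```cpp
#include <cstdint>

// Hypothetical profile data for one call site; not an LLVM data structure.
struct CallSiteProfile {
    std::uint64_t siteCount;         // how often this particular call ran
    std::uint64_t calleeEntryCount;  // how often the callee was entered overall
};

// Callee-level view: every call to a hot callee looks hot, even a call
// that itself almost never executes.
constexpr bool hotByCallee(CallSiteProfile p, std::uint64_t hotThreshold) {
    return p.calleeEntryCount >= hotThreshold;
}

// Callsite-level view: only the count of this specific call matters.
constexpr bool hotByCallSite(CallSiteProfile p, std::uint64_t hotThreshold) {
    return p.siteCount >= hotThreshold;
}

// Assumed form of an update: once a call site is inlined, the out-of-line
// callee is entered that many fewer times. Without such an update, the
// callee keeps looking hot to every remaining caller.
constexpr std::uint64_t entryCountAfterInlining(CallSiteProfile p) {
    return p.calleeEntryCount >= p.siteCount
               ? p.calleeEntryCount - p.siteCount
               : 0;
}
```

For example, a call site that ran once to a callee entered a million times is hot in the callee-level view but cold in the callsite-level view, for any threshold between those two values.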

We are working on this. A patch to enable profile use and profile updates in the inliner was committed recently, but it was temporarily reverted for further discussion. We will announce it once the feature is in trunk.

thanks,

David

I believe it was added recently (last December, see
http://reviews.llvm.org/D15245, "Use the inlinehint-threshold for hot
callees"), so a trunk version should definitely have it.

A more recent patch that does the whole thing (including incremental
updates) is under review.

David