Indirect call site profiling

Hi All,

We've been working on enhancing LLVM's instrumentation based profiling by
adding indirect call target profiling support. Our goal is to add
instrumentation around indirect call sites, so that we may track the
frequently taken target addresses and their call frequencies.
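
To make the idea concrete, here is a minimal sketch in C of what per-call-site target counting might look like. It is not the implementation discussed in this thread: the table layout, the fixed limits, and every name (`site_profile`, `profile_indirect_call`, `call_instrumented`, `hottest_target`) are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SITES 4   /* hypothetical: number of instrumented call sites */
#define MAX_TARGETS 2 /* hypothetical: distinct targets tracked per site */

typedef int (*handler_fn)(int);

typedef struct {
    handler_fn target; /* observed callee address */
    uint64_t count;    /* how often that callee was observed */
} target_counter;

/* One small counter table per call site; zero-initialized. */
static target_counter site_profile[NUM_SITES][MAX_TARGETS];

/* Record one observed target at a call site. Once a site's table is
 * full, targets that are not already tracked are silently dropped. */
static void profile_indirect_call(size_t site, handler_fn target) {
    assert(site < NUM_SITES);
    for (size_t i = 0; i < MAX_TARGETS; ++i) {
        target_counter *slot = &site_profile[site][i];
        if (slot->target == target || slot->target == NULL) {
            slot->target = target;
            slot->count++;
            return;
        }
    }
}

/* An instrumented indirect call: record the target, then call it. */
static int call_instrumented(size_t site, handler_fn fn, int arg) {
    profile_indirect_call(site, fn);
    return fn(arg);
}

/* Return the most frequently observed target at a site (NULL if the
 * site was never reached) and store its count in *count_out. */
static handler_fn hottest_target(size_t site, uint64_t *count_out) {
    assert(site < NUM_SITES);
    handler_fn best = NULL;
    uint64_t best_count = 0;
    for (size_t i = 0; i < MAX_TARGETS; ++i) {
        if (site_profile[site][i].count > best_count) {
            best = site_profile[site][i].target;
            best_count = site_profile[site][i].count;
        }
    }
    if (count_out != NULL) {
        *count_out = best_count;
    }
    return best;
}

/* Two example callees used through a function pointer. */
static int add_one(int x) { return x + 1; }
static int twice(int x) { return x * 2; }
```

A real profiler would also have to map addresses back to function names and write the counters out at exit; the sketch only shows the counting step.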

The acquired data has uses in optimization of indirect function call
heavy applications. Our initial findings show that using the profile data
in optimizations would help improve the performance of some of the SPEC
benchmarks notably. We have a proof of concept implementation, which we
plan to put up for review. Before that, I’d like to ask whether there are
any plans or ongoing work in the community to enable indirect call target
profiling support. Please let me know if cfe-dev is a better list for
posting PGO related emails.

Thanks,
-Betul Buyukkurt

Employee of the Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project

> Hi All,
>
> We've been working on enhancing LLVM's instrumentation based profiling by
> adding indirect call target profiling support. Our goal is to add
> instrumentation around indirect call sites, so that we may track the
> frequently taken target addresses and their call frequencies.
>
> The acquired data has uses in optimization of indirect function call
> heavy applications. Our initial findings show that using the profile data
> in optimizations would help improve the performance of some of the spec
> benchmarks notably.

Can you quantify "notably?" Also, do you profile on one set of inputs and then test the optimization on another set of inputs (e.g., the test and train runs)?

> We have a proof of concept implementation, which we
> plan to put it up for review. However, I’d like to inquire prior if there
> are any plans or ongoing work done in the community to enable indirect
> call target profiling support or not. Please inform if cfe-dev is a better
> candidate for posting PGO related emails.

Interesting. I did not think SPEC had many programs with a lot of indirect function calls.

It would be interesting to see what your optimization would do on an operating system kernel like FreeBSD or Linux. The VFS (file system) layer uses function pointers a lot, but I'm not sure if it's the dominant overhead.

Have you tried on C++ programs? They should be making heavy use of indirect function calls as well.

If you make your software public, please let me know. Adapting your work for kernel execution and trying it out on a kernel might be a nice project for one of our students.

Regards,

John Criswell

>> Hi All,
>>
>> We've been working on enhancing LLVM's instrumentation based profiling by
>> adding indirect call target profiling support. Our goal is to add
>> instrumentation around indirect call sites, so that we may track the
>> frequently taken target addresses and their call frequencies.
>>
>> The acquired data has uses in optimization of indirect function call
>> heavy applications. Our initial findings show that using the profile data
>> in optimizations would help improve the performance of some of the spec
>> benchmarks notably.
>
> Can you quantify "notably?" Also, do you profile on one set of inputs
> and then test the optimization on another set of inputs (e.g., the test
> and train runs)?

I can't give numbers, but we do collect data from train runs.

>> We have a proof of concept implementation, which we
>> plan to put it up for review. However, I’d like to inquire prior if there
>> are any plans or ongoing work done in the community to enable indirect
>> call target profiling support or not. Please inform if cfe-dev is a better
>> candidate for posting PGO related emails.
>
> Interesting. I did not think SPEC had many programs with a lot of
> indirect function calls.

SPEC does have programs such as gcc, vortex and others which use indirect
calls. I'm planning to send an RFC on the feature soon. I'll follow it with
the patch for the profiler changes for clang, llvm and compiler-rt.

> Hi All,
>
> We've been working on enhancing LLVM's instrumentation based profiling by
> adding indirect call target profiling support. Our goal is to add
> instrumentation around indirect call sites, so that we may track the
> frequently taken target addresses and their call frequencies.

Just to make sure I understand what you're describing: you're doing value profiling specifically for the target address of an indirect call through a function pointer, right?

> The acquired data has uses in optimization of indirect function call
> heavy applications. Our initial findings show that using the profile data
> in optimizations would help improve the performance of some of the spec
> benchmarks notably. We have a proof of concept implementation, which we
> plan to put it up for review. However, I’d like to inquire prior if there
> are any plans or ongoing work done in the community to enable indirect
> call target profiling support or not. Please inform if cfe-dev is a better
> candidate for posting PGO related emails.

I'll be interested in seeing your work. I'm interested in techniques for guarded devirtualization, but the profiling infrastructure should be fairly common. If we could arrange the instrumentation in such a way to enable both use cases, we could share infrastructure.

Philip

>> Hi All,
>>
>> We've been working on enhancing LLVM's instrumentation based profiling by
>> adding indirect call target profiling support. Our goal is to add
>> instrumentation around indirect call sites, so that we may track the
>> frequently taken target addresses and their call frequencies.
>>
>> The acquired data has uses in optimization of indirect function call
>> heavy applications. Our initial findings show that using the profile data
>> in optimizations would help improve the performance of some of the spec
>> benchmarks notably.
>
> Can you quantify "notably?" Also, do you profile on one set of inputs
> and then test the optimization on another set of inputs (e.g., the test
> and train runs)?

I can't give numbers, but we do collect data from train runs.

>> We have a proof of concept implementation, which we
>> plan to put it up for review. However, I’d like to inquire prior if there
>> are any plans or ongoing work done in the community to enable indirect
>> call target profiling support or not. Please inform if cfe-dev is a better
>> candidate for posting PGO related emails.
>
> Interesting. I did not think SPEC had many programs with a lot of
> indirect function calls.

Spec does have programs such as gcc, vortex and others which use indirect
calls. I'm planning to have an RFC soon on the feature. I'll follow it w/
the patch for the profiler changes for clang, llvm and compiler-rt.

IIRC, gap is another C program that benefits from this.

David

AFAIK, povray in SPEC CPU 2006 gains if you do indirect function call promotion followed by inlining. However, this may require PGO to specialize the top few calls.
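
For readers unfamiliar with the term, indirect function call promotion can be sketched roughly as below. This is an illustration under stated assumptions, not code from any patch in this thread: it assumes a profile reported `square` as the dominant target of the call site, and all names (`op_fn`, `square`, `negate`, `apply_indirect`, `apply_promoted`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*op_fn)(int);

/* Two example callees reached through a function pointer. */
static int square(int x) { return x * x; }
static int negate(int x) { return -x; }

/* Before promotion: a plain indirect call. The callee is unknown at
 * compile time, so the call cannot be inlined. */
static int apply_indirect(op_fn fn, int x) {
    assert(fn != NULL);
    return fn(x);
}

/* After promotion: the (assumed) hot target gets a guarded direct call,
 * and every other target falls back to the original indirect call. */
static int apply_promoted(op_fn fn, int x) {
    assert(fn != NULL);
    if (fn == square) {
        return square(x); /* direct call: now a candidate for inlining */
    }
    return fn(x);         /* cold path: unchanged indirect call */
}
```

The guard keeps the transformation safe for any target, while the direct call on the hot path is what lets a later inlining pass do its work.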

povray probably needs both PGO and LTO.

David

>> Hi All,
>>
>> We've been working on enhancing LLVM's instrumentation based profiling by
>> adding indirect call target profiling support. Our goal is to add
>> instrumentation around indirect call sites, so that we may track the
>> frequently taken target addresses and their call frequencies.
>
> Just to make sure I understand what're you describing, you're doing
> value profiling specifically for the target address of an indirect call
> through a function pointer right?

We're recording the called values at the indirect call sites. In essence,
this can be considered a type of value profiling.

>> The acquired data has uses in optimization of indirect function call
>> heavy applications. Our initial findings show that using the profile data
>> in optimizations would help improve the performance of some of the spec
>> benchmarks notably. We have a proof of concept implementation, which we
>> plan to put it up for review. However, I’d like to inquire prior if there
>> are any plans or ongoing work done in the community to enable indirect
>> call target profiling support or not. Please inform if cfe-dev is a better
>> candidate for posting PGO related emails.
>
> I'll be interested in seeing your work. I'm interested in techniques
> for guarded devirtualization, but the profiling infrastructure should be
> fairly common. If we could arrange the instrumentation in such a way to
> enable both use cases, we could share infrastructure.

Thanks, we've recently posted an RFC on the design. The design is quite
straightforward and simple but the gains will be seen when the data is
used in optimizations.

A note on the profile: the sample PGO profiler tracks indirect function calls via sampling at the call sites. Currently this data is not used in optimizations, but it is captured in the sample profiles (lib/Transforms/Scalar/SampleProfile.cpp).

Diego.

We also have a custom profiling mechanism which collects similar data. It seems like moving forward with a common metadata format to exploit this would be interesting.

(I'm moving the rest of my comments to the proposal thread; I wanted this one in context.)

Philip

Absolutely. Currently, it's represented by class FunctionSamples (include/llvm/ProfileData/SampleProf.h). I'm not particularly attached to its encoding, so I'm open to any common representation we can use for other profile sources.

Diego.