LLVM AutoFDO status

With recent bug fixes and performance tuning, AutoFDO@llvm has reached a usable state. To evaluate performance, we built clang itself three ways, with -O3, -fprofile-use, and -fprofile-sample-use, and measured the speed of each resulting compiler.

clang built with -fprofile-use is ~20% faster than clang built with -O3
clang built with -fprofile-sample-use is ~10% faster than clang built with -O3

AutoFDO can therefore deliver about 50% of the FDO speedup for clang. The gap is mainly due to inaccurate or lost debug info, which is used to represent the profile. I am still tuning the performance to close the gap.
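For reference, the 50% figure is just the ratio of the two gains over the same -O3 baseline (~10% / ~20%). A minimal sketch of that arithmetic, assuming "X% faster" means baseline time divided by optimized time, minus one (the definition and both function names are assumptions for illustration, not something specified in this thread):

```cpp
// Relative speedup of an optimized build over a baseline build,
// e.g. 0.20 for "20% faster". Assumes both times are positive and
// measured on the same workload.
double speedup_over_baseline(double baseline_seconds, double optimized_seconds) {
    return baseline_seconds / optimized_seconds - 1.0;
}

// Fraction of the instrumentation-based FDO gain that AutoFDO recovers,
// with both gains taken relative to the same -O3 baseline.
double fraction_of_fdo_gain(double autofdo_gain, double fdo_gain) {
    return autofdo_gain / fdo_gain;
}
```

With the numbers above, fraction_of_fdo_gain(0.10, 0.20) gives 0.5, i.e. the 50% quoted.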

In the meantime, we encourage you to try it out. Bug reports and fixes are always welcome. For more information about how to generate an AutoFDO profile, please refer to https://github.com/google/autofdo

Cheers,
Dehao

Hi Dehao,

Do you have any specific bugs for “inaccurate/lost debug info”? I haven’t seen anything and I’m curious what you might be running into.

Thanks.

-eric

That said, this is great news! I’m ecstatic to see that the sample based FDO is doing well on llvm. :)

-eric

The lost info is mostly due to optimizations (for example, code
introduced by the optimizer, such as from strength reduction or
runtime condition checks). It is not that the base debug info
generation has anything wrong.

David
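To make the strength-reduction case concrete, here is a rough hand-written sketch. The second function approximates, at source level, what the optimizer might produce from the first; the function names and the exact shape of the transformed loop are illustrative assumptions, not actual compiler output.

```cpp
#include <cstddef>
#include <vector>

// Source form: the index i * stride is recomputed with a multiply on
// every iteration. Every operation here maps to a line the user wrote.
// Precondition: stride > 0.
long sum_strided(const std::vector<long>& a, std::size_t stride) {
    long total = 0;
    for (std::size_t i = 0; i * stride < a.size(); ++i) {
        total += a[i * stride];
    }
    return total;
}

// Approximation of the loop after strength reduction: the multiply is
// replaced by a running offset `off`. That variable and its increment
// are introduced by the optimizer and have no direct counterpart in the
// original source, so whatever line they get tagged with (if any) is a
// choice made by the pass. Samples that land on such instructions may
// then be attributed to a different line or not attributed at all.
// Precondition: stride > 0.
long sum_strided_reduced(const std::vector<long>& a, std::size_t stride) {
    long total = 0;
    for (std::size_t off = 0; off < a.size(); off += stride) {
        total += a[off];
    }
    return total;
}
```

Both functions compute the same sum; the difference that matters for a sample-based profile is only which instructions can be traced back to a source line.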

Aha! Excellent. Both good to hear and I look forward (and don’t look forward) to those bugs :)

-eric