RFC - A tool to convert profiles from external profilers

A few weeks ago, I announced the availability of a conversion tool that converts Linux Perf sample profiles to LLVM’s sample profiler (https://github.com/google/autofdo).

I have now ported this tool to the LLVM tree, so it can be made available as part of LLVM. I’ve got a working version, but I still need to massage the code to use LLVM’s own libraries (logging, flags, etc) and adapt it to LLVM’s coding guidelines.

I expect to have an initial patch ready in a few days. In the meantime, I would like to open it up for some bike shedding on how this tool will integrate with LLVM. I cannot guarantee that I’ll agree to address all feedback myself, but I do want to make sure that the major issues and direction are addressed.

The tool receives two inputs: a profile file and an ELF executable (built with -gmlt). It produces as output a profile in the format expected by -fprofile-sample-use.

The requirement on the input is that we must be able to map counts to specific source file locations. So, the profiling source must be some kind of sample-based instruction execution counter: anything that keeps track of how frequently a specific instruction has been executed.

Using the executable’s line table information, the tool maps instruction locations back to source locations.
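To make the mapping step concrete, here is a minimal sketch of the idea and not the tool's actual code. It assumes a hypothetical, already-decoded line table (the LINE_TABLE rows, END_ADDRESS, and the sample addresses are invented for illustration) and simply counts how many sampled instruction addresses fall on each source line.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical line table: (start_address, source_file, line), sorted by address.
# Each row covers the addresses from its start up to the next row's start.
LINE_TABLE = [
    (0x1000, "foo.c", 10),
    (0x1008, "foo.c", 11),
    (0x1010, "foo.c", 14),
    (0x1020, "bar.c", 3),
]
END_ADDRESS = 0x1030  # assumed first address past the last row
STARTS = [row[0] for row in LINE_TABLE]


def lookup(address):
    """Return (file, line) for an address, or None if it is outside the table."""
    if address < STARTS[0] or address >= END_ADDRESS:
        return None
    # Find the last row whose start address is <= the sampled address.
    index = bisect_right(STARTS, address) - 1
    _, source_file, line = LINE_TABLE[index]
    return (source_file, line)


def aggregate(sampled_addresses):
    """Count samples per source location, dropping unmapped addresses."""
    counts = Counter()
    for address in sampled_addresses:
        location = lookup(address)
        if location is not None:
            counts[location] += 1
    return counts


# Invented sample addresses; 0x2000 lies outside the table and is ignored.
samples = [0x1000, 0x1004, 0x1008, 0x1012, 0x1012, 0x1024, 0x2000]
profile = aggregate(samples)
```

A real implementation would read the line table from the executable's debug information rather than a hard-coded list, but the lookup has the same shape: a sorted address table plus a binary search.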

Currently, the native profile reader for Linux Perf and the writer for LLVM’s profile are part of the same tree. They both reside in llvm/tools/llvm-prof-converter.

I would like to support more than Linux Perf, eventually. So, I’m thinking that I want to move the LLVM profile writer to llvm/lib/ProfileData and only have the various readers under llvm/tools/llvm-prof-converter.

I don’t think I will be working on supporting anything other than Linux Perf for now. But if anyone is interested in supporting profilers in other platforms, please let me know. I want to make sure the implementation doesn’t tie itself too much to Linux Perf.

Thanks. Diego.

Diego Novillo <dnovillo@google.com> writes:

A few weeks ago, I announced the availability of a conversion tool that
converts Linux Perf sample profiles to LLVM's sample profiler
(https://github.com/google/autofdo).

I have now ported this tool to the LLVM tree, so it can be made available as
part of LLVM. I've got a working version, but I still need to massage the code
to use LLVM's own libraries (logging, flags, etc) and adapt it to LLVM's
coding guidelines.

I expect to have an initial patch ready in a few days. In the meantime, I
would like to open it up for some bike shedding on how this tool will
integrate with LLVM. I cannot guarantee that I'll agree to address all
feedback myself, but I do want to make sure that the major issues and
direction are addressed.

Sounds great! Thanks for working on this.

The tool receives two inputs: a profile file and an ELF executable (built with
-gmlt). It produces as output a profile in the format expected by
-fprofile-sample-use.

Do you have a plan in mind on how this will decide between input profile
file formats once there are options other than perf? The two obvious
approaches are autodetection/a format flag or having a different
profile-converter tool per format. I think I'm in favour of the former.

The requirement on the input is that we must be able to map counts to
specific source file locations. So, the profiling source must be some kind of
sample-based instruction execution counter. Anything that keeps track of how
frequently a specific instruction has been executed.

Using the executable's line table information, the tool maps instruction
locations back to source locations.

Currently, the native profile reader for Linux Perf and the writer for LLVM's
profile are part of the same tree. They both reside in
llvm/tools/llvm-prof-converter.

I would like to support more than Linux Perf, eventually. So, I'm thinking
that I want to move the LLVM profile writer to llvm/lib/ProfileData and only
have the various readers under llvm/tools/llvm-prof-converter.

This sounds like the right direction to me. Arguably, the readers could
go into lib/ProfileData as well, leaving only the tool logic under
llvm/tools. I think either split would be fine.

A few weeks ago, I announced the availability of a conversion tool that converts Linux Perf sample profiles to LLVM's sample profiler (https://github.com/google/autofdo).

This seems like a potentially useful tool. I support having this in tree.

I would like to support more than Linux Perf, eventually. So, I'm thinking that I want to move the LLVM profile writer to llvm/lib/ProfileData and only have the various readers under llvm/tools/llvm-prof-converter.

Given we now have multiple proposed users, having these libraries in some shared location seems reasonable. I would want both the readers & writers in the same location.

Glancing at the current code in ProfileData, I do not see any uses in LLVM itself. There is one checked-in use in the tools directory. I'm hesitant about putting support code for tools which is not otherwise used in the main llvm source tree. Is there a better place we could put this? (Or am I simply being too conservative?)

We do need to make a clear distinction between externally defined formats (with test cases! and spec references!) and internally defined formats which can be changed.

Philip

A few weeks ago, I announced the availability of a conversion tool that converts Linux Perf sample profiles to LLVM’s sample profiler (https://github.com/google/autofdo).

I have now ported this tool to the LLVM tree, so it can be made available as part of LLVM. I’ve got a working version, but I still need to massage the code to use LLVM’s own libraries (logging, flags, etc) and adapt it to LLVM’s coding guidelines.

I expect to have an initial patch ready in a few days. In the meantime, I would like to open it up for some bike shedding on how this tool will integrate with LLVM. I cannot guarantee that I’ll agree to address all feedback myself, but I do want to make sure that the major issues and direction are addressed.

The tool receives two inputs: a profile file and an ELF executable (built with -gmlt). It produces as output a profile in the format expected by -fprofile-sample-use.

The requirement on the input is that we must be able to map counts to specific source file locations. So, the profiling source must be some kind of sample-based instruction execution counter: anything that keeps track of how frequently a specific instruction has been executed.

Using the executable’s line table information, the tool maps instruction locations back to source locations.

Currently, the native profile reader for Linux Perf and the writer for LLVM’s profile are part of the same tree. They both reside in llvm/tools/llvm-prof-converter.

What do you think about invoking this via the existing “llvm-profdata” tool? It currently supports two commands: “show” and “merge”. We could add a new “convert” command.

Do you have a plan in mind on how this will decide between input profile
file formats once there are options other than perf? The two obvious
approaches are autodetection/a format flag or having a different
profile-converter tool per format. I think I'm in favour of the former.

Initially a format flag (--type={perf,text,...}). It's easier to
implement than autodetection, which can always be added later.
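As a rough illustration of the format-flag approach (a sketch only, not the proposed driver), the snippet below selects a reader from a --type option. The reader functions, the READERS table, and the choice of "perf" as the default are all assumptions made for this example; only the --type={perf,text,...} spelling comes from the message above.

```python
import argparse


def read_perf(path):
    """Placeholder for a Linux Perf reader (hypothetical)."""
    return f"perf profile from {path}"


def read_text(path):
    """Placeholder for a text-format reader (hypothetical)."""
    return f"text profile from {path}"


# One entry per supported input format; a new reader is added here.
READERS = {"perf": read_perf, "text": read_text}


def parse_args(argv):
    parser = argparse.ArgumentParser(prog="convert")
    # Defaulting to "perf" is an assumption of this sketch.
    parser.add_argument("--type", choices=sorted(READERS), default="perf")
    parser.add_argument("profile")
    return parser.parse_args(argv)


def load_profile(argv):
    """Pick the reader named by --type and run it on the input file."""
    args = parse_args(argv)
    return READERS[args.type](args.profile)
```

With this shape, autodetection could later be added as one more way of choosing a key in the table, without touching the readers themselves.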

This sounds like the right direction to me. Arguably, the readers could
go into lib/ProfileData as well, leaving only the tool logic under
llvm/tools. I think either split would be fine.

Yeah. I am splitting the tool, so the Perf reader goes into its own
library in llvm/lib/PerfReader (or somesuch) and the driver goes in
llvm/tools.

I quite liked Bob's idea of adding functionality to the existing
llvm-profdata driver. Seems natural to add a --convert flag to it.

Diego.

I had not thought of that, but I think it makes perfect sense. I'll
drop my current driver and move the logic into llvm-profdata. Thanks
for the suggestion.

Diego.