How to get loop hotness data across a test suite?

Hi everybody,

I’m trying to get loop hotness data across a suite (e.g. the llvm test-suite). Ideally,
this would be a list that for each loop would list how many times it was entered and what
was its iteration count (at least the latter). The closest thing I could come up with is:

  • clang -fprofile-instr-generate (without opts) to get a .profraw
  • Get the .profdata
  • Give that back to clang with -fprofile-instr-use and generate .ll
  • I note that here we get “branch_weights” stats, so if a branch is a back-edge,
    it basically gives us the iteration count. For example, check the bottom of this file: https://pastebin.com/ZnQqJdTN which was created with the procedure above.
  • Then, create a custom pass that goes through every loop and gathers this “branch_weights” data.
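
As a rough illustration of that last step, here is a minimal Python sketch that reads the textual .ll instead of running a real pass. It pairs each conditional branch carrying a !prof attachment with its "branch_weights" node and turns a (back-edge weight, exit weight) pair into an average iteration count per loop entry. The function names and the SAMPLE snippet are made up for illustration. The script cannot tell which successor is the back-edge, since that needs real loop information (e.g. LoopInfo inside a pass), and the weights may have been scaled by the frontend, so the absolute numbers should be treated as approximate.

```python
import re

# Matches metadata nodes such as: !29 = !{!"branch_weights", i32 1000, i32 10}
_WEIGHTS_RE = re.compile(
    r'^(!\d+)\s*=\s*!\{!"branch_weights"((?:\s*,\s*i(?:32|64)\s+\d+)+)\s*\}',
    re.MULTILINE,
)
# Matches conditional branches that carry a !prof attachment.
_BRANCH_RE = re.compile(
    r'br i1 [^,]+, label %([^\s,]+), label %([^\s,]+),.*?!prof (!\d+)'
)


def parse_branch_weights(ll_text):
    """Map each metadata id (e.g. '!29') to its tuple of branch weights."""
    table = {}
    for node_id, body in _WEIGHTS_RE.findall(ll_text):
        weights = re.findall(r'i(?:32|64)\s+(\d+)', body)
        table[node_id] = tuple(int(w) for w in weights)
    return table


def weighted_branches(ll_text):
    """Yield (true_label, false_label, weights) for each profiled branch."""
    table = parse_branch_weights(ll_text)
    for true_label, false_label, node_id in _BRANCH_RE.findall(ll_text):
        if node_id in table:
            yield true_label, false_label, table[node_id]


def estimated_trip_count(backedge_weight, exit_weight):
    """Average iterations per loop entry, or None if the exit was never taken."""
    if exit_weight == 0:
        return None
    return backedge_weight / exit_weight + 1


# Hypothetical IR fragment (not taken from the pastebin above): the latch
# branches back to for.body 1000 times and leaves the loop 10 times.
SAMPLE = '''
for.body:
  %cmp = icmp slt i32 %inc, %n
  br i1 %cmp, label %for.body, label %for.end, !prof !29

!29 = !{!"branch_weights", i32 1000, i32 10}
'''

if __name__ == "__main__":
    for true_label, false_label, weights in weighted_branches(SAMPLE):
        # Assumption: the first successor is the back-edge in this sample.
        print(true_label, false_label, estimated_trip_count(*weights))
```

The exit weight doubles as a rough estimate of how many times the loop was entered, under the same assumptions.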

I feel like this is both not very accurate and overly complicated for something that I
guess a lot of people probably have needed to gather in the past. Is there an easier
solution?

Thanks,
Stefanos

> Hi everybody,
>
> I’m trying to get loop hotness data across a suite (e.g. the llvm test-suite). Ideally,
> this would be a list that for each loop would list how many times it was entered and what
> was its iteration count (at least the latter). The closest thing I could come up with is:
>
>   • clang -fprofile-instr-generate (without opts) to get a .profraw
>   • Get the .profdata
>   • Give that back to clang with -fprofile-instr-use and generate .ll
>   • I note that here we get “branch_weights” stats, so if a branch is a back-edge,
>     it basically gives us the iteration count. For example, check the bottom of this file: https://pastebin.com/ZnQqJdTN which was created with the procedure above.
>   • Then, create a custom pass that goes through every loop and gathers this “branch_weights” data.

This seems to me like a sensible approach.

> Hi everybody,
>
> I’m trying to get loop hotness data across a suite (e.g. the llvm test-suite). Ideally,
> this would be a list that for each loop would list how many times it was entered and what
> was its iteration count (at least the latter). The closest thing I could come up with is:
>
>   • clang -fprofile-instr-generate (without opts) to get a .profraw
>   • Get the .profdata
>   • Give that back to clang with -fprofile-instr-use and generate .ll
>   • I note that here we get “branch_weights” stats, so if a branch is a back-edge,
>     it basically gives us the iteration count. For example, check the bottom of this file: https://pastebin.com/ZnQqJdTN which was created with the procedure above.
>   • Then, create a custom pass that goes through every loop and gathers this “branch_weights” data.

I second this approach.

> I feel like this is both not very accurate and overly complicated for something that I
> guess a lot of people probably have needed to gather in the past. Is there an easier
> solution?

I don’t think the PGO-based approach is very cumbersome, and it’s a lot more accurate than a purely static approach. In case you didn’t know, the LLVM test-suite has native support for building the different stages of a PGO program by toggling a few CMake variables.
If I remember correctly, these are the steps:

  1. Set the CMake variables TEST_SUITE_PROFILE_GENERATE=ON and TEST_SUITE_IR_PGO=ON
  2. Run ninja all, then llvm-lit -sv . to build the tests and collect profile data
  3. Change the variables in CMakeCache: TEST_SUITE_PROFILE_GENERATE=OFF, TEST_SUITE_PROFILE_USE=ON
  4. Run ninja all again
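
The steps above could be scripted roughly as follows. This is only a sketch: it assumes the build directory has already been configured once against the test-suite sources (compilers, source path and so on), that cmake, ninja and llvm-lit are on PATH, and that re-running cmake with -D options is an acceptable substitute for editing CMakeCache by hand. The function names are made up for illustration; the variable names are the ones recalled above and are not verified here.

```python
import subprocess


def pgo_stage_commands():
    """Return the command lines for both PGO stages, in order.

    Every command is meant to run from inside the build directory,
    which is why cmake is pointed at ".".
    """
    generate_stage = [
        ["cmake", "-DTEST_SUITE_PROFILE_GENERATE=ON",
         "-DTEST_SUITE_IR_PGO=ON", "."],
        ["ninja", "all"],
        ["llvm-lit", "-sv", "."],  # runs the tests, producing profile data
    ]
    use_stage = [
        ["cmake", "-DTEST_SUITE_PROFILE_GENERATE=OFF",
         "-DTEST_SUITE_PROFILE_USE=ON", "."],
        ["ninja", "all"],
    ]
    return generate_stage + use_stage


def run_pgo_stages(build_dir):
    """Run every stage inside build_dir, stopping at the first failure."""
    for argv in pgo_stage_commands():
        subprocess.run(argv, cwd=build_dir, check=True)
```

Calling run_pgo_stages("/path/to/test-suite-build") (a placeholder path) would execute the five commands in sequence.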

Thanks to both for your answers!

@Min-Yih Hsu

Thanks for your suggestion. TBH though, the cumbersome part is not creating the .ll files with metadata from PGO; it is interpreting those statistics, which is the last step
I described, i.e. creating the pass. It’s not that time-consuming, but I would have assumed that a lot of people have needed loop hotness statistics in the past and that
there would be a more automatic way.

Best,
Stefanos

On Thu, Oct 1, 2020 at 8:11 PM, Min-Yih Hsu <minyihh@uci.edu> wrote: