FYI: Planning to remove ProfileInfo and related passes from LLVM

Hello folks,

I’d like to remove all of the old and defunct profile info passes from LLVM. These have been almost entirely supplanted by the BranchProbability and BlockFrequency systems, which are actually on by default, and in use in optimization passes.

The old system is not on, and hasn’t been touched in years except to do minor build fixes and updates.

As far as I’m aware, the only thing the old system supported which the new one does not is loading profile data. However, it didn’t support doing anything useful with that data once loaded, and has seen essentially zero testing on mainline over the past two years, so I think it’s time to let go of this pile of code.

I’m planning to nuke it right away and I can resurrect it if someone actually steps forward with a use case (and an offer to maintain and support it, preferably integrating it w/ the above two systems).
-Chandler

Sounds fine to me. Please remove profile.pl if it is still around, and update any docs that reference it. Thanks!

-Chris

+1 for ripping out ProfileInfo; it is obsoleted by BranchProbability/BlockFrequency. Even the profile info loader can go away: it would need a rewrite for the new infrastructure and isn't a big piece of code.

I'm not sure about the runtime parts in libprofile and the edge profiling instrumentation. Those could be still useful if someone wires them up to the newer infrastructure.

- Ben

Hi Chandler,

I'm a GSoC student working on profiling support (mentor CC'ed). I'm no stranger to the issues with the current system: my original proposal was written without knowledge of the limitations. This is why this list hasn't heard much from me yet.

I would like to continue working on profiling support but I'm not attached to ProfileInfo and wouldn't be distraught if it gets removed. I'd rather work on a useful solution than a dead one and nothing I've done so far couldn't be ported to a different interface. I know some people are definitely interested in profiling support.

My reason for doing this GSoC project was to gain experience with LLVM and that aim doesn't disappear after one summer's worth of work. My personal plan was to keep on working on profiling support beyond the end date, long-term (though not full time). I do compiler research, but not currently with LLVM -- so this is a genuine desire and plan.

Perhaps the best way to proceed is to remove ProfileInfo etc and then I can work on re-adding/re-writing it with support for BranchProbability and BlockFrequency. Then I can maintain what I add. (As an unknown figure offering to maintain the existing code seems a bit hollow.)

Naturally, any changes to my GSoC plans need discussion with my mentor.

A few specifics below.

> Hello folks,
>
> I'd like to remove all of the old and defunct profile info passes from
> LLVM. These have been almost entirely supplanted by the
> BranchProbability and BlockFrequency systems, which are actually on by
> default, and in use in optimization passes.
>
> The old system is not on, and hasn't been touched in years except to do
> minor build fixes and updates.
>
> As far as I'm aware, the only thing the old system supported which the
> new one does not is loading profile data. However, it didn't support
> doing anything useful with that data once loaded,

There is also the llvm-prof tool, but yes: there is only one optimization pass that uses profiling data and it is not on by default. The hindrance to using ProfileInfo in other passes is that almost no passes preserve it and it cannot be recalculated (just estimated by ProfileEstimate).

As far as I can tell BranchProbability and BlockFrequency would also be invalidated by any CFG altering passes. Preserving ProfileInfo was my primary long-term (post-GSoC) task, preserving BranchProbability and BlockFrequency instead should be no harder.

> and has seen
> essentially zero testing on mainline over the past two years, so I think
> it's time to let go of this pile of code.

I've written a TEST.profile.Makefile for test-suite to test profiling. As long as the instrumentation and loading passes are carefully placed the existing profiling support works (i.e. profiling data can be generated and then loaded).

> I'm planning to nuke it right away and I can resurrect it if someone
> actually steps forward with a use case (and an offer to maintain and
> support it, preferably integrating it w/ the above two systems).
> -Chandler

Regards,
Alastair.

Hi Chandler,

> I'm a GSoC student working on profiling support (mentor CC'ed). I'm no
> stranger to the issues with the current system: my original proposal was
> written without knowledge of the limitations. This is why this list
> hasn't heard much from me yet.

It would be better to actually email the list with the problems you are
hitting sooner. We would have directed you to the BranchProbabilityInfo and
BlockFrequencyInfo infrastructure that is now actively in use by several
optimization passes in LLVM.

Can you start a thread with what your specific goals are? The only email I
have from you is from May 1st, which was your project outline for the GSoC
application, to which Evan responded, but to which nothing else ever
materialized.

I think if any of us knew that you were actively working on the existing
ProfileInfo and profile loading infrastructure we would have jumped in
quickly to redirect things. I'll provide a bit of feedback below, but we
should really hold the main part of this discussion on a separate thread,
as ultimately *how* you can use the BPI and BFI analyses and the metadata
they are predicated on is independent from whether we remove the existing
code... The important thing is that you can use them, and that the existing
code won't help much (if at all).

> I would like to continue working on profiling support but I'm not
> attached to ProfileInfo and wouldn't be distraught if it gets removed.
> I'd rather work on a useful solution than a dead one and nothing I've
> done so far couldn't be ported to a different interface. I know some
> people are definitely interested in profiling support.

I am one of them. ;] I've been working on the BPI- and BFI-based
optimizations in LLVM.

> My reason for doing this GSoC project was to gain experience with LLVM
> and that aim doesn't disappear after one summer's worth of work. My
> personal plan was to keep on working on profiling support beyond the end
> date, long-term (though not full time). I do compiler research, but not
> currently with LLVM -- so this is a genuine desire and plan.

The last thing I want is for GSoC work to come to nothing. I've been a GSoC
student, and so it's important to me that we give you useful things to work
on that the overarching LLVM community benefits from.

> Perhaps the best way to proceed is to remove ProfileInfo etc and then I
> can work on re-adding/re-writing it with support for BranchProbability
> and BlockFrequency. Then I can maintain what I add. (As an unknown
> figure offering to maintain the existing code seems a bit hollow.)
>
> Naturally, any changes to my GSoC plans need discussion with my mentor.

Yes, I think you and your mentor need to discuss this and to start engaging
with the LLVM community through the mailing list about the design of the
summer-of-code work, how it should be structured, etc. I think that the
community at large is extremely interested in improved PGO support in LLVM,
and so we'd be really interested in giving feedback and direction advice...

I'd like to hear an explicit OK from you and your mentor before I remove
anything, as I don't want to get in the way of any immediate progress or
work you're doing as part of GSoC.

> A few specifics below.

> > Hello folks,
> >
> > I'd like to remove all of the old and defunct profile info passes from
> > LLVM. These have been almost entirely supplanted by the
> > BranchProbability and BlockFrequency systems, which are actually on by
> > default, and in use in optimization passes.
> >
> > The old system is not on, and hasn't been touched in years except to do
> > minor build fixes and updates.
> >
> > As far as I'm aware, the only thing the old system supported which the
> > new one does not is loading profile data. However, it didn't support
> > doing anything useful with that data once loaded,

> There is also the llvm-prof tool, but yes: there is only one
> optimization pass that uses profiling data and it is not on by default.
> The hindrance to using ProfileInfo in other passes is that almost no
> passes preserve it and it cannot be recalculated (just estimated by
> ProfileEstimate).
>
> As far as I can tell BranchProbability and BlockFrequency would also be
> invalidated by any CFG altering passes. Preserving ProfileInfo was my
> primary long-term (post-GSoC) task, preserving BranchProbability and
> BlockFrequency instead should be no harder.

Details for another thread, but essentially, BranchProbabilityInfo is going
to "recompute" the probabilities, but only insofar as they stem from
heuristics that can be computed. If you look at the pass, there is also
support for loading branch weight metadata and using that to form
probability estimates. This metadata is attached to the IR and is the
primary mechanism for preserving information.

The overarching expected strategy is that a profile-loading pass would load
a profile and convert it to metadata annotations on the IR itself. Then
passes would be taught to preserve this metadata wherever possible, and
BranchProbabilityInfo will compute probabilities based on the metadata,
falling back to static heuristics only when the metadata is absent.
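As a rough illustration of that strategy (a minimal sketch, not the actual BranchProbabilityInfo code), the function below turns per-successor branch weights into probabilities and falls back to something else when no weights are attached. The function name is hypothetical, and the uniform split is only a placeholder for the real static heuristics, which are considerably more involved.

```python
from fractions import Fraction


def branch_probabilities(num_successors, weights=None):
    """Return one probability per successor edge of a branch.

    `weights` stands in for the branch weight metadata attached to the
    branch; pass None when the metadata is absent.
    """
    usable = (
        weights is not None
        and len(weights) == num_successors
        and sum(weights) > 0
    )
    if usable:
        # Metadata present: each probability is the weight's share of the total.
        total = sum(weights)
        return [Fraction(w, total) for w in weights]
    # Metadata absent or unusable: placeholder for static heuristics
    # (assumption: a uniform split, purely for illustration).
    return [Fraction(1, num_successors)] * num_successors


if __name__ == "__main__":
    print(branch_probabilities(2, [90, 10]))  # weights drive the result
    print(branch_probabilities(2))            # no metadata: fallback path
```

Exact fractions are used here only so the example avoids rounding questions; they say nothing about how the real analysis represents probabilities.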

The missing pieces are:
1) Ensuring that passes such as SimplifyCFG preserve as much branch weight
metadata as possible. This is already an issue for metadata that comes from
__builtin_expect() source code annotations, so it is something you can
write tests for today and observe problems.
2) A pass to load profile data and attach metadata to branch instructions
based on it.
3) Ensuring that #2 works with the profiles produced by an instrumented
binary.

It's important to note that #2 is easy. It's #1 and #3 that are really
hard. When I have talked to others about #3, there has been the feeling
that we would probably want to write a custom instrumentation pass that
would add instrumentation to the right LLVM IR formations, in order to make
sure that when it is read back in, the profile data matches up with the IR
correctly, and is available early enough in the optimization passes to be
used.
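To make #2 a little more concrete, here is a minimal sketch of the arithmetic such a loader might do: take the raw execution counts for the successor edges of one branch, scale them so they fit in 32-bit weights, and format them the way branch weight metadata is written in textual IR. The helper names and the scaling policy are assumptions for illustration; a real pass would attach the metadata through the LLVM C++ API rather than build strings.

```python
UINT32_MAX = 2**32 - 1


def scale_counts_to_weights(counts):
    """Scale raw edge counts so the largest one fits in 32 bits.

    Ratios are preserved approximately; edges that executed at least
    once keep a nonzero weight.
    """
    largest = max(counts, default=0)
    if largest <= UINT32_MAX:
        return list(counts)
    # Smallest integer divisor that brings the largest count under the cap.
    scale = largest // UINT32_MAX + 1
    return [max(1, c // scale) if c > 0 else 0 for c in counts]


def branch_weights_metadata(counts):
    """Format the scaled weights as a branch_weights metadata node."""
    operands = ", ".join(f"i32 {w}" for w in scale_counts_to_weights(counts))
    return '!{!"branch_weights", ' + operands + "}"


if __name__ == "__main__":
    # Hypothetical counts for the two successors of a conditional branch.
    print(branch_weights_metadata([90, 10]))
```

The sketch deliberately ignores the hard parts named above: matching counts back to the right branches (#3) and keeping the metadata alive through later passes (#1).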

Hi Chandler and Alastair,

I have been using profile.pl and the related passes and optimizations for about 4 years now. With every new release lately, the support for the profile scripts and their framework seemed to be degrading. Hence, I used my own tiny one-line fixes to keep them working. I offered to send these small patches to the LLVM dev list so that others could use them if they wanted. I think my email got lost at the time, in the sea of emails we see on the dev mailing list.

I must confess that I was not aware of the BPI and BFI infrastructures. The breaking of the profiling infrastructure always baffled me. Now it makes sense, since it has been superseded by these new frameworks.

Anyway, if you guys decide to keep the old profiling framework, it would be good. As Alastair has mentioned, the llvm-prof helps in a way for instrumenting the code with the profile data. Maybe this is also part of the BPI/BFI in which case it’s great.

Cheers,

Alok

PhD Candidate, NTU Singapore.

> Can you start a thread with what your specific goals are?

This is to follow.

> I'd like to hear an explicit OK from you and your mentor before I remove
> anything, as I don't want to get in the way of any immediate progress or
> work you're doing as part of GSoC.

This is an explicit OK for you to remove ProfileInfo and related code. My mentor and I agree that it is better to work on code that will be sticking around.

The existing EdgeProfiling.cpp and the runtime code may be worth saving (these require ProfilingUtils.* and ProfileInfoTypes.h respectively).

Without a profile loader they aren't useful (even llvm-prof is based on ProfileInfoLoader), but as they count edge frequency it intuitively seems like the profiling data they generate could be used by a rewritten BranchProbability profile loader.

One note: if I plan on submitting patches to restore rewritten parts, is destroying `svn blame` history a problem?

> The missing pieces are:
> 1) Ensuring that passes such as SimplifyCFG preserve as much branch
> weight metadata as possible. This is already an issue for metadata that
> comes from __builtin_expect() source code annotations, so it is
> something you can write tests for today and observe problems.
> 2) A pass to load profile data and attach metadata to branch
> instructions based on it.
> 3) Ensuring that #2 works with the profiles produced by an instrumented
> binary.
>
> It's important to note that #2 is easy. It's #1 and #3 that are really
> hard. When I have talked to others about #3, there has been the feeling
> that we would probably want to write a custom instrumentation pass that
> would add instrumentation to the right LLVM IR formations, in order to
> make sure that when it is read back in, the profile data matches up with
> the IR correctly, and is available early enough in the optimization
> passes to be used.

I understand that #1 is hard, but #3 isn't so clear to me. The current ProfileInfo based -insert-edge-profiling and -profile-loader passes can already do this. The instrumentation and loading passes have to occur at exactly the same point of their respective pass pipelines, but that is unavoidable.

Regards,
Alastair.

Hi Alok,

> I have been using profile.pl and the related passes and
> optimizations for about 4 years now. With every new release lately, the
> support for the profile scripts and their framework seemed to be
> degrading. Hence, I used my own tiny one-line fixes to keep them
> working. I offered to send these small patches to the LLVM dev list so
> that others could use them if they wanted. I think my email got lost at
> the time, in the sea of emails we see on the dev mailing list.

profile.pl hasn't seen a non-cleanup related commit in 5 years ... It seems so simple I'm not sure I see a need for it. But clearly you use it!

> I must confess that I was not aware of the BPI and BFI infrastructures.
> The breaking of the profiling infrastructure always baffled me. Now it
> makes sense, since it has been superseded by these new frameworks.
>
> Anyway, if you guys decide to keep the old profiling framework, it would
> be good. As Alastair has mentioned, the llvm-prof helps in a way for
> instrumenting the code with the profile data. Maybe this is also part of
> the BPI/BFI in which case it's great.

If a new profile loader based on BPI gets written then llvm-prof could be rewritten to use the BFI system without too much difficulty.

I found llvm-prof useful for examining the behaviour of profiling support within LLVM, but I'm curious about what you use it for. It seems quite limited for performance analysis.

Regards,
Alastair.

Hello Alastair,

Yeah, like I said, I was not aware of the new profile framework being developed. Interestingly, BPI and BFI didn't turn up in any searches either. Anyway, I will take a look at them and see how they differ from the existing tools.

Profile.pl is understandably a very simple script, but it does make it easier to see some preliminary profile results and identify the hot portions of a program which are suitable for hardware acceleration. llvm-prof also helps in the same way.

Cheers,
Alok

Hi Alok,

> Profile.pl is understandably a very simple script, but it does make
> it easier to see some preliminary profile results and identify the
> hot portions of a program which are suitable for hardware
> acceleration. llvm-prof also helps in the same way.

Ok, great -- using llvm-prof for detecting hot portions is the obvious use, but I just wanted to check.

I'll keep llvm-prof in mind when writing the new loader and update llvm-prof to use it. I also don't have an issue with profile.pl so I can update it as required, it is tiny after all. I'm not sure if others want it sticking around though.

Regards,
Alastair.

FYI, thanks for all the discussion and feedback!

It sounds like folks are generally OK with the old stuff being cleared out, so I plan to go ahead and do that. Specific concerns and the strategy for them being addressed:

  • Alastair’s GSoC project is moving to the new branch probability infrastructure and specifically branch weight metadata. It sounds like nuking the profile stuff won’t really even disrupt this work, so no need to worry there.

  • The llvm-prof, profile.pl, and related profiling tools may work for some, but not others. The only user is out-of-tree, and is already regularly broken by updates. My suggestion would be to not update, and/or contribute analogous tools for branch weight metadata representation. I’m happy to give advice or help out here a little if I can. As we’re already breaking these, I don’t think we should actually keep them in a half-dead zombie state. If they’re re-written, they should ideally be in C++ and better tested.

  • I’ll watch the lists and try to help out anyone impacted by this.

For the record, I’m not just trying to break stuff. I’m looking at deep changes to the pass manager, and the profiling passes don’t always use it cleanly at the moment. One fewer strange user will make the new design easier and better. The strangeness appears to have to do with legacy and lack of maintenance, nothing to do with the nature of profiling passes.

Thanks!
-Chandler