I'm a GSoC student working on profiling support (mentor CC'ed). I'm no
stranger to the issues with the current system: my original proposal was
written without knowledge of the limitations. This is why this list
hasn't heard much from me yet.
It would have been better to email the list sooner with the problems you
are hitting. We would have directed you to the BranchProbabilityInfo and
BlockFrequencyInfo infrastructure that is now actively in use by several
optimization passes in LLVM.
Can you start a thread with what your specific goals are? The only email I
have from you is from May 1st, which was your project outline for the GSoC
application, to which Evan responded, but to which nothing else ever
I think if any of us knew that you were actively working on the existing
ProfileInfo and profile loading infrastructure we would have jumped in
quickly to redirect things. I'll provide a bit of feedback below, but we
should really hold the main part of this discussion on a separate thread,
as ultimately *how* you can use the BPI and BFI analyses and the metadata
they are predicated on is independent from whether we remove the existing
code... The important thing is that you can use them, and that the existing
code won't help much (if at all).
I would like to continue working on profiling support but I'm not
attached to ProfileInfo and wouldn't be distraught if it gets removed.
I'd rather work on a useful solution than a dead one and nothing I've
done so far couldn't be ported to a different interface. I know some
people are definitely interested in profiling support.
I am one of them. ;] I've been working on the BPI- and BFI-based
optimizations in LLVM.
My reason for doing this GSoC project was to gain experience with LLVM
and that aim doesn't disappear after one summer's worth of work. My
personal plan was to keep on working on profiling support beyond the end
date, long-term (though not full time). I do compiler research, but not
currently with LLVM -- so this is a genuine desire and plan.
The last thing I want is for GSoC work to come to nothing. I've been a GSoC
student, and so it's important to me that we give you useful things to work
on that the overarching LLVM community benefits from.
Perhaps the best way to proceed is to remove ProfileInfo etc. and then I
can work on re-adding/re-writing it with support for BranchProbability
and BlockFrequency. Then I can maintain what I add. (As an unknown
figure, offering to maintain the existing code seems a bit hollow.)
Naturally, any changes to my GSoC plans need discussion with my mentor.
Yes, I think you and your mentor need to discuss this and to start engaging
with the LLVM community through the mailing list about the design of the
summer-of-code work, how it should be structured, etc. I think that the
community at large is extremely interested in improved PGO support in LLVM,
and so we'd be really interested in giving feedback and advice on direction.
I'd like to hear an explicit OK from you and your mentor before I remove
anything, as I don't want to get in the way of any immediate progress or
work you're doing as part of GSoC.
A few specifics below.
> Hello folks,
> I'd like to remove all of the old and defunct profile info passes from
> LLVM. These have been almost entirely supplanted by the
> BranchProbability and BlockFrequency systems, which are actually on by
> default, and in use in optimization passes.
> The old system is not on, and hasn't been touched in years except to do
> minor build fixes and updates.
> As far as I'm aware, the only thing the old system supported which the
> new one does not is loading profile data. However, it didn't support
> doing anything useful with that data once loaded,
There is also the llvm-prof tool, but yes: there is only one
optimization pass that uses profiling data and it is not on by default.
The hindrance to using ProfileInfo in other passes is that almost no
passes preserve it and it cannot be recalculated (just estimated by
As far as I can tell, BranchProbability and BlockFrequency would also be
invalidated by any CFG-altering passes. Preserving ProfileInfo was my
primary long-term (post-GSoC) task; preserving BranchProbability and
BlockFrequency instead should be no harder.
Details for another thread, but essentially, BranchProbabilityInfo is going
to "recompute" the probabilities, but only insofar as they stem from
heuristics that can be computed. If you look at the pass, there is also
support for loading branch weight metadata and using that to form
probability estimates. This metadata is attached to the IR and is the
primary mechanism for preserving information.
The overarching expected strategy is that a profile-loading pass would load
a profile and convert it to metadata annotations on the IR itself. Then
passes would be taught to preserve this metadata wherever possible, and
BranchProbabilityInfo will compute probabilities based on the metadata,
falling back to static heuristics only when the metadata is absent.
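
To make the metadata-first, heuristics-second idea concrete, here is a
minimal self-contained sketch. It does not use the LLVM API: the Branch
struct and the takenProbability function are hypothetical names made up for
illustration, and the "metadata" is modelled as an optional pair of weights
rather than real IR metadata. The assumption is simply that a probability is
a weight divided by the sum of the weights.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical stand-in for a two-way conditional branch. In LLVM the weights
// would live in metadata attached to the branch instruction; here they are a
// plain optional pair of (taken, not taken) weights.
struct Branch {
  std::optional<std::pair<std::uint32_t, std::uint32_t>> weights;
};

// Probability that the branch is taken. Uses the attached weights when they
// are present and non-zero; otherwise falls back to a caller-supplied static
// estimate (standing in for whatever the heuristics would produce).
double takenProbability(const Branch &b, double heuristicEstimate) {
  assert(heuristicEstimate >= 0.0 && heuristicEstimate <= 1.0);
  if (b.weights) {
    // Widen to 64 bits so the sum of two 32-bit weights cannot overflow.
    std::uint64_t taken = b.weights->first;
    std::uint64_t notTaken = b.weights->second;
    std::uint64_t total = taken + notTaken;
    if (total != 0)
      return static_cast<double>(taken) / static_cast<double>(total);
  }
  return heuristicEstimate;
}
```

In this model, a pass that drops the weights while rewriting a branch
silently turns a measured probability back into a guess, which is why
preserving the metadata matters.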
The missing pieces are:
1) Ensuring that passes such as SimplifyCFG preserve as much branch weight
metadata as possible. This is already an issue for metadata that comes from
__builtin_expect() source code annotations, so it is something you can
write tests for today and observe problems.
2) A pass to load profile data and attach metadata to branch instructions
based on it.
3) Ensuring that #2 works with the profiles produced by an instrumented
It's important to note that #2 is easy. It's #1 and #3 that are really
hard. When I have talked to others about #3, there has been the feeling
that we would probably want to write a custom instrumentation pass that
would add instrumentation to the right LLVM IR formations, in order to make
sure that when it is read back in, the profile data matches up with the IR
correctly, and is available early enough in the optimization passes to be