Static Profiling - GSoC 2009

Hello all,
  I would like to participate in this year's Google Summer of Code and
I am sending you a short description of my proposal. I have written
the formal proposal already and if someone is interested I can send
him the pdf.

  One of the open projects in the LLVM list is to enhance LLVM with
static profiling capabilities. LLVM already provides a unified
structure for writing pro file-guided transformations that utilizes
information harvested by a dynamic profiler. However, this framework
does not yet contain static pro ling capabilities. I think that static
profiling would be a valuable tool to many of the optimizations that
are already part of the LLVM framework, allowing them to focus on the
heavily executed program parts that are critical to performance.

  The proposal is to implement the static branch predictor described
by Wu et al. (1994) as a LLVM Function Pass. This pass will associate
to each path in the control flow graph of a program encoded in LLVM
intermediate representation a real number between zero and one that
denotes the probability that the path is taken during the program
execution. In order to determine the probability that a branch (br)
instruction is taken during execution, we will use a collection of
heuristics. Examples of heuristics include:
  - Edges that point to the head of a loop are taken with 95% of probability.
  - A comparison of a pointer with NULL fails with 80% of probability.
  - A comparison of an integer for less than zero will fail with 75%
of probability.
  A substantial part of the work will be to determine all the
heuristics that apply to the LLVM intermediate representation, and to
tune the probabilities associated to each comparison. Once the pass is
ready and working, ideally I would like to modify one of the analysis
that already exist in LLVM to use the profiling information. I would
be happy to hear which of the LLVM analysis you guys think is the
nicest candidate to be improved with static profiling.

  As for my background, I was recently accepted in the mastership
program of the Federal University of Minas Gerais (UFMG), Brazil. My
major research topic will be on code optimizations, mainly using LLVM
as an auxiliary framework for testing and debugging. I am part of a
team of four students currently working with code optimizations with
LLVM. I have been programming in C and C++ for more than five years
and I have strong compiler background, as this is my current field of
research. This project will be useful to my master's research and I
believe it will be also useful to the LLVM community.

Reference:
Youfeng Wu and James R. Larus. Static branch frequency and program
profile analysis. In MICRO 27: Proceedings of the 27th annual
international symposium on Microarchitecture. IEEE, 1994.

Hello all,
I would like to participate in this year's Google Summer of Code and
I am sending you a short description of my proposal. I have written
the formal proposal already and if someone is interested I can send
him the pdf.

This is a very interesting proposal, I encourage you to apply!

One of the open projects in the LLVM list is to enhance LLVM with
static profiling capabilities. LLVM already provides a unified
structure for writing pro file-guided transformations that utilizes
information harvested by a dynamic profiler. However, this framework
does not yet contain static pro ling capabilities. I think that static
profiling would be a valuable tool to many of the optimizations that
are already part of the LLVM framework, allowing them to focus on the
heavily executed program parts that are critical to performance.

Yes, the profile system in general hasn't gotten much love lately.

The proposal is to implement the static branch predictor described
by Wu et al. (1994) as a LLVM Function Pass. This pass will associate
to each path in the control flow graph of a program encoded in LLVM
intermediate representation a real number between zero and one that
denotes the probability that the path is taken during the program
execution. In order to determine the probability that a branch (br)
instruction is taken during execution, we will use a collection of
heuristics. Examples of heuristics include:
- Edges that point to the head of a loop are taken with 95% of probability.
- A comparison of a pointer with NULL fails with 80% of probability.
- A comparison of an integer for less than zero will fail with 75%
of probability.

Also: EH destination blocks should be predicted as almost never taken (1%?)

Another idea is that we could have the front-end lower __builtin_expect to drop an intrinsic in the true/false branches of "expected" branch that indicates hi/low probabilities, this could also use that info.

A substantial part of the work will be to determine all the
heuristics that apply to the LLVM intermediate representation, and to
tune the probabilities associated to each comparison. Once the pass is
ready and working, ideally I would like to modify one of the analysis
that already exist in LLVM to use the profiling information. I would
be happy to hear which of the LLVM analysis you guys think is the
nicest candidate to be improved with static profiling.

I think that it would be *very* important to have a client for this. Without that, it would be very difficult to show the value of any tuning you do.

The first step is probably a block layout pass. We already have a very simple one that might "just work".

The second step is probably the register allocator's spill weight heuristic. Right now I think it is something simple like 8^loop-depth plus information about # uses/defs. With profile info, we could do a better job at estimating the cost.

I'm not sure of another good client for this information, but this isn't something I have thought about deeply.

-Chris

hi,

The proposal is to implement the static branch predictor described
by Wu et al. (1994) as a LLVM Function Pass.

I'm advising a talented master student (Andreas Neustifter) at the Vienna University of Technology who's already doing the same thing. The rough goal of his work is to
a) cleanup the internal datastructures (currently it's basically a global map)
b) implement the same algorithm you're proposing for static branch prediction
c) use the algorithm from b) in combination with the spanning tree algorithm proposed in [Ball94] to reduce the overhead for precise instrumentation
d) utilize profile information in various optimization passes (see below)

he's already posted a rough plan for his endeavors to this mailing list before:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-February/020396.html

I would
be happy to hear which of the LLVM analysis you guys think is the
nicest candidate to be improved with static profiling.

There are certainly some opportunities for "high-level" optimizations, e.g., inlining, code layout or unrolling. However, we've been more interested in backend optimizations such as spilling and scheduling. I have a working patch that maintains profile information during the various codegen phases. Andreas will clean this up and contribute it together with his work if there is interest.

There's nothing wrong with two people working on similar things, but I highly suggest the two of you get together in order to avoid duplicated efforts. There is certainly plenty of work left for both of you :slight_smile: Andreas already started with the implementation, so you wouldn't have to start from scratch.