Reviewer for our Path Profiling Implementation

I am a student at the University of Alberta under the
supervision of José Nelson Amaral, and I have been working on
implementing path profiling into LLVM. I have completed my project
and would like to submit it.

We are looking for a reviewer for the path profiling implementation. We
have sent previous requests to the llvmdev list but have so far been
unsuccessful.

Please see the attached patch and associated documentation.

Adam Preuss

path-profiling.patch (122 KB)

ppdoc.pdf (79.5 KB)

As far as I know, none of LLVM's standard passes make use of any profiling information. If we are going to get any value from having profiling support in LLVM, so that it is worth the effort of maintaining that code, we need to figure out how we're going to use it.

In my opinion, it is essential that LLVM passes be able to query the profile information with a standard API, and that they also be able to update the profile information to account for changes to the code (e.g., when a block is duplicated, when a loop is unrolled, etc.). We need to have one API for those things regardless of whether the profile information came from __builtin_expect, simple heuristics, static analysis, or some kind of profile feedback.

The existing support for profile information does not include such an API, which is at least part of the reason why nothing uses it. Adding path profiling would only make things worse in that it would be yet another format of profiling information. Do you have any thoughts on how to provide an API for LLVM to use and manipulate the path profiling information, in ways that would also apply to other kinds of profile info?

I have not reviewed your patch in detail, but as you can tell, I have some concerns about whether it makes sense to include this in LLVM. I've never worked on a compiler with path profiling, and the kind of profiling APIs that I can think of would not work very well with path profiling. I don't think it makes sense to include this as a standard part of LLVM unless we can come up with a way to make it useful. If you have suggestions on how to do that, it would help.

Hi Adam,

Thanks for making the effort to contribute. I have seen some valid
feedback to the effect that we can't keep something in mainline unless
is it actively used or maintained. However, I'm willing to give you a
review so it can be checked in, with no guarantee that the code won't
be removed if it breaks, or replaced by a different profiler if we
ever adopt one for active use.

Thanks for the design doc. We'll be sure to get that checked into
llvm.org/pubs. A couple of design questions: Can you compare the
overhead of LLVM's edge profiler vs path profiler, both run time
(profile time) and compile time? How do you anticipate the path
profile to be used by the optimizer?

Regarding the implementation, a couple of style problems should be addressed:

- Tabs absolutely need to be removed:
http://llvm.org/docs/CodingStandards.html#scf_spacestabs

- I see a few violation of 80 cols. Not many, but too many for me
  to list here.

If you think any of my specific suggestions below are off-base, don't
follow them, just respond with some justification. Since you've
already done rigorous verification, and I'm not overly concerned with
optimality, most of my comments are style related.

Thank you for your suggestions on the patch. If the patch is committed, I would be willing to maintain it, though
I am not sure what is all involved or how I am made aware of changes that need to be made.

The technical report https://www.cs.ualberta.ca/system/files/tech_report/2010/PreussPathProfLLVM.pdf contains
my benchmarks relating to profiling overhead in LLVM.

Over the next week I will work through the patch and make the appropriate adjustments. I have some responses to your
comments below as well. Things that I haven’t commented on make sense and I will change them.

Adam

I am a student at the University of Alberta under the

supervision of José Nelson Amaral, and I have been working on

implementing path profiling into LLVM. I have completed my project

and would like to submit it.

We are looking for a reviewer for the path profiling implementation. We

have sent previous requests to the llvmdev list but have so far been

unsuccessful.

Please see the attached patch and associated documentation.

Adam Preuss

<path-profiling.patch>

<doc.pdf>

Hi Adam,

Thanks for making the effort to contribute. I have seen some valid
feedback to the effect that we can’t keep something in mainline unless
is it actively used or maintained. However, I’m willing to give you a
review so it can be checked in, with no guarantee that the code won’t
be removed if it breaks, or replaced by a different profiler if we
ever adopt one for active use.

Thanks for the design doc. We’ll be sure to get that checked into
llvm.org/pubs. A couple of design questions: Can you compare the
overhead of LLVM’s edge profiler vs path profiler, both run time
(profile time) and compile time? How do you anticipate the path
profile to be used by the optimizer?

Regarding the implementation, a couple of style problems should be addressed:

If you think any of my specific suggestions below are off-base, don’t
follow them, just respond with some justification. Since you’ve
already done rigorous verification, and I’m not overly concerned with
optimality, most of my comments are style related.


+static cl::optstd::string
+EdgeProfileFilename(“path-profile-verifier-file”,

  • cl::init(“edgefrompath.llvmprof.out”),
  • cl::value_desc(“filename”),
  • cl::desc(“Edge profile file generated by -path-profile-verifier”));

    +// command line option for loading path profiles
    +static cl::optstd::string
    +PathProfileInfoFilename(“path-profile-loader-file”, cl::init(“llvmprof.out”),
  • cl::value_desc(“filename”),
  • cl::desc(“Path profile file loaded by -path-profile-loader”));

    +// Are we enabling early termination
    +static cl::opt ProcessEarlyTermination(“process-early-termination”,
  • cl::desc("In path profiling, insert extra instrumentation to account for "
  • “unexpected function termination.”));

    +// Should we print the dot-graphs
    +static cl::opt DotPathDag(“dot-pathdag”,
  • cl::desc(“Output the path profiling DAG for each function.”));

Please add cl::Hidden to all new flags, and give them all
a “path-profile” prefix so they aren’t scattered in the hidden help text.

+++ include/llvm/Analysis/ProfileInfoTypes.h

+#define PP_ARRAY 0
+#define PP_HASH 1

Why not an enum?

+typedef struct {

  • unsigned fnNumber; /* function number for these counters */
  • unsigned numEntries; /* number of entries stored */
    +} PathHeader;

+typedef struct {

  • unsigned pathNumber;
  • unsigned pathCounter;
    +} PathTableEntry;

Being global types, they need a PathProfiling-specific name or prefix
(there are other kinds of paths). To be proper, should they also be under:
#if defined(__cplusplus)
extern “C” {
endif

You may even want to remove the emacs hint “-- C++ --” from both
this header and Profiling.h

+++ runtime/libprofile/Profiling.h

+#include <stdint.h>
+#include <unistd.h>

Include these only in the .cpp files in which they’re needed. Include
them after any non-system headers.

+typedef int PType; /* type for storing enum ProfilingType on disk */

I realize PType is currently only included within libprofile, but the
general guideline is to make type names in the global namespace
descriptive. Such as ProfileInfoStorageType.

I’m not sure why you made it a global type. Was it for use by the
profile loader? If so, I think it would need to be in
ProfileInfoTypes.h under extern “C”.

+++ runtime/libprofile/CommonProfiling.c

+static int OutFile = -1;

I’m not sure why OutFile cannot remain a local static within
getOutFile.

  • int res;
  • res = write(outFile, &PTy, sizeof( Profile));

Some builds may warn of the unused result. If you’re not checking the
result, you probably shouldn’t retrieve it.

+++ runtime/libprofile/PathProfiling.c

  • inline uint32_t* getPathCounter(uint32_t functionNumber, uint32_t pathNumber) {
  • if( ((ftEntry_t*)ft)[functionNumber-1].array == 0)
  • ((ftEntry_t*)ft)[functionNumber-1].array =
    calloc(sizeof(pathHashTable_t), 1);

It’s not immediately obvious to me why you always cast “ft”.

I think there used to be a reason, but I don’t see a reason to any more.

Since this is a runtime lib, it may be wise to add array bounds checks.

I can, but a path number will never be larger than the array size so I don’t think it’s necessary.

In getPathCounter can you check that type == PP_HASH and size == 0
before writing to the data structure?

+++ lib/Transforms/Instrumentation/ProfilingUtils.h (working copy)

+#include “llvm/DerivedTypes.h”

Add the include only in the .cpp files that need it.

+++ include/llvm/Analysis/PathNumbering.h

+namespace llvm {

  • class BallLarusNode;

I don’t think you should indent large namespaces bodies. Same goes for
the rest of the files.

+++ lib/Analysis/PathNumbering.cpp

The constructor and accessors are trivial and can be defined inline.

  • void BallLarusDag::calculatePathNumbersFrom(BallLarusNode* node) {
  • BallLarusEdge* currEdge = *succ;
  • currEdge->setWeight(sumPaths);
  • succNode = currEdge->getTarget();
  • unsigned succPaths = succNode->getNumberPaths();
  • isReady = isReady && (succPaths != 0);

If a successor is not finished, can you early return here instead of
prematurely setting the edge weights? Better yet, keep a count
of the remaining successors, then only call calculatePathNumberFrom()
once per node after all successors are visited? You already visiting
each pred edge after finishing a node, just decrement its remaining
count at that time.

Yes, I don’t see why not.

+++ lib/Transforms/Instrumentation/PathProfiling.cpp

  • void PathProfiler::insertInstrumentationStartingAt(BLInstrumentationEdge*

Does this need to be recursive? If you simply visit the edges in dfs
order (creation order?) would that be sufficient? I don’t think you
even need to create RPO order, since phis can be lazily prepared.

I think it has to traverse in the current method, otherwise certain pointers to
path counters, phi nodes, etc will not be initialized properly in my representation
of the graph. I will it over more closely though.

At this point, I don’t think the patch can be considered for merging into mainline LLVM. However, it may be suitable as a LLVM project. Would that work?

Evan

[Cc'd to a bunch of people who have in the past expressed interest in
profiling with LLVM.]

As far as I know, none of LLVM's standard passes make use of any profiling
information. If we are going to get any value from having profiling
support in LLVM, so that it is worth the effort of maintaining that code,
we need to figure out how we're going to use it.

Right. That's a lot of grunt work, and no one person or group will probably
want to do it, especially if the results will not be very visible to others.

For this reason, our group in Vienna would like to propose a bundling of
various people's efforts in profiling. In particular, we would be willing to
host an LLVM branch that is open to profiling-related contributions such as:
- various kinds of profilers
- patches to preserve/update profiling information across existing passes
- code that improves existing passes based on profiling info
- code that communicates profiling data to the backend (to get MBB profile
  data, profile-based spill weights etc.)

The idea is to be more open to such submissions than mainline LLVM is, so
that we can experiment with and compare different approaches. Hopefully,
some parts of this would converge to a state that would then deemed
acceptable for mainline LLVM. Even if not, we would have a common codebase
in which the more boring parts (such as preserving info across simple
passes) are solved once and for all.

In my opinion, it is essential that LLVM passes be able to query the profile information with a standard API, and that they also be able to update the profile information to account for changes to the code (e.g., when a block is duplicated, when a loop is unrolled, etc.). We need to have one API for those things regardless of whether the profile information came from __builtin_expect, simple heuristics, static analysis, or some kind of profile feedback.

Absolutely. Bringing together people with different approaches to profiling
should hopefully help us get a good picture about what this standard API
could look like; and then also provide a number of profilers and clients
conforming to that API.

To get some idea about the number of people who could be involved in a
project like this, I would like to ask for a quick show of hands: Who would
be interested in contributing code to LLVM-with-profiling? (Either actual
profiling code, or passes that use profiling information.)
Who would want to use the branch, even without contributing?

[Cc'd to a bunch of people who have in the past expressed interest in
profiling with LLVM.]

As far as I know, none of LLVM's standard passes make use of any profiling
information. If we are going to get any value from having profiling
support in LLVM, so that it is worth the effort of maintaining that code,
we need to figure out how we're going to use it.

Right. That's a lot of grunt work, and no one person or group will probably
want to do it, especially if the results will not be very visible to others.

For this reason, our group in Vienna would like to propose a bundling of
various people's efforts in profiling. In particular, we would be willing to
host an LLVM branch that is open to profiling-related contributions such as:
- various kinds of profilers
- patches to preserve/update profiling information across existing passes
- code that improves existing passes based on profiling info
- code that communicates profiling data to the backend (to get MBB profile
data, profile-based spill weights etc.)

The idea is to be more open to such submissions than mainline LLVM is, so
that we can experiment with and compare different approaches. Hopefully,
some parts of this would converge to a state that would then deemed
acceptable for mainline LLVM. Even if not, we would have a common codebase
in which the more boring parts (such as preserving info across simple
passes) are solved once and for all.

I would prefer to go the opposite direction: focus efforts on a single profiling infrastructure in mainline LLVM. You're not going to make much progress on things like preserving profile info across passes that transform the CFG if you have a branch that is even more open than mainline, which already has too many separate profiling formats. But, a branch may still be a good idea, and our efforts may be complementary. Things like the path profiling patches would fit well on your proposed branch.

In my opinion, it is essential that LLVM passes be able to query the profile information with a standard API, and that they also be able to update the profile information to account for changes to the code (e.g., when a block is duplicated, when a loop is unrolled, etc.). We need to have one API for those things regardless of whether the profile information came from __builtin_expect, simple heuristics, static analysis, or some kind of profile feedback.

Absolutely. Bringing together people with different approaches to profiling
should hopefully help us get a good picture about what this standard API
could look like; and then also provide a number of profilers and clients
conforming to that API.

I've got a proposal for what I'd like to do here. I'll send it out separately.

To get some idea about the number of people who could be involved in a
project like this, I would like to ask for a quick show of hands: Who would
be interested in contributing code to LLVM-with-profiling? (Either actual
profiling code, or passes that use profiling information.)
Who would want to use the branch, even without contributing?

I'd really prefer to work on trunk.

In the near term Adam's patch profiling should be checked into llvm/trunk. I think it's important that it live on the same tree as the current llvm profilers with which it's tightly integrated. As we work toward implementing Bob's profiling proposal, it will either need to be integrated with the new framework or it will be removed along with the other stale profilers. Even if it's removed from trunk, the original implementation will still be available in working condition from subversion. We should be able to maintain a reference to it in the latest source or docs.

-Andy

Andy:

This is a reasonable proposal to give developers interested a chance to try Adam’s implementation of path profiling in the short term. We ourselves will be experimenting with the use of this implementation in some code transformations to see if we observe any impact on performance.

I am interested in contributing to an LLVM-with-profiling effort, and agree with the unified API idea. Even if most passes right now do not make use of any kind of profiling, I believe this is mostly due to the fact that integrating transformation passes and profilers is not as smooth a process as it should be.

Right now I am working on implementing speculative PRE using Adam’s path profile, and hopefully by the end of the implementation I will have good suggestions on how to improve this integration.