[PATCH] & Question: Preserving ProfileInfo for backend.

Hi,

the second part of my work is to preserve the profiling information
through all the transformation passes and make it available to the
backend machinery.

Attached is an example patch on how I plan to preserve the information
for a given transformation pass.

And now comes the question: what's the best way to also attach the
profile info to the MachineBasicBlocks and MachineFunctions? I was thinking
of converting the ProfileInfo into a template and using it for both
BasicBlocks and MachineBasicBlocks.
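
To make the template idea concrete, here is a minimal sketch of a profile container parameterised over the block type. It is an illustration only, not the attached patch: `ToyBasicBlock`, `ToyMachineBasicBlock` and `ToyProfileInfo` are hypothetical stand-ins so the example compiles without LLVM, and the -1 sentinel for "no data" is an assumption of the sketch.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for llvm::BasicBlock and llvm::MachineBasicBlock.
struct ToyBasicBlock {};
struct ToyMachineBasicBlock {};

// Hypothetical profile container, parameterised over the block type so the
// same code can serve the IR CFG and the machine CFG.
template <typename BlockT>
class ToyProfileInfo {
  std::map<const BlockT *, double> Counts;

public:
  // Assumed sentinel meaning "no profile data for this block".
  static constexpr double MissingValue = -1.0;

  void setExecutionCount(const BlockT *BB, double Count) { Counts[BB] = Count; }

  double getExecutionCount(const BlockT *BB) const {
    auto It = Counts.find(BB);
    if (It == Counts.end())
      return MissingValue; // block was never profiled
    return It->second;
  }
};
```

With this shape, `ToyProfileInfo<ToyBasicBlock>` and `ToyProfileInfo<ToyMachineBasicBlock>` share one implementation but cannot be mixed up by accident, since a pointer to one block type will not be accepted by the other instantiation.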

And where is the best point to transfer this information from the
bytecode CFG to the machinecode CFG?

Thanks, Andi


llvm-r81204.preserve.profiling.info.patch (2.11 KB)

Hi,

the second part of my work is to preserve the profiling information
through all the transformation passes and make it available to the
backend machinery.

Attached is an example patch on how I plan to preserve the information
for a given transformation pass.

At a brief glance, this looks good. It would be helpful to override the
verifyAnalysis method from the Pass class to verify that the profiling
information has been kept current.

And now comes the question: what's the best way to also attach the
profile info to the MachineBasicBlocks and MachineFunctions? I was thinking
of converting the ProfileInfo into a template and using it for both
BasicBlocks and MachineBasicBlocks.

And where is the best point to transfer this information from the
bytecode CFG to the machinecode CFG?

The SelectionDAGBuild phase is the only place where the precise relationship
between the BasicBlock CFG and the MachineBasicBlock CFG is
known. It would also be possible to do the transfer in a later phase, though that
would require a fair amount of guesswork to determine how the two CFGs
correspond.

This does point out a limitation of doing instrumentation at the LLVM IR level;
it won't be able to cover branches inserted during CodeGen.
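
As a rough sketch of what such a transfer could look like, assume the lowering phase records which machine block it created for each IR block. The types and the `BlockMap` below are hypothetical stand-ins, not LLVM's data structures; the point is only that counts are copied along a known mapping, and that machine blocks with no IR counterpart end up without a count.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for llvm::BasicBlock and llvm::MachineBasicBlock.
struct ToyBasicBlock {};
struct ToyMachineBasicBlock {};

using IRCounts = std::map<const ToyBasicBlock *, double>;
using MachineCounts = std::map<const ToyMachineBasicBlock *, double>;
// Assumed to be filled in while the machine CFG is being built.
using BlockMap = std::map<const ToyBasicBlock *, const ToyMachineBasicBlock *>;

// Copy each IR block's count to the machine block created for it.
MachineCounts transferCounts(const IRCounts &IR, const BlockMap &BBToMBB) {
  MachineCounts Result;
  for (const auto &Entry : IR) {
    auto It = BBToMBB.find(Entry.first);
    if (It != BBToMBB.end())
      Result[It->second] = Entry.second;
    // IR blocks without a machine block are skipped; machine blocks that
    // were inserted during CodeGen never appear in Result at all.
  }
  return Result;
}
```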

Dan

Hi,

Does the current LLVM backend support reading in profile information
(without preserving across transformations)? An earlier poster

http://groups.google.com/group/llvm-dev/browse_thread/thread/4bd65dbe84394bb7

noted that accessing execution counts in a MachineFunction pass (using
the BasicBlock* corresponding to the respective MachineBasicBlock)
returned 0 for all blocks. Running llc with --debug-pass=Structure I
noticed that the NoProfileInfo pass was being executed. I tried
adding a ProfileLoaderPass in the addPreRegAlloc function of the X86
target machine to load the profile information but receive the
following runtime error when the pass manager attempts to add the
ProfileLoader pass:

llc: <path to llvm>/llvm/lib/VMCore/PassManager.cpp:1597: virtual void
llvm::ModulePass::assignPassManager(llvm::PMStack&,
llvm::PassManagerType): Assertion `!PMS.empty() && "Unable to find
appropriate Pass Manager"' failed.

I'm not very familiar with the inner workings of the pass manager
framework. Is there a simple fix that can allow existing profile
information to be loaded by backend passes? I realize that the
profile data would not be completely accurate, but as a first order
approximation it could be useful until the proper framework is
implemented?

Thanks!

Hi,

Shuguang Feng wrote:

Does the current LLVM backend support reading in profile information
(without preserving across transformations)? An earlier poster

Yes, it does.

http://groups.google.com/group/llvm-dev/browse_thread/thread/4bd65dbe84394bb7

noted that accessing execution counts in a MachineFunction pass (using
the BasicBlock* corresponding to the respective MachineBasicBlock)
returned 0 for all blocks. Running llc with --debug-pass=Structure I
noticed that the NoProfileInfo pass was being executed.

Yes, llc currently does not support the loading of profiles, but I
attach a patch that does that, can you try that please?

I tried
adding a ProfileLoaderPass in the addPreRegAlloc function of the X86
target machine to load the profile information but receive the
following runtime error when the pass manager attempts to add the
ProfileLoader pass:

llc: <path to llvm>/llvm/lib/VMCore/PassManager.cpp:1597: virtual void
llvm::ModulePass::assignPassManager(llvm::PMStack&,
llvm::PassManagerType): Assertion `!PMS.empty() && "Unable to find
appropriate Pass Manager"' failed.

I'm not very familiar with the inner workings of the pass manager
framework. Is there a simple fix that can allow existing profile
information to be loaded by backend passes? I realize that the
profile data would not be completely accurate, but as a first order
approximation it could be useful until the proper framework is
implemented?

Don't know about Passes in the backend, but this could be a problem of
a FunctionPassManager trying to use a ModulePass.

Andi

llvm-r81350.llc.profile.loader.patch (2.15 KB)

Thanks for such a rapid response!

Don't know about Passes in the backend, but this could be a problem of
a FunctionPassManager trying to use a ModulePass.

I manually applied the patch you provided for llc (I'm using the 2.5
release of LLVM not ToT) and it fixed my compilation error. When your
patch replaced the FunctionPassManager used by llc with a PassManager
the error went away.
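
The failure mode can be pictured with a toy model. This is not LLVM's PMStack, only a sketch under the assumption that a module-level pass needs a module-level manager somewhere on the manager stack, while a function-level pass can be hosted by either kind. All names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical kinds of manager and pass; not LLVM's enums.
enum class ToyLevel { Module, Function };

// Returns true if some manager on the stack can host a pass of PassLevel.
// Assumption of this sketch: a module-level manager can host function-level
// passes (by nesting a function-level manager), but not the other way round.
bool canHost(const std::vector<ToyLevel> &ManagerStack, ToyLevel PassLevel) {
  for (ToyLevel Manager : ManagerStack) {
    if (Manager == ToyLevel::Module)
      return true; // hosts module passes directly, function passes nested
    if (Manager == PassLevel)
      return true; // function manager hosting a function pass
  }
  return false; // corresponds to the "Unable to find appropriate Pass Manager" case
}
```

In this model, a stack holding only a function-level manager rejects a module-level loader pass, which matches the assertion going away once llc used a PassManager instead.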

Unfortunately, I'm still seeing execution counts of -1 when I print
them out in my MachineFunction pass. I access the profiling
information at each MachineBasicBlock with the following code, where
"bb" is a reference to the current MachineBasicBlock:

PI->getExecutionCount(bb.getBasicBlock())

I believe I've integrated all the ProfileInfo* files from ToT with my
LLVM-2.5 installation properly. The profiling code (and llvm-prof)
seems to be working since llvm-prof is generating/printing the
appropriate execution frequencies. Is there an obvious mistake that I
could be making? Since I've had to customize my current installation
of llvm I would like to avoid updating to the latest revision if
possible.

Thanks!

Hi,

Shuguang Feng wrote:

Thanks for such a rapid response!

Don't know about Passes in the backend, but this could be a problem of
a FunctionPassManager trying to use a ModulePass.

I manually applied the patch you provided for llc (I'm using the 2.5
release of LLVM not ToT) and it fixed my compilation error. When your
patch replaced the FunctionPassManager used by llc with a PassManager
the error went away.

Unfortunately, I'm still seeing execution counts of -1 when I print
them out in my MachineFunction pass. I access the profiling
information at each MachineBasicBlock with the following code, where
"bb" is a reference to the current MachineBasicBlock:

PI->getExecutionCount(bb.getBasicBlock())

What does "llc -debug-pass=Structure" say? Is the ProfileLoaderPass
really the last pass to touch the ProfileInfo before you are using it?

Also, does bb.getBasicBlock() always return a valid block
reference?

You are getting the PI by getAnalysis<ProfileInfo>() I presume? Is this
really the instance created by ProfileLoaderPass?

(I guess for the last two questions it's best to use gdb, are you
familiar with it?)

Andi

What does "llc -debug-pass=Structure" say? Is the ProfileLoaderPass
really the last pass to touch the ProfileInfo before you are using it?

Below is the sequence of passes that I see. Although the
NoProfileInfo pass is being run, it should be subsequently overridden
by ProfileInfoLoaderPass (LoaderPass) correct?

Target Data Layout
Create Garbage Collector Module Metadata
Basic Alias Analysis (default AA impl)
DWARF Information Writer
No Profile Information
Module Information
  ModulePass Manager
    Profiling information loader
    FunctionPass Manager
      Preliminary module verification
      Dominator Tree Construction
      Module Verifier
      Natural Loop Construction
      Canonicalize natural loops
      Scalar Evolution Analysis
      Loop Pass Manager
        Loop Strength Reduction
      Lower Garbage Collection Instructions
      Remove unreachable blocks from the CFG
      Optimize for code generation
      Insert stack protectors
      X86 DAG->DAG Instruction Selection
      X86 FP_REG_KILL inserter
      X86 Maximal Stack Alignment Calculator
      <MY PASS RUNS HERE>

Also, does bb.getBasicBlock() always return a valid block
reference?

Yes. I am printing bb and *bb.getBasicBlock() in order to compare the
contents of the IR in the BasicBlock and the target assembly in the
MachineBasicBlock.

You are getting the PI by getAnalysis<ProfileInfo>() I presume? Is this
really the instance created by ProfileLoaderPass?

Yes, I have "PI = &getAnalysis<ProfileInfo>()" in my code (modeled
after BasicBlockPlacement.cpp). However, when I run gdb the value of
the Pass* pointer returned by createProfileLoaderPass() does not match
the value of PI (of type ProfileInfo*) that I see inside my
MachineFunctionPass. The abbreviated output of gdb is found below:

Breakpoint 1, main (argc=11, argv=0xbfffd394) at <path to llvm>/tools/llc/llc.cpp:292
292 Pass* tmp = createProfileLoaderPass();
(gdb) p tmp
$1 = (class llvm::Pass *) 0x3573000
(gdb) c
Continuing.

Breakpoint 2, main (argc=11, argv=0xbfffd394) at <path to llvm>/tools/llc/llc.cpp:293
293 Passes.add(tmp);
(gdb) p tmp
$2 = (class llvm::Pass *) 0x8feeaf0

So the address of the ProfileLoaderPass should be 0x8feeaf0 correct?
But I see the following inside my own pass:

Breakpoint 3, MyCodeGenPass::runOnMachineFunction (this=0x90be200, MF=@0x90ca280) at <path to llvm>/lib/Target/X86/MyCodeGenPass.cpp:108
108 <random line of code after PI = &getAnalysis<ProfileInfo>() executes>
(gdb) p PI
$3 = (class llvm::ProfileInfo *) 0x90be438

(I guess for the last two questions it's best to use gdb, are you
familiar with it?)

I have a working knowledge :) but haven't used any bells and whistles.

Thanks for your help.

Shuguang Feng wrote:

What does "llc -debug-pass=Structure" say? Is the ProfileLoaderPass
really the last pass to touch the ProfileInfo before you are using it?

Below is the sequence of passes that I see. Although the
NoProfileInfo pass is being run, it should be subsequently overridden
by ProfileInfoLoaderPass (LoaderPass) correct?

Yes.

Target Data Layout
Create Garbage Collector Module Metadata
Basic Alias Analysis (default AA impl)
DWARF Information Writer
No Profile Information
Module Information
  ModulePass Manager
    Profiling information loader
    FunctionPass Manager
      Preliminary module verification
      Dominator Tree Construction
      Module Verifier
      Natural Loop Construction
      Canonicalize natural loops
      Scalar Evolution Analysis
      Loop Pass Manager
        Loop Strength Reduction
      Lower Garbage Collection Instructions
      Remove unreachable blocks from the CFG
      Optimize for code generation
      Insert stack protectors
      X86 DAG->DAG Instruction Selection
      X86 FP_REG_KILL inserter
      X86 Maximal Stack Alignment Calculator
      <MY PASS RUNS HERE>

Also, does bb.getBasicBlock() always return a valid block
reference?

Yes. I am printing bb and *bb.getBasicBlock() in order to compare the
contents of the IR in the BasicBlock and the target assembly in the
MachineBasicBlock.

You are getting the PI by getAnalysis<ProfileInfo>() I presume? Is this
really the instance created by ProfileLoaderPass?

Yes, I have "PI = &getAnalysis<ProfileInfo>()" in my code (modeled
after BasicBlockPlacement.cpp). However, when I run gdb the value of
the Pass* pointer returned by createProfileLoaderPass() does not match
the value of PI (of type ProfileInfo*) that I see inside my
MachineFunctionPass. The abbreviated output of gdb is found below:

Breakpoint 1, main (argc=11, argv=0xbfffd394) at <path to llvm>/tools/llc/llc.cpp:292
292 Pass* tmp = createProfileLoaderPass();
(gdb) p tmp
$1 = (class llvm::Pass *) 0x3573000
(gdb) c
Continuing.

Breakpoint 2, main (argc=11, argv=0xbfffd394) at <path to llvm>/tools/llc/llc.cpp:293
293 Passes.add(tmp);
(gdb) p tmp
$2 = (class llvm::Pass *) 0x8feeaf0

So the address of the ProfileLoaderPass should be 0x8feeaf0 correct?
But I see the following inside my own pass:

Breakpoint 3, MyCodeGenPass::runOnMachineFunction (this=0x90be200, MF=@0x90ca280) at <path to llvm>/lib/Target/X86/MyCodeGenPass.cpp:108
108 <random line of code after PI = &getAnalysis<ProfileInfo>() executes>
(gdb) p PI
$3 = (class llvm::ProfileInfo *) 0x90be438

I *guess* these two pointers should point to the same object, which could
explain why the ProfileInfo you are reading is not the expected one. Can
anyone from the list confirm this?

It *is* allowed to access ModulePass analysis information from a
FunctionPass?

Can you try to manually override the PI value in the
MyCodeGenPass::runOnMachineFunction() to the value seen in llc and then
access the ProfileInfo?

(I guess for the last two questions it's best to use gdb, are you
familiar with it?)

I have a working knowledge :) but haven't used any bells and whistles.

Worked fine enough!

Andi

It *is* allowed to access ModulePass analysis information from a
FunctionPass?

BasicBlockPlacement (a FunctionPass) also accesses the profile
information and I assumed it worked (but haven't independently
verified this).

Can you try to manually override the PI value in the
MyCodeGenPass::runOnMachineFunction() to the value seen in llc and then
access the ProfileInfo?

Good suggestion. Unfortunately the end result is that for some blocks
instead of seeing the sentinel value of "-1" I see other bogus
execution counts instead. For example, llvm-prof prints out the
following as the most frequently executed basic block:

## %% Frequency
  1. 4.80749% 18002906/3.74476e+08 inflate_stored() - bb20

but in my pass the frequency I see from PI->getExecutionCount
(bb.getBasicBlock()) for the exact same BasicBlock (bb20 from function
inflate_stored()) is 7.47821e-316. I verified that PI is indeed
pointing to the same object created in llc.cpp with the following gdb
trace:

Breakpoint 1, main (argc=11, argv=0xbfffd394) at <path to llvm>/tools/llc/llc.cpp:293
293 Passes.add(tmp);
(gdb) p tmp
$1 = (class llvm::Pass *) 0x8feeaf0
(gdb) c
Continuing.

Breakpoint 2, MyCodeGenPass::runOnMachineFunction (this=0x90be200, MF=@0x90ca280) at <path to llvm>/lib/Target/X86/MyCodeGenPass.cpp:100
100 <random line of code after executing PI = (ProfileInfo*) 0x8feeaf0>
(gdb) p PI
$2 = (class llvm::ProfileInfo *) 0x8feeaf0
(gdb) clear
Deleted breakpoint 2
(gdb) c
Continuing.

I will go back through my files and make sure I didn't do anything
silly when I merged the latest ProfileInfo* code with my LLVM-2.5
codebase.
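
One thing that may be worth ruling out here is a general C++ effect rather than a merge mistake. If the loader class derives from both a pass base class and the profile interface (an assumption about the class layout, not something shown in this thread), then a Pass* and a ProfileInfo* to the same object need not hold the same address, and forcing the Pass* address into a ProfileInfo* would read the wrong bytes. The sketch below uses hypothetical stand-in classes to show the effect.

```cpp
#include <cassert>

// Hypothetical stand-ins; not LLVM's Pass or ProfileInfo.
struct ToyPass {
  virtual ~ToyPass() = default;
  int PassID = 0;
};

struct ToyProfileInfo {
  virtual ~ToyProfileInfo() = default;
  double Count = 18002906.0;
};

// Assumed layout: the loader inherits from both bases.
struct ToyLoaderPass : ToyPass, ToyProfileInfo {};

// Safe conversion: go through the derived type so the compiler applies the
// base-subobject offset. Assumes P really points at a ToyLoaderPass.
ToyProfileInfo *asProfileInfo(ToyPass *P) {
  return static_cast<ToyLoaderPass *>(P);
}
```

In this sketch the two base pointers compare unequal as raw addresses even though they refer to the same object, so a numeric mismatch between the two gdb values would not by itself prove that a different instance was returned.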

I finally got a chance to sit down and stare at this again today.

From what I can tell the ProfileInfoLoaderPass (LoaderPass) is
executing properly. However, when I call &getAnalysis<ProfileInfo>()
I'm actually receiving a handle to the NoProfileInfo pass despite the
ordering of the passes that I see:

Target Data Layout
Create Garbage Collector Module Metadata
Basic Alias Analysis (default AA impl)
DWARF Information Writer
No Profile Information <---------- *This is being returned to me by getAnalysis<ProfileInfo>
Module Information
  ModulePass Manager
    Profiling information loader <---------- *This is what I want a handle to
    FunctionPass Manager
      Preliminary module verification
      Dominator Tree Construction
      Module Verifier
      Natural Loop Construction
      Canonicalize natural loops
      Scalar Evolution Analysis
      Loop Pass Manager
        Loop Strength Reduction
      Lower Garbage Collection Instructions
      Remove unreachable blocks from the CFG
      Optimize for code generation
      Insert stack protectors
      X86 DAG->DAG Instruction Selection
      X86 FP_REG_KILL inserter
      X86 Maximal Stack Alignment Calculator
      <MY PASS RUNS HERE>

I'm guessing that this happens because both LoaderPass and
NoProfileInfo are part of the same AnalysisGroup (ProfileInfo) and
NoProfileInfo pass was registered as the *default* implementation. I
couldn't find how to disable NoProfileInfo from running so I modified
the source code to make LoaderPass the default. This allowed me to
grab the right handle in my MachineFunction pass but also led me to
wonder 3 things:

1) Is my explanation for what was happening correct?
2) If so, what is the proper way to select between different
implementations of an AnalysisGroup?
3) I couldn't find anything in the source tree that explicitly called
createNoProfileInfoPass() so why is NoProfileInfo always being
executed? Does this have to do with it being an ImmutablePass?
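
For reference, the selection rule that everyone in this thread expected (an explicitly added implementation overrides the default one) can be written down as a toy model. This is only a sketch of that expectation with hypothetical names; it is not LLVM's resolution code and does not by itself explain why the default was returned.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one implementation of an analysis group.
struct ToyImpl {
  std::string Name;
  bool IsDefault;
};

// Expected rule: prefer any non-default implementation that was added;
// fall back to the default only when nothing else is available.
const ToyImpl *resolveGroup(const std::vector<ToyImpl> &Available) {
  const ToyImpl *Default = nullptr;
  for (const ToyImpl &Impl : Available) {
    if (!Impl.IsDefault)
      return &Impl;
    Default = &Impl;
  }
  return Default; // nullptr if the group has no implementation at all
}
```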

Thanks!