Hi Tobias,
Hi Star,
[...]
Before we write a patch, we should do some profiling to understand where
the overhead comes from. I propose you generate oggenc*16 or even
oggenc*32 to ensure we get to about 90% Polly-Detect overhead.

I would then run Polly under Linux 'perf', using 'perf record polly-opt
...' and then 'perf report'. If we are lucky, this points us exactly to
the function we spend all the time in.

Cheers,
Tobias

Thanks for your very useful suggestion.
I have profiled oggenc*16 and oggenc*32, and the results are as follows:

oggenc*16: polly-detect compile-time percentage is 71.3%. The top five functions reported by perf are:
48.97% opt opt [.] llvm::TypeFinder::run(llvm::Module const&, bool)
7.43% opt opt [.] llvm::TypeFinder::incorporateType(llvm::Type*)
7.36% opt opt [.] llvm::TypeFinder::incorporateValue(llvm::Value const*)
4.04% opt libc-2.17.so [.] 0x0000000000138bea
2.06% opt [kernel.kallsyms] [k] 0xffffffff81043e6a

oggenc*32: polly-detect compile-time percentage is 82.9%. The top five functions reported by perf are:
57.44% opt opt [.] llvm::TypeFinder::run(llvm::Module const&, bool)
11.51% opt opt [.] llvm::TypeFinder::incorporateType(llvm::Type*)
7.54% opt opt [.] llvm::TypeFinder::incorporateValue(llvm::Value const*)
2.66% opt libc-2.17.so [.] 0x0000000000138c02
2.26% opt opt [.] llvm::SlotTracker::processModule()

It is surprising that all of the compile time spent in TypeFinder is attributed to Polly-detect, yet I cannot find any call to TypeFinder in Polly-detect.
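
As a quick sanity check on the two reports above, the TypeFinder entries can be added up per run. This is only a sketch: the helper name `typefinder_share` and the regular expression are my own assumptions about the 'perf report' line layout shown here, and the input is simply the lines quoted above.

```python
import re

# Lines copied verbatim from the two perf reports quoted above.
PERF_16 = """\
48.97% opt opt [.] llvm::TypeFinder::run(llvm::Module const&, bool)
7.43% opt opt [.] llvm::TypeFinder::incorporateType(llvm::Type*)
7.36% opt opt [.] llvm::TypeFinder::incorporateValue(llvm::Value const*)
4.04% opt libc-2.17.so [.] 0x0000000000138bea
2.06% opt [kernel.kallsyms] [k] 0xffffffff81043e6a
"""

PERF_32 = """\
57.44% opt opt [.] llvm::TypeFinder::run(llvm::Module const&, bool)
11.51% opt opt [.] llvm::TypeFinder::incorporateType(llvm::Type*)
7.54% opt opt [.] llvm::TypeFinder::incorporateValue(llvm::Value const*)
2.66% opt libc-2.17.so [.] 0x0000000000138c02
2.26% opt opt [.] llvm::SlotTracker::processModule()
"""

# Assumed layout: "<percent>% <command> <object> [<kind>] <symbol>".
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+\S+\s+\S+\s+\[.\]\s+(.*)$")


def typefinder_share(report, needle="llvm::TypeFinder::"):
    """Sum the percentages of all symbols containing `needle`."""
    total = 0.0
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match and needle in match.group(2):
            total += float(match.group(1))
    return round(total, 2)


if __name__ == "__main__":
    print("oggenc*16:", typefinder_share(PERF_16))
    print("oggenc*32:", typefinder_share(PERF_32))
```

The three TypeFinder symbols add up to roughly 63.8% of the samples for oggenc*16 and 76.5% for oggenc*32. If the perf percentages and the polly-detect percentages are comparable, that would cover most of the 71.3% and 82.9% reported for polly-detect.
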
Yes, this does not seem very conclusive. We probably need a call graph
to see where those are called.

Did you try running 'perf record' with the '-g' option? This should give
you callgraph information, that should be very helpful to track down the
callers in Polly. Also, if you prefer a graphical view of the
results, you may want to have a look at Gprof2Dot [1]. Finally, if this
all does not work, just running Polly in gdb and randomly breaking a
couple of times (manual sampling), may possibly hint you to the right place.
I also tried perf with -g, but it reported nothing useful. The result of perf -g is:
- 48.70% opt opt [.] llvm::TypeFinder::run(llvm::Module const&, bool)
- llvm::TypeFinder::run(llvm::Module const&, bool)
+ 43.34% 0
- 1.78% 0x480031
+ llvm::LoadInst::~LoadInst()
- 1.41% 0x460031
+ llvm::LoadInst::~LoadInst()
- 1.01% 0x18
llvm::BranchInst::~BranchInst()
0x8348007d97fa3d8d
- 0.87% 0x233
+ llvm::GetElementPtrInst::~GetElementPtrInst()
- 0.57% 0x39
+ llvm::SExtInst::~SExtInst()
- 0.54% 0x460032
+ llvm::StoreInst::~StoreInst()
GDB is a useful tool! Thanks for Sebastian's advice!
By setting a breakpoint on llvm::TypeFinder::run(llvm::Module const&, bool), I found that most of the calls come from the following two call sites:
0xb7c1c5d2 in polly::ScopDetection::isValidMemoryAccess(llvm::Instruction&, polly::ScopDetection::DetectionContext&) const ()
0xb7c1d754 in polly::ScopDetection::isValidInstruction(llvm::Instruction&, polly::ScopDetection::DetectionContext&) const ()
The detailed backtrace of "isValidMemoryAccess" is:
#0 0x0907b780 in llvm::TypeFinder::run(llvm::Module const&, bool) ()
#1 0x08f76ebe in llvm::TypePrinting::incorporateTypes(llvm::Module const&) ()
#2 0x08f76fc9 in llvm::AssemblyWriter::init() ()
#3 0x08f77176 in llvm::AssemblyWriter::AssemblyWriter(llvm::formatted_raw_ostream&, llvm::SlotTracker&, llvm::Module const*, llvm::AssemblyAnnotationWriter*) ()
#4 0x08f79d1a in llvm::Value::print(llvm::raw_ostream&, llvm::AssemblyAnnotationWriter*) const ()
#5 0xb7c1d044 in polly::ScopDetection::isValidInstruction(llvm::Instruction&, polly::ScopDetection::DetectionContext&) const ()
from /home/star/llvm/llvm_build/tools/polly/Release+Asserts/lib/LLVMPolly.so
#6 0xb7c1ea75 in polly::ScopDetection::allBlocksValid(polly::ScopDetection::DetectionContext&) const ()
from /home/star/llvm/llvm_build/tools/polly/Release+Asserts/lib/LLVMPolly.so
#7 0xb7c1f4aa in polly::ScopDetection::isValidRegion(polly::ScopDetection::DetectionContext&) const ()
from /home/star/llvm/llvm_build/tools/polly/Release+Asserts/lib/LLVMPolly.so
#8 0xb7c1fd16 in polly::ScopDetection::findScops(llvm::Region&) ()
from /home/star/llvm/llvm_build/tools/polly/Release+Asserts/lib/LLVMPolly.so
#9 0xb7c1fd81 in polly::ScopDetection::findScops(llvm::Region&) ()
from /home/star/llvm/llvm_build/tools/polly/Release+Asserts/lib/LLVMPolly.so
#10 0xb7c206f7 in polly::ScopDetection::runOnFunction(llvm::Function&) ()
from /home/star/llvm/llvm_build/tools/polly/Release+Asserts/lib/LLVMPolly.so
#11 0x09065fdd in llvm::FPPassManager::runOnFunction(llvm::Function&) ()
#12 0x09067e2b in llvm::FunctionPassManagerImpl::run(llvm::Function&) ()
#13 0x09067f6d in llvm::FunctionPassManager::run(llvm::Function&) ()
#14 0x081e6040 in main ()
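
The chain from Polly into TypeFinder can also be pulled out of such a backtrace mechanically, which may help when comparing many gdb samples. The sketch below assumes the gdb 'bt' line format shown above; the helper names `parse_frames` and `first_caller_in` are hypothetical, and only frames #0 to #6 are copied in (the "from ...LLVMPolly.so" lines are left out).

```python
import re

# Frames #0 to #6 copied from the backtrace above.
BACKTRACE = """\
#0 0x0907b780 in llvm::TypeFinder::run(llvm::Module const&, bool) ()
#1 0x08f76ebe in llvm::TypePrinting::incorporateTypes(llvm::Module const&) ()
#2 0x08f76fc9 in llvm::AssemblyWriter::init() ()
#3 0x08f77176 in llvm::AssemblyWriter::AssemblyWriter(llvm::formatted_raw_ostream&, llvm::SlotTracker&, llvm::Module const*, llvm::AssemblyAnnotationWriter*) ()
#4 0x08f79d1a in llvm::Value::print(llvm::raw_ostream&, llvm::AssemblyAnnotationWriter*) const ()
#5 0xb7c1d044 in polly::ScopDetection::isValidInstruction(llvm::Instruction&, polly::ScopDetection::DetectionContext&) const ()
#6 0xb7c1ea75 in polly::ScopDetection::allBlocksValid(polly::ScopDetection::DetectionContext&) const ()
"""

# Assumed layout: "#<n> 0x<addr> in <qualified name>(<args>) ()".
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+in\s+([\w:~]+)\s*\(")


def parse_frames(text):
    """Return (frame number, qualified function name) pairs, innermost first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def first_caller_in(frames, namespace="polly::"):
    """Return the innermost frame whose function lives in `namespace`."""
    for index, name in frames:
        if name.startswith(namespace):
            return index, name
    return None


if __name__ == "__main__":
    frames = parse_frames(BACKTRACE)
    index, caller = first_caller_in(frames)
    # Print the call chain from the Polly caller down to frame #0.
    chain = [name for i, name in frames if i <= index]
    print(" -> ".join(reversed(chain)))
```

For the backtrace above this reports frame #5, polly::ScopDetection::isValidInstruction, which reaches TypeFinder::run through llvm::Value::print and the AssemblyWriter constructor. Since TypeFinder::run takes the whole Module as its argument, each such print appears to scan the module's types again, which would be consistent with the overhead growing from oggenc*16 to oggenc*32.
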
>Also, can you upload the .ll file somewhere, such that I can access it?
(Please do not attach it to the email)
I have attached the source code, oggenc.c and oggen.ll, to bug 16624:
http://llvm.org/bugs/show_bug.cgi?id=16624
Best wishes,
Star Tan