Re: [LLVMdev] A simulation tool
The perfctr Linux kernel patch virtualizes the CPU's performance
counters: the kernel saves each thread's performance counter registers
when the thread is taken off the CPU and restores them when it is put
back on. Combined with the perfex command line tool, this lets you
count cache misses, cycles executed, etc. for a program at native
execution speed.
Perfctr's advantage is speed. However, I don't believe it has been
incorporated into a tool that gives the kind of detailed report that
cachegrind seems to provide; perfex only reports the total number of
events between program start and program termination. One could write
an LLVM pass that instruments a program with calls to a profiling
runtime; that runtime could use the perfctr driver to collect the
performance counter information on a
So, perfctr is faster. Cachegrind is probably much easier to
install/use and looks like it will provide more detailed information.
Both are open-source and publicly available.
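
To make the instrumentation idea above a bit more concrete, here is a rough sketch of the bookkeeping such a profiling runtime might do. It is modeled in Python only for brevity; a real runtime would more likely be C linked into the instrumented program. Everything named here is hypothetical: CounterProfiler, enter/leave (the hooks an LLVM pass would insert at function entry and exit), and read_counter, which stands in for whatever call actually reads a virtualized counter through the perfctr driver. The counter values are scripted, not measured.

```python
from collections import defaultdict


class CounterProfiler:
    """Charges counter deltas to the function on top of the call stack."""

    def __init__(self, read_counter):
        # read_counter is a hypothetical stand-in for a perfctr read; it
        # must return the cumulative event count for the current thread.
        self.read_counter = read_counter
        self.stack = []                  # names of currently active functions
        self.last = 0                    # counter value at the previous hook
        self.totals = defaultdict(int)   # function name -> events charged

    def _charge(self):
        # Attribute events since the previous hook to the running function.
        now = self.read_counter()
        if self.stack:
            self.totals[self.stack[-1]] += now - self.last
        self.last = now

    def enter(self, name):
        # Hook the instrumentation pass would insert at function entry.
        self._charge()
        self.stack.append(name)

    def leave(self, name):
        # Hook the instrumentation pass would insert before each return.
        self._charge()
        if not self.stack or self.stack[-1] != name:
            raise RuntimeError("unbalanced enter/leave for " + name)
        self.stack.pop()


def scripted_counter(values):
    # Fake counter that replays fixed readings, one per call.
    it = iter(values)
    return lambda: next(it)


profiler = CounterProfiler(scripted_counter([0, 10, 25, 40]))
profiler.enter("main")      # reading 0
profiler.enter("compute")   # reading 10: 10 events charged to main
profiler.leave("compute")   # reading 25: 15 events charged to compute
profiler.leave("main")      # reading 40: 15 more events charged to main

for name, events in sorted(profiler.totals.items()):
    print(name, events)
```

Note that this charges events exclusively (a callee's events are not also counted against its caller), and the hooks themselves would perturb the counts in a real run, so per-function numbers would be approximate.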
oprofile will annotate your source code with all kinds of information
obtained from the CPU performance counters, for example showing the
percentage of time / cache misses in a particular function.