I am looking for a tool (on Linux or Windows) that allows me to collect performance measurements such as cycle counts, cache accesses, etc. for an x86 architecture. I want to estimate the performance overhead introduced by the modifications I make using LLVM.
Any suggestion is welcome.
Thanks in advance,
You mean 'cachegrind'?
Valgrind: Tool Suite
I don't know any public tool better than this (but someone please tell
me if I am misinformed).
OProfile for Linux is a pretty good alternative.
It uses hardware performance counters to collect profiling information
and therefore has very low overhead, whereas Valgrind performs dynamic
binary instrumentation and can be significantly slower (20-50x).
In addition, Cachegrind 'simulates' cache behavior through its own
cache model, whereas OProfile (like other counter-based profilers)
reports real cache events.
Depending on what your needs are (ease of use, runtime overhead, etc)
you could pick either.
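
To make the "simulated cache model" point concrete, here is a minimal
sketch of what simulating a cache over an address trace looks like. It is
not Cachegrind's actual model: the class name, the LRU replacement policy,
and the default geometry (32 KB, 64-byte lines, 8-way) are all assumptions
chosen for illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy LRU set-associative cache (hypothetical, for illustration only)."""

    def __init__(self, size_bytes=32 * 1024, line_bytes=64, ways=8):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # One OrderedDict per set; keys are tags, ordered oldest to newest.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Record one memory access; return True on a hit, False on a miss."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        cache_set = self.sets[index]
        if tag in cache_set:
            cache_set.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)  # evict the least recently used tag
        cache_set[tag] = None
        return False


def simulate(addresses, **cache_params):
    """Replay an address trace and return (hits, misses)."""
    cache = SetAssociativeCache(**cache_params)
    for address in addresses:
        cache.access(address)
    return cache.hits, cache.misses


if __name__ == "__main__":
    # Sequential 4-byte reads over a 4 KB buffer.
    print(simulate(range(0, 4096, 4)))
```

The counts this produces depend entirely on the geometry and replacement
policy you feed the model, which is why simulated numbers can differ from
what the hardware counters report on a real machine.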
Helps if I send it to the list....
I have never used CodeAnalyst first-hand, but the slowdown figures
that you quote lead me to believe that it must use hardware
performance counters. Instrumentation-based profilers rarely, if ever,
display such low overhead. Also, instrumentation-based profilers
cannot profile kernel routines unless there is explicit support from
within the kernel (such as DTrace in Sun Solaris 10).
Taking a quick look at AMD's website seems to confirm this theory:
If this topic is getting out-of-scope for the LLVM Dev list, we can
take it offline.