Is there any relationship between IR instructions and execution time?

Hi Yin,


As is well known, there is a relationship between a program's instructions and its execution time. In other words, we can estimate the execution time based on the number of program instructions.

I'm curious about the relationship between IR instructions and execution time. I know that the number of program instructions and the execution time are highly dependent on the platform and architecture, while IR is target-independent and intermediate. But, intuitively, there may be some relationship between IR instructions and execution time.

Would it be possible to give me some advice about it?

Which instructions the compiler finally emits is highly dependent on the specified target. As you pointed out, IR is relatively abstract, and can at best give only a "rough" estimate of timing. Maybe that loss of fidelity is acceptable in your case. Be aware that there are also target-specific optimizations that operate after the IR is lowered to a target-friendly representation. Any early approximation of IR performance will be less accurate after target-specific optimization passes are run.

For more accurate results, you will need to wait until the IR is lowered to the target architecture and emitted as assembly or object code. But it seems that might be too late for what you are looking for. In any case, if you do want to analyze the assembly code, then look no further than LLVM's Machine Code Analyzer (llvm-mca). This tool takes assembly code as input and reports throughput and latency information. For more details see:
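
To make the "rough estimate" point concrete, here is a minimal Python sketch that counts opcodes in textual LLVM IR and weights them with made-up costs. The cost table, the names count_opcodes and rough_cost, and the sample IR are all assumptions for illustration, not an LLVM API. A static count like this ignores loop trip counts, branching, and everything the backend does after lowering, so treat the number as a relative hint at best.

```python
import re
from collections import Counter

# Hypothetical relative costs per IR opcode. These numbers are illustrative
# assumptions, not measurements for any real target.
ASSUMED_COST = {"load": 4, "store": 4, "mul": 3, "sdiv": 20, "call": 10}
DEFAULT_COST = 1

# Captures the opcode of an instruction line, with or without a result value:
#   %x = add i32 %a, %b   -> "add"
#   store i32 %x, ptr %p  -> "store"
#   tail call void @f()   -> "call"
OPCODE_RE = re.compile(
    r"^\s*(?:%[\w.]+\s*=\s*)?(?:(?:tail|musttail|notail)\s+)?([a-z_]+)\b"
)

def count_opcodes(ir_text):
    """Count IR opcodes inside function bodies of a textual .ll module."""
    counts = Counter()
    in_function = False
    for raw in ir_text.splitlines():
        line = raw.split(";", 1)[0].rstrip()  # drop trailing comments
        if line.startswith("define "):
            in_function = True
            continue
        if line.startswith("}"):
            in_function = False
            continue
        stripped = line.strip()
        if not in_function or not stripped or stripped.endswith(":"):
            continue  # skip module-level lines, blanks, and block labels
        match = OPCODE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def rough_cost(ir_text):
    """Sum the assumed per-opcode costs over a static opcode count."""
    counts = count_opcodes(ir_text)
    return sum(ASSUMED_COST.get(op, DEFAULT_COST) * n for op, n in counts.items())

SAMPLE_IR = """
define i32 @dot(ptr %a, ptr %b) {
entry:
  %x = load i32, ptr %a
  %y = load i32, ptr %b
  %m = mul nsw i32 %x, %y
  %s = add nsw i32 %m, 1
  ret i32 %s
}
"""

print(count_opcodes(SAMPLE_IR))
print(rough_cost(SAMPLE_IR))
```

The weights would need to be calibrated per target to mean anything, which is exactly why the estimate degrades once target-specific passes have run.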


Hi Matt,

Thank you so much for the reply!

I’ve tried llvm-mca, and it is helpful.
I was wondering whether llvm-mca supports assembly code for ARM.

I cross-compile the test file for ARM like this: clang test.c -O2 -target arm-linux-gnueabihf -static -S -o test.s

If I want to check the performance using llvm-mca, is there a “-mcpu” option for ARM?



Hi Yin,

llvm-mca does support the -mcpu and -mtriple options. We have one ARM test in llvm/test/tools/llvm-mca/ARM for a Cortex-A9, which is an out-of-order chip.

Hope that helps!
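
As a hedged illustration of the two options above, the following Python sketch builds an llvm-mca command line for the test.s file from the earlier message. The helper names mca_command and run_mca are hypothetical, and the choice of cortex-a9 assumes the assembly was generated for that CPU; only the -mtriple and -mcpu flags themselves come from llvm-mca.

```python
import shutil
import subprocess

def mca_command(asm_path, triple, cpu, extra_args=()):
    """Build the llvm-mca argument list for one assembly file."""
    return ["llvm-mca", f"-mtriple={triple}", f"-mcpu={cpu}", *extra_args, asm_path]

def run_mca(asm_path, triple, cpu):
    """Run llvm-mca if it is on PATH and return its report, else None."""
    if shutil.which("llvm-mca") is None:
        return None  # tool not installed; nothing to run
    result = subprocess.run(
        mca_command(asm_path, triple, cpu),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

# Assumed inputs: the file and triple from the clang command in this thread,
# plus the Cortex-A9 CPU mentioned above.
cmd = mca_command("test.s", "arm-linux-gnueabihf", "cortex-a9")
print(" ".join(cmd))
```

Passing the same CPU to clang when generating test.s should keep the assembly and the scheduling model consistent.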


ARM processors are only partially supported by llvm-mca.
At the moment, the tool is unable to resolve variant scheduling classes, and ARM scheduling models often use variant scheduling classes to model the latency profile of instructions.

Strictly speaking, what Matt wrote is true: llvm-mca knows how to analyze code for out-of-order processors that have a scheduling model in LLVM.
However, the user experience may be poor for ARM processors at the moment. It will get better in the future (there is a plan to add support for variant scheduling classes; I will send an RFC on the mailing list soon).