How to analyse and optimize JIT performance?

I am migrating the JIT in my program from MCJIT to ORCJIT, and the performance of my tests has got worse. I dumped the final IR after my added passes, and there is no difference between MCJIT and ORCJIT. However, the disassembly looks different.

So I am wondering how to analyse the performance of JIT-compiled code. Can we pass a compile option such as "-mllvm -print-after-all" to the JIT? And are there any general ideas for optimizing JIT performance?


Or can we use the perf tool to analyse a C++ program that uses the JIT?
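
To make the perf part of the question concrete: as far as I understand, perf can symbolize JIT-generated code if the process writes a map file named /tmp/perf-<pid>.map, where each line is "START SIZE symbolname" with START and SIZE as hex numbers without a 0x prefix. Below is a minimal sketch of writing such a file, assuming I can obtain the address and size of each JIT-compiled function myself. The names JitSymbol, formatPerfMapLine, perfMapPath and appendPerfMap are hypothetical helpers for illustration, not LLVM APIs, and I have not verified how best to get the addresses and sizes out of ORCJIT.

```cpp
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record describing one JIT-compiled function.
struct JitSymbol {
    std::uint64_t address;  // start address of the generated code
    std::uint64_t size;     // size of the generated code in bytes
    std::string name;       // name that should appear in the perf report
};

// Format one perf map line: "<start-hex> <size-hex> <name>", no 0x prefix.
std::string formatPerfMapLine(const JitSymbol &sym) {
    std::ostringstream os;
    os << std::hex << sym.address << ' ' << sym.size << ' ' << sym.name;
    return os.str();
}

// Build the conventional map file path for a given process id.
std::string perfMapPath(long pid) {
    return "/tmp/perf-" + std::to_string(pid) + ".map";
}

// Append the symbols to the map file; returns false on I/O failure.
bool appendPerfMap(const std::string &path,
                   const std::vector<JitSymbol> &symbols) {
    std::ofstream out(path, std::ios::app);
    if (!out)
        return false;
    for (const JitSymbol &sym : symbols)
        out << formatPerfMapLine(sym) << '\n';
    return static_cast<bool>(out);
}
```

The idea would be to call appendPerfMap(perfMapPath(getpid()), symbols) after each module is compiled, then run the program under perf record and inspect it with perf report. Is this a reasonable approach, or is there a better-supported way to do this with ORCJIT?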