This isn’t upstream, but it’s pretty straightforward to add to RunnerUtils if you want to experiment. Just add the two functions, print_flops and rtclock, to lib/ExecutionEngine/RunnerUtils.cpp. Here is an example. You can then call these from your MLIR file.
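For reference, a minimal sketch of what those two functions might look like. This is not the upstream code. The bodies, the use of std::chrono, the stderr output, and the assumption that print_flops receives a rate in FLOP/s are all illustrative choices, so adjust them to whatever your benchmark computes.

```cpp
#include <chrono>
#include <cstdio>

// Returns the current time in seconds as a double, read from a monotonic
// clock. Only differences between two calls are meaningful.
// Assumption: std::chrono::steady_clock is an acceptable timer here.
extern "C" double rtclock() {
  using Clock = std::chrono::steady_clock;
  return std::chrono::duration<double>(Clock::now().time_since_epoch()).count();
}

// Prints a throughput figure in GFLOPS.
// Assumption: the caller passes floating-point operations per second,
// e.g. (number of FLOPs) / (t_end - t_start).
extern "C" void print_flops(double flops) {
  std::fprintf(stderr, "%lf GFLOPS\n", flops / 1.0e9);
}
```

The extern "C" linkage keeps the symbol names unmangled, so the MLIR side only needs to declare external functions with matching names and f64 signatures, call rtclock before and after the region being timed, and pass the computed rate to print_flops.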
We had this question before and the ability to “benchmark” code seems generally useful. Do we expect to have benchmarks to avoid performance regressions in MLIR at some point?
In any case, this seems useful enough to upstream.
I hope so. We should likely look into integrating with the rest of the LLVM test suite and LNT, unless someone has a better proposal for tracking performance upstream.