Hello,
Like many of you, I have followed Uday's tutorial on high-performance code generation. It shows how to obtain very efficient code for loop nests like those found in BLAS/TF.
I was wondering, however, how much of what “gcc -O3” does can be reproduced using mlir-opt (with the LLVM back-end producing object code) before linking with clang/ld. Is there a guide for this somewhere?
Best,
Dumitru