Difference between clang -O1 -Xclang -disable-O0-optnone and clang -O0 -Xclang -disable-O0-optnone in LLVM 9

Hello,
I'm trying to test individual O3 optimizations, i.e. replicate O3 behavior on IR. I took unoptimized IR (O0) with optnone disabled (clang -O0 -Xclang -disable-O0-optnone). I read somewhere about clang -O1 -Xclang -disable-O0-optnone, so I also tested with that as the initial IR.

I have observed that, when applying individual optimizations, performance (i.e. run time) is better when the base/initial IR is generated via clang -O1 -Xclang -disable-O0-optnone. When the initial IR comes from clang -O0 -Xclang -disable-O0-optnone, performance is worse.

What is the possible reason for this?

What is the right way?
Please guide.
Thank You

Without actually trying it myself, I would say that the -O1 command line runs optimization passes where the -O0 command line does not. Thus, your “baseline” IR is already somewhat optimized in the -O1 case. If you want to see IR with no optimizations run at all, you want to add -Xclang -disable-llvm-passes to your command line for producing unoptimized IR. I think this would produce the same results for -O0 and -O1.
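A minimal sketch of that suggestion, assuming clang is on PATH and using a hypothetical input file x.cpp (the output file names are also just placeholders). It emits IR at both -O levels with LLVM passes disabled and diffs the results, so the "same results" guess can be checked rather than assumed:

```shell
# Hypothetical input file; substitute your own source.
SRC=x.cpp

# -disable-llvm-passes is a cc1 option, so it is forwarded with -Xclang.
NO_PASSES="-Xclang -disable-llvm-passes"

# Emit textual IR at a given -O level without running any LLVM passes.
emit_ir() {
    # $1: optimization flag (e.g. -O0), $2: output .ll file
    clang "$1" $NO_PASSES -S -emit-llvm "$SRC" -o "$2"
}

# Only run when clang and the input file are actually present.
if command -v clang >/dev/null 2>&1 && [ -f "$SRC" ]; then
    emit_ir -O0 x.O0.ll
    emit_ir -O1 x.O1.ll
    # Show how the two baselines differ, if at all.
    diff x.O0.ll x.O1.ll || true
fi
```

Any lines reported by diff are differences that come from Clang's IR generation itself, not from optimization passes.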

–paulr

Except then you’ll get optnone on functions at -O0. (This is so that the -O0 in one compilation is respected in LTO situations where it may be merged with another optimizing compilation.)

I don’t think we have any guarantee that “clang -Xclang -disable-llvm-passes x.cpp -emit-llvm -o x.bc && clang x.bc” matches a single-step compile (and certainly not if the second step is opt rather than clang). And if you don’t pass the same flags to both the first and second steps, you may well get different output. The -O flags are the flagship example here: it’s not just the optnone attribute, but lifetime markers and other things in Clang’s IRGen that differ depending on optimization level.
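As a rough sketch of keeping the flags consistent, the following passes the same -O level to both steps and counts IR lines that mention optnone in the baseline. The file names, the OPT_LEVEL variable and the count_optnone helper are hypothetical, and, as noted above, nothing guarantees the two-step result equals a single-step compile:

```shell
SRC=x.cpp        # hypothetical input file
OPT_LEVEL=-O1    # use the same -O flag in both steps

# Count IR lines that mention the optnone attribute in a textual .ll file.
count_optnone() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'optnone' "$1" || true
}

# Only run when clang and the input file are actually present.
if command -v clang >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Step 1: IR generation only, no LLVM passes.
    clang "$OPT_LEVEL" -Xclang -disable-llvm-passes -S -emit-llvm "$SRC" -o x.ll
    echo "lines mentioning optnone: $(count_optnone x.ll)"
    # Step 2: optimize and compile the IR with the same -O flag.
    clang "$OPT_LEVEL" -c x.ll -o x.o
fi
```

Setting OPT_LEVEL=-O0 instead should make the optnone count non-zero for a file that defines functions, which is the effect described above.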