Question about Middle-end Optimization

We are experimenting with code optimization and saw something unexpected when comparing clang in a single invocation against clang → opt → clang. Specifically, this is what we did:

Running clang alone:

clang -Oz -c foo.c -o foo.o

Running clang to generate bitcode, opt to optimize the bitcode, and clang to produce the binary:

clang -emit-llvm -Oz -c foo.c -o foo.bc
opt foo.bc -o foo.opt.bc
clang -c foo.opt.bc -o foo.o

Surprisingly, both approaches produced binaries of exactly the same size for each of the 200 different programs we tried.

We just want to confirm: did -Oz already do most of the work in the first phase (source to bitcode), so that it does not matter whether -Oz is passed in the last two phases?


clang -Oz -c will generate LLVM IR for the file, attaching a minsize attribute to each function. It’ll then run roughly the same optimization passes opt would (Clang and opt configure the pipeline slightly differently, so there will be small differences).

Also, just running opt with no extra arguments will be a no-op; you need to specify -Oz or something similar so that it actually knows which passes you want to run.
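
One way to check this yourself is sketched below, assuming llvm-dis is installed next to opt and that foo.bc already exists (the file names and the same_ir helper are made up for illustration). It disassembles the bitcode before and after a bare opt run and compares the text, skipping the "; ModuleID" comment line because that line records the file name. I'd expect the two to match, but treat it as a rough check rather than a guarantee.

```shell
#!/bin/sh
# Sketch: check whether a bare `opt` run leaves the IR unchanged.
# same_ir is a hypothetical helper, not part of LLVM.

# Succeed if two textual IR files are identical once the
# "; ModuleID" comment line (which embeds the file name) is dropped.
same_ir() {
    grep -v '^; ModuleID' "$1" > "$1.cmp"
    grep -v '^; ModuleID' "$2" > "$2.cmp"
    cmp -s "$1.cmp" "$2.cmp"
}

# Only run the tools when they and the input are actually present.
if command -v opt >/dev/null 2>&1 && command -v llvm-dis >/dev/null 2>&1 && [ -f foo.bc ]; then
    opt foo.bc -o foo.opt.bc          # no -O level and no pass list given
    llvm-dis foo.bc -o foo.ll         # bitcode -> textual IR
    llvm-dis foo.opt.bc -o foo.opt.ll
    if same_ir foo.ll foo.opt.ll; then
        echo "bare opt left the IR unchanged"
    else
        echo "IR differs after bare opt"
    fi
fi
```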

If you really want to split the compilation into 3 phases that don’t overlap, you can use something like:

clang -emit-llvm -Oz -c foo.c -o foo.bc -Xclang -disable-llvm-passes
opt -Oz foo.bc -o foo.opt.bc
clang -c foo.opt.bc -o foo.o

Though, again, the output is unlikely to be identical between single-step and split.
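
If you want to measure the difference rather than guess, something along these lines could work. It is only a sketch: foo.c is a placeholder input, the output names foo.single.o and foo.split.o are made up, and size_of and compare_sizes are hypothetical helpers. It compares raw object file size, which is a coarse measure, so a section-level comparison may be more informative.

```shell
#!/bin/sh
# Sketch: compare object sizes from the single-step and split pipelines.
# foo.c and all output file names are placeholders.

# Print the size of a file in bytes.
size_of() {
    wc -c < "$1" | tr -d ' '
}

# Report whether two files have the same size.
compare_sizes() {
    a=$(size_of "$1")
    b=$(size_of "$2")
    if [ "$a" -eq "$b" ]; then
        echo "same size: $a bytes"
    else
        echo "different: $1=$a bytes, $2=$b bytes"
    fi
}

# Only run the pipelines when the tools and the input are actually present.
if command -v clang >/dev/null 2>&1 && command -v opt >/dev/null 2>&1 && [ -f foo.c ]; then
    # Single step: clang runs the -Oz pipeline itself.
    clang -Oz -c foo.c -o foo.single.o

    # Split: clang emits IR without running the LLVM passes,
    # opt runs the -Oz pipeline, and clang lowers the result to an object file.
    clang -emit-llvm -Oz -c foo.c -o foo.bc -Xclang -disable-llvm-passes
    opt -Oz foo.bc -o foo.opt.bc
    clang -c foo.opt.bc -o foo.split.o

    compare_sizes foo.single.o foo.split.o
fi
```

Wrapping this in a loop over your 200 programs would show how often the two pipelines actually diverge.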


Thank you Tim! This is extremely helpful!