Clang-cl optimization option

MagentaTreehouse · June 12, 2024, 7:34pm

On Windows, I encountered a benchmark where, compiled with clang-cl, -Xclang -O3 produces code about 2.5x as fast as just /O2. I found that /O2 only enables -O2 (by viewing the invocation using -v). Does anyone know why clang-cl only enables -O2 instead of -O3 with /O2? Is it by design?

clang version 18.1.7
Target: x86_64-pc-windows-msvc

See benchmark code:

(Note: Performance results are specific to clang-cl with the target triple specified above and therefore cannot be reproduced on quick-bench. The 2.5x speedup with -Xclang -O3 applies to bmPushBack.)

AaronBallman · June 13, 2024, 11:31am

CC @hansw2000

I think this may have been an oversight (but maybe I’m wrong). MSVC has no /O3, but documents /O2 as optimizing for maximum speed, and /O1 as optimizing for minimum code size.

Based on that, I would expect /O1 to map to -Oz and /O2 (and /Ot) to map to -O3.

hansw2000 · June 13, 2024, 12:16pm

The /O flags are complicated. We fiddled with them a lot originally, but they’ve been pretty stable since the commit “[clang-cl] Handle -O correctly” (llvm/llvm-project@015ce0f).

As mentioned in the docs, /O2 corresponds to /Og (no effect), /Oi (use intrinsics), /Ot (optimize for speed; we map that to -O2), /Oy (omit frame pointer on x86), /Ob2 (-finline-functions), /GF (string pooling), and /Gy (like -ffunction-sections). The main logic is in TranslateOptArg.

So the question boils down to whether we should map /Ot to -O3 instead of -O2. We probably should. I’ll draft a patch.
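
To make the expansion above concrete, here is a rough sketch of the /O2 breakdown as a lookup. It is a simplified model and not the driver's actual TranslateOptArg code: the function name `expandSlashO2` and its `otLevel` parameter are made up for illustration, and the descriptions are copied from the list above.

```cpp
// Simplified model of the /O2 expansion described in this thread.
// NOT the clang driver's TranslateOptArg; expandSlashO2 and otLevel
// are hypothetical names used only for illustration.
#include <cassert>
#include <string>
#include <vector>

// otLevel is the -O level that /Ot is translated to:
//   2 models the mapping described in the thread,
//   3 models the change being discussed.
std::vector<std::string> expandSlashO2(int otLevel) {
    assert(otLevel == 2 || otLevel == 3);
    return {
        "/Og (no effect)",
        "/Oi (use intrinsics)",
        "/Ot -> -O" + std::to_string(otLevel),
        "/Oy (omit frame pointer on x86)",
        "/Ob2 (-finline-functions)",
        "/GF (string pooling)",
        "/Gy (like -ffunction-sections)",
    };
}
```

Calling `expandSlashO2(2)` and `expandSlashO2(3)` gives lists that differ only in the /Ot entry, which is the single point the question comes down to.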

hansw2000 · June 13, 2024, 1:35pm
