Why does clang not always produce a constant value for the same static constexpr function?

Given the code below, clang produces a constant value for test1, test2
and test4. Why doesn't it for test3?
This is more of a curious query than a request for help, but if someone
does have the answer, I'd appreciate as much detail as possible.

    static constexpr int count_x(const char* str)
    {
        int count{};
        for (; *str != 0; ++str) {
            count += *str == 'x';
        }
        return count;
    }

    #define STRx1 "123456789x"
    #define STRx4 STRx1 STRx1 STRx1 STRx1
    #define STRx8 STRx4 STRx4
    #define STRx16 STRx8 STRx8

    int test1() { return count_x(STRx4); }
    int test2() { return count_x(STRx8); }
    int test3() { return count_x(STRx16); }
    int test4() { constexpr auto k = count_x(STRx16); return k; }

    test1():                                # @test1()
            mov     eax, 4
            ret
    test2():                                # @test2()
            mov     eax, 8
            ret
    test3():                                # @test3()
            xor     eax, eax
            mov     dl, 49
            mov     ecx, offset .L.str.2+1
    .LBB2_1:                                # =>This Inner Loop Header: Depth=1
            xor     esi, esi
            cmp     dl, 120
            sete    sil
            add     eax, esi
            movzx   edx, byte ptr [rcx]
            add     rcx, 1
            test    dl, dl
            jne     .LBB2_1
            ret
    test4():                                # @test4()
            mov     eax, 16
            ret
    .L.str.2:
            .asciz  "123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x123456789x"

gcc does:

    test1():
            mov     eax, 4
            ret
    test2():
            mov     eax, 8
            ret
    test3():
            mov     eax, 16
            ret
    test4():
            mov     eax, 16
            ret

Compilation command lines used:

    clang++ -Ofast -std=c++2a -S -o - -c src/test.cpp | grep -Ev $'^\t+\\.'
    gcc9 -Ofast -std=c++2a -S -o - -c src/test.cpp | grep -Ev $'^\t+\\.'

Compiler Explorer: https://godbolt.org/z/V-3MEp

constexpr is a red herring here. Except in test4, where you’ve used the constexpr keyword to create a constexpr context, cases 1-3 are just normal function calls that the compiler optimizes as it sees fit. It evidently saw fit to unroll cases 1 and 2 and fold them to a constant, but not case 3 (perhaps because the string was too long, or some other middle-end optimization decided to bail out).

I couldn’t say for sure exactly which LLVM optimization bailed out early, or whether LLVM is using the same general approach as GCC here.

Adding/removing the constexpr keyword from count_x shouldn’t affect anything in cases 1-3 (in either Clang or GCC, really). But it looks like it makes a big difference to GCC; perhaps GCC tries to evaluate constexpr functions in the frontend even when the language doesn’t require it. That sounds like a recipe for problematic compile times to me, but I don’t know.

  • Dave
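To make the distinction above concrete, here is a minimal sketch of ways to put the test3 call into a context where the language requires constant evaluation, so the result no longer depends on what the optimizer decides. The names `test3_constexpr_var`, `test3_template_arg`, `constant` and `check_all_agree` are made up for this illustration; only `count_x`, the `STRx*` macros and `test3` come from the original post.

```cpp
#include <cassert>

static constexpr int count_x(const char* str)
{
    int count{};
    for (; *str != 0; ++str) {
        count += *str == 'x';
    }
    return count;
}

#define STRx1 "123456789x"
#define STRx4 STRx1 STRx1 STRx1 STRx1
#define STRx8 STRx4 STRx4
#define STRx16 STRx8 STRx8

// Plain call: an ordinary function call at run time. Whether it is folded
// to a constant is entirely up to the optimizer.
int test3() { return count_x(STRx16); }

// A constexpr variable must be initialized by a constant expression, so the
// front end has to evaluate count_x here (this is what test4 does).
int test3_constexpr_var()
{
    constexpr int k = count_x(STRx16);
    return k;
}

// static_assert also requires a constant expression.
static_assert(count_x(STRx16) == 16, "STRx16 contains sixteen 'x' characters");

// A non-type template argument is another constant-evaluated context.
// `constant` is a hypothetical helper, not part of the original code.
template <int N>
struct constant {
    static constexpr int value = N;
};

int test3_template_arg() { return constant<count_x(STRx16)>::value; }

// Runtime sanity check: all variants return the same value; they differ
// only in when that value is computed.
inline void check_all_agree()
{
    assert(test3() == test3_constexpr_var());
    assert(test3() == test3_template_arg());
}
```

All three functions return the same value; the only difference is whether the standard guarantees compile-time evaluation or leaves it to the optimizer.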

Thank you Dave. I have an understanding of constexpr evaluation, and realise the compiler is free to do what it likes in all but test4… I suppose I’d really like to know if there is an actual limit/threshold in place. If test3 is changed to use 100 characters it does as I expect; any more than that, e.g. 101, and it bails. I also compiled with -Rpass-analysis='.*' -mllvm -print-after-all, and it seems the bail is in Induction Variable Simplify / Scalar Evolution, but I assume the actual problem could be before that.


It’s possible the limit is somewhere else, but without further evidence, Induction Variable Simplify/Scalar Evolution sounds like a perfectly plausible place for such a threshold to be implemented, as far as I know or could guess.
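As a rough illustration of what an iteration threshold of this kind would look like, here is a toy model, not LLVM's actual code: a brute-force evaluator that walks the loop over a constant string and gives up once a fixed budget of iterations is spent. The function name `try_fold_count_x`, the `max_iterations` parameter and the `STRx5`/`STRx10` macros are hypothetical; the budget of 100 is chosen only to mirror the 100 vs. 101 character observation reported above. As far as I know, LLVM's ScalarEvolution has a hidden `-scalar-evolution-max-iterations` option that defaults to 100 and caps this sort of brute-force loop evaluation, but that this is the limit being hit here is an assumption.

```cpp
#include <cassert>
#include <optional>

// Toy model of "evaluate the loop by brute force, within a budget".
// Returns the folded result, or std::nullopt if the loop would need more
// than max_iterations body executions (i.e. the "optimizer" bails out).
constexpr std::optional<int> try_fold_count_x(const char* str,
                                              unsigned max_iterations)
{
    int count = 0;
    unsigned executed = 0;
    while (*str != 0) {
        if (executed == max_iterations) {
            return std::nullopt;  // budget exhausted before the loop exits
        }
        count += *str == 'x';
        ++str;
        ++executed;
    }
    return count;  // loop exit reached within the budget
}

#define STRx1 "123456789x"
#define STRx5 STRx1 STRx1 STRx1 STRx1 STRx1
#define STRx10 STRx5 STRx5  // exactly 100 characters

// 100 characters fit a budget of 100; 101 characters do not.
static_assert(try_fold_count_x(STRx10, 100).has_value(), "100 chars fold");
static_assert(!try_fold_count_x(STRx10 "1", 100).has_value(), "101 chars bail");

// Runtime check of the same behaviour, plus a larger budget.
inline void check_budget_model()
{
    assert(*try_fold_count_x(STRx10, 100) == 10);
    assert(try_fold_count_x(STRx10 "1", 200).has_value());
}
```

If the assumption about that option holds, raising it (passed through `-mllvm`) would be one way to test whether it is the limit in play.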