Here is the brief example: Compiler Explorer
Simply, GCC would optimize the following code successfully while clang wouldn’t:
int clang_failed(int x, int y) {
if (x == 0)
return y;
else
return clang_failed(x - 1, y * x);
}
GCC would optimize it into a simple form:
clang_failed(int, int):
movl %esi, %eax
testl %edi, %edi
je .L5
.L2:
imull %edi, %eax
subl $1, %edi
jne .L2
.L5:
ret
while clang would translate it honestly.
Originally I thought there might be some optimization passes that clang is missing. But I found this is more interesting after I made some simple testing. For example. both GCC and clang would optimize following code:
int success1(int x, int y) {
if (x == 0)
return y;
else
return success1(x - 1, y++);
}
But clang would optimize following codes while gcc refused to do:
int gcc_failed(int x, int y) {
if (x == 0)
return y;
else
return gcc_failed(x - 1, y + x);
}
BTW, clang would optimize it as:
gcc_failed(int, int): # @gcc_failed(int, int)
mov eax, esi
test edi, edi
je .LBB2_2
lea ecx, [rdi - 1]
lea edx, [rdi - 2]
imul rdx, rcx
imul ecx, ecx
shr rdx
add eax, edi
add eax, ecx
sub eax, edx
.LBB2_2:
ret
which looks a little bit longer than the first version of GCC does. I thought the two codes are homogenized.
So the problem looks interesting. Does anyone knows what’s the pass to optimize it in clang? And if anyone know why clang failed to optimize the first one but succeed with the last one?
Thanks,
Chuanqi