# Query on optimization and tail call.

Hi,

While playing around on the LLVM, I tried this code:

int sum(int n)
{
if (n == 0)
return 0;
else
return n + sum(n-1);
}

and this is what "llvm-gcc -O2" gave me:

define i32 @sum(i32 %n) nounwind {
entry:
%tmp215 = icmp eq i32 %n, 0 ; <i1> [#uses=1]
br i1 %tmp215, label %bb10, label %tailrecurse.bb10_crit_edge

tailrecurse.bb10_crit_edge: ; preds = %entry
%tmp = add i32 %n, -1 ; <i32> [#uses=3]
%tmp17 = mul i32 %tmp, %tmp ; <i32> [#uses=1]
%tmp18 = add i32 %tmp17, %n ; <i32> [#uses=1]
%tmp. = zext i32 %tmp to i64 ; <i64> [#uses=2]
%tmp19 = add i64 %tmp., -1 ; <i64> [#uses=1]
%tmp20 = mul i64 %tmp19, %tmp. ; <i64> [#uses=1]
%tmp21 = lshr i64 %tmp20, 1 ; <i64> [#uses=1]
%tmp.22 = trunc i64 %tmp21 to i32 ; <i32> [#uses=1]
%tmp24 = sub i32 %tmp18, %tmp.22 ; <i32> [#uses=1]
ret i32 %tmp24

bb10: ; preds = %entry
ret i32 0
}

which is great! It computes the sum as n*(n+1)/2. However, when I try just this:

int sum(int n)
{
return n + sum(n-1);
}

it generates this:

define i32 @sum(i32 %n) nounwind {
entry:
%tmp2 = add i32 %n, -1 ; <i32> [#uses=1]
%tmp3 = tail call i32 @sum( i32 %tmp2 ) nounwind ; <i32> [#uses=1]
%tmp5 = add i32 %tmp3, %n ; <i32> [#uses=1]
ret i32 %tmp5
}

Why isn't llvm able to eliminate the tail call in this (simpler) case?

Regards,

I haven't checked, but this is probably a case that wasn't worth
handling, since it's an infinite loop.

-Eli

Eli Friedman wrote:

int sum(int n)
{
return n + sum(n-1);
}

Sorry, but this is not a tail recursive call. There is some computation involved (the addition) after the inner call to sum! The continuation of the inner call to sum is not the continuation of the called sum.

In other words, the equivalent in languages or implementations claiming to know about tail recursion (Scheme, Ocaml) is not compiled like tail cails.

So I believe that llvm should not handle something which is not a tail call like a tail call!

Hi,

int sum(int n)
{
return n + sum(n-1);
}

this recursion never stops.

Ciao,

Duncan.

Thanks for the clarifications (and unnecessary interruptions on your
time!). But one more query please!

This one generates optimized (computes the value) code:

unsigned sum(unsigned n)
{
if (n == 0)
return 0;
else
return n + sum(n-1);
}

where as this one generates a loop that adds up numbers:

int sum(int n)
{
if (n <= 0)
return 0;
else
return n + sum(n-1);
}

I'm guessing this is probably because the values will be different in
the overflow case -- is it?. Although changing the if condition to "if
(n <= 0 || n > 10)" also generates the loop.