Does LLVM optimize recursive calls?

Hi list,

I have a simple question about LLVM.

I learned that we should use iteration rather than recursion in C programming.
That is because recursion is expensive: it can easily exhaust all the
stack given to a program, and the call/return overhead costs much more
time.

But I've read that nowadays compilers can optimize recursion in C and
other languages so that it doesn't consume so much stack or
call/return overhead.

I wonder what kind of optimization this is and whether LLVM supports
this kind of optimization.

I am not a compiler expert, so please bear with me. ^^;

Thank you in advance

The short answer is: sometimes.

Some recursive code may be transformed into something similar to a
loop, but sometimes the compiler won't be able to figure it out &
it'll remain recursive.

- David

Hi David Blaikie and others who might be interested in this

Thank you very much!

#1. Then I'd like to know: to make Clang/LLVM optimize a recursion
into an iteration, how should the recursion be implemented, and with
which compiler options? (if the language is C/C++)

Clang itself uses recursion; in particular it uses a recursive-descent
and operator-precedence parser.
#2. I wonder if this kind of recursion is optimized into an iteration.

Sorry, my questions are getting deeper. ^^;

Thank you very much in advance

Journeyer J. Joh


#1. Then I'd like to know: to make Clang/LLVM optimize a recursion
into an iteration, how should the recursion be implemented, and with
which compiler options? (if the language is C/C++)

Simple recursive algorithms are likely to be optimized to use iteration
even at -O1. You can verify this with, for example:

struct A {
  unsigned value;
  struct A *next;
};

unsigned count(struct A *a) {
  if (!a) return 0;
  return a->value + count(a->next);
}

Note that this algorithm was not originally iterative; changing it
to use accumulation reassociates the arithmetic and would not
be legal for, say, floating point.
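For illustration, the loop the optimizer effectively produces looks roughly like this hand-written accumulator version (a sketch of the transformation, not the actual generated code):

```c
#include <stddef.h>

struct A {
  unsigned value;
  struct A *next;
};

/* Roughly what -O1 turns the recursive count() into: the pending
 * additions are folded into an accumulator. This reassociation is
 * legal here because unsigned addition is associative; for floating
 * point it would change the result, as noted above. */
unsigned count_iterative(struct A *a) {
  unsigned total = 0;
  while (a) {
    total += a->value;
    a = a->next;
  }
  return total;
}
```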

Clang itself uses recursion; in particular it uses a recursive-descent
and operator-precedence parser.
#2. I wonder if this kind of recursion is optimized into an iteration.

No, and it might not be an optimization even if we did it. The
recursion vs. iteration performance tradeoffs are far more complex
than "we need to use iterations [rather] than recursions".

Transforming recursion to iteration can be critical, e.g. to avoid blowing
out the stack. It can also be good for performance, e.g. when an
algorithm can be easily adapted to use constant space and would
otherwise spend a high proportion of its time in call overhead. But
the general recursion-to-iteration transformation basically involves
emulating the call stack on the heap and often makes performance
worse, not better.
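A minimal sketch of that general transformation, using a hypothetical binary-tree sum (names and structure are illustrative, not from the thread): the recursion disappears, but only because the call stack has been replaced by an explicitly managed heap-allocated one.

```c
#include <stdlib.h>

struct Node {
  int value;
  struct Node *left, *right;
};

/* Recursion eliminated by emulating the call stack on the heap.
 * Correct, but each push/pop now goes through malloc'd memory
 * instead of cheap stack-pointer adjustment, so this is often
 * slower than the recursive version. Error checking omitted
 * for brevity. */
int tree_sum(const struct Node *root) {
  if (!root) return 0;
  size_t cap = 16, top = 0;
  const struct Node **stack = malloc(cap * sizeof *stack);
  int total = 0;
  stack[top++] = root;
  while (top > 0) {
    const struct Node *n = stack[--top];
    total += n->value;
    if (top + 2 > cap) {
      cap *= 2;
      stack = realloc(stack, cap * sizeof *stack);
    }
    if (n->left)  stack[top++] = n->left;
    if (n->right) stack[top++] = n->right;
  }
  free(stack);
  return total;
}
```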

If your teachers are really making this sort of superficial generalization,
they're doing you a disservice.

John.

I completely agree with John here - we had a case a few years ago where someone spent ages refactoring their algorithm to be iterative and was surprised that it became slower (with the MS compiler). Looking at the assembly, it was using push and pop instructions for the temporaries on the recursive version, but had to do more complex addressing to find the right place for them in the iterative version, which made everything slower (and prevented the CPU from doing some register-stack aliasing tricks).

One thing John didn't mention is that LLVM will try to do tail call optimisation. This means that if your function ends with something like:

return a(b);

It will attempt to turn this into a jump, rather than a call. This doesn't happen in all cases, because it is dependent on the calling convention. For static functions in C, the compiler is free to choose whatever calling convention it wants (because it's not part of the public API), but for others it may need to use a non-standard calling convention.

This has the advantage that it doesn't consume extra stack space (especially if it's self-recursive, it only consumes one stack frame no matter how deep the recursion goes) and the return will return to the outer caller. It is slightly more complex, because most C calling conventions require the caller to clean up the call frame (the callee can't clean it up in the general case, because C allows variadic functions), and so a tail call optimisation must ensure that the inner callee has the same layout of call frame as its caller.
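As a sketch of the difference, here are two hypothetical ways to write factorial in C: only the second ends in a bare self-call, so only it is a candidate for the call-to-jump transformation as written. (The helper is marked static, which, per the calling-convention point above, gives the compiler maximum freedom.)

```c
/* NOT a tail call: the multiply happens after the recursive call
 * returns, so a stack frame must survive each level. */
unsigned long fact(unsigned n) {
  if (n <= 1) return 1;
  return n * fact(n - 1);
}

/* Tail call: nothing remains to do after the self-call, so the
 * compiler can replace call/return with a jump and reuse one frame. */
static unsigned long fact_acc(unsigned n, unsigned long acc) {
  if (n <= 1) return acc;
  return fact_acc(n - 1, n * acc);
}

unsigned long fact_tail(unsigned n) {
  return fact_acc(n, 1);
}
```

Modern Clang at -O2 can often rewrite the first form into the second automatically (the accumulator transformation John described), but only the second is literally a tail call as written.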

David

Hello David Chisnall, John McCall, David Blaikie and others who may be
interested

If your teachers are really making this sort of superficial generalization,
they're doing you a disservice.

This hit home.
Actually I didn't test the performance of the two, iteration and
recursion, at all. And in this field of study, testing is very
important, I believe.

And the '''tail call optimization'''. I will remember this also.

Thank you all.

I learned something new today.

Sincerely
Journeyer J. Joh

Also known as 'tail recursion' -- it's an important concept to understand.

It depends on this pattern:

ReturnVal
AFunction( <args> )
{
    if (<args> == terminal)
        return ReturnVal.null;
    ReturnVal LocalResult = <computation>(<args>);
    return LocalResult {composed with} AFunction(<args>.next);
}

The possibility of optimizing out the recursion comes because there's
no state to be stacked (assuming {composed with} can be reassociated
into an accumulator). Whatever temporaries are needed for
<computation> can be reused without stacking up local state. It's
equivalent to

ReturnVal
AFunction( <args> )
{
    ReturnVal accum = init;
    Args current;
    while ( ( current = <args>.front ) != terminal )
    {
        accum = accum {composed with} current;
        <args>.pop_front;
    }
    return accum;
}
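A concrete C instance of this pattern (a hypothetical example, not from the thread): Euclid's gcd is naturally tail-recursive, and its iterative form follows the schema above mechanically.

```c
/* Tail-recursive: the self-call is the last action taken, so no
 * local state needs to be stacked across it. */
unsigned gcd_rec(unsigned a, unsigned b) {
  if (b == 0) return a;
  return gcd_rec(b, a % b);
}

/* The mechanical iterative equivalent: the "accumulator" is just
 * the (a, b) pair being rewritten on each step. */
unsigned gcd_iter(unsigned a, unsigned b) {
  while (b != 0) {
    unsigned r = a % b;
    a = b;
    b = r;
  }
  return a;
}
```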

The reason tail recursion isn't more widely used and taught is that
the mapping onto iterative solutions is so straightforward that
knowing how to solve problems iteratively seems expedient and more
useful.

On the other hand, a deep understanding of induction is one thing that
separates literate programmers from hacks.