How to understand "tail" call?

Hi,

    %1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds
([30 x i8], [30 x i8]* @.str, i64 0, i64 0), i32 10)
    %2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds
([30 x i8], [30 x i8]* @.str.1, i64 0, i64 0), i32 10)

When I compile the following C code with -O2, I get the IR above. I
don't understand the explanation of tail call optimization in the docs
(linked below). Could anybody help me understand what a tail call
means? Thanks.

https://llvm.org/docs/CodeGenerator.html#tail-call-optimization

#include <stdio.h>

int main(void) {
    const int local = 10;
    int *ptr = (int*) &local;
    printf("Initial value of local : %d \n", local);
    *ptr = 100;
    printf("Modified value of local: %d \n", local);
    return 0;
}

In its simplest form, when the last thing a function does is call
something else, the compiler can deallocate the caller's stack frame
early and just jump to the callee rather than actually calling it.

For example:

    void foo() {
        [...]
        bar();
    }

Without the tail-call optimization this might produce:

     foo:
         sub $16, %rsp ; Allocate stack space for local variables
         [...]
         callq bar
         add $16, %rsp ; Free up local stack space
         retq

With tail call optimization it would become:

     foo:
         sub $16, %rsp ; Allocate stack space for local variables
         [...]
         add $16, %rsp ; Free up local stack space
         jmpq bar

Then when bar returns, it actually grabs the return address intended
for foo and returns to the correct place. The main benefit is that
while bar (and its nested functions) are running, foo isn't consuming
any stack space.

There are certain embellishments that basically allow "return bar();"
as well if the arguments are compatible, but that's the basic idea.
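
As a quick sketch of that case (hypothetical names, nothing from your
example):

    int bar(int x);

    int foo(int x) {
        /* The result of bar is returned unchanged, so the compiler
           can reuse foo's return address and emit "jmp bar" instead
           of "call bar" followed by "ret". */
        return bar(x + 1);
    }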

Many functional languages handle loop-like constructs via recursion
instead, and they demand that their compilers guarantee this
optimization; otherwise they'd blow the stack for fairly small
iteration counts on loops.
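
A rough C analogue of that pattern (just a sketch) is a tail-recursive
loop, which only runs in constant stack space if the compiler turns the
recursive call into a jump:

    /* Sums 1..n. The recursive call is in tail position, so with
       guaranteed tail calls this needs O(1) stack; without them a
       large n overflows the stack. */
    long sum(long n, long acc) {
        if (n == 0)
            return acc;
        return sum(n - 1, acc + n);
    }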

In the example you gave, no tail call optimization can occur despite
the calls being marked as candidates because main has to materialize
the "return 0" code after both calls to printf have happened. If you
returned the result of the last printf instead it would probably
trigger.
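
Concretely, something like this (an untested sketch, and note it
changes main's exit status) would put the final call in tail position:

    #include <stdio.h>

    int main(void) {
        const int local = 10;
        int *ptr = (int *) &local;
        printf("Initial value of local : %d \n", local);
        *ptr = 100;
        /* The result of the last printf is returned directly, so the
           call is in tail position and CodeGen may turn it into a
           jump to printf rather than call + ret. */
        return printf("Modified value of local: %d \n", local);
    }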

Cheers.

Tim.

> In the example you gave, no tail call optimization can occur despite
> the calls being marked as candidates because main has to materialize
> the "return 0" code after both calls to printf have happened.

Since the tail call optimization cannot be triggered anyway, why does
the generated IR still have the keyword "tail"? Isn't that unnecessary?

If you look at https://llvm.org/docs/LangRef.html#id1509, the "tail"
keyword is a suggestion, together with a promise about what the callee
does (or rather, does not do) with the caller's stack.

In that light it makes sense for a pass that has discovered a call
satisfies those constraints to mark it with "tail", so that if later
simplifications mean it ends up in tail position, CodeGen knows it's
safe to do the optimization. Otherwise you'd have to do some
potentially fairly sophisticated inter-procedural analysis right when
you were deciding whether to make a tail call; that's definitely
something we'd discourage in LLVM.
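
To make that concrete (a made-up sketch, not from your code): in the C
below, an optimization pass can mark the call "tail" because bar never
touches foo's stack, even though the result initially flows through a
local; once later passes simplify the temporary away, the call is in
tail position and CodeGen can act on the marker.

    int bar(int x);

    int foo(int x) {
        int r = bar(x);   /* temporary likely simplified away later */
        return r;         /* then this is effectively "return bar(x)" */
    }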

Cheers.

Tim.