fastcc, tail calls, and gcc

Two related questions.

This is with LLVM 2.4 doing a JIT compile to x86-64. (I generate LLVM
IR using an IRBuilder instance, compile/optimize, and then call
getPointerToFunction() to get a "native" function pointer.)

(1) My reading of various mailing list messages seems to indicate
that a function marked as using the "fastcc" calling convention
("CallingConv::Fast") cannot be called directly from GCC-generated
code (n.b. -- standalone gcc, not llvm-gcc) because the fastcc calling
convention is, in general, incompatible with GCC (which I assume uses
the "CallingConv::C" calling convention).

Correct? If not, how do I call an LLVM JIT-generated fastcc function
from a function statically compiled by GCC?

(2) Why does the x86-64 JIT backend generate a "ret $0x8" instruction
to return from a fastcc function that is (a) marked as fastcc
(CallingConv::Fast); but (b) takes no arguments and returns 'void'?

The function type is this:
  std::vector<const Type*> args; /* empty */
  FunctionType *ft = FunctionType::get(Type::VoidTy, args, false);

The fastcc generated code ends with this:
      c20800 ret $0x8

However, if I instead mark the very same function to use the usual
CallingConv::C calling convention, then the generated code ends
with this:
      c3 ret

I assume the "ret 0x8" is meant to be the "callee pops args" portion
of the fastcc convention, but in this case the function has no
arguments (nor a return value), so why should 8 bytes be popped from
the stack on return?

Thanks for any help.

-- Jeff Kuskin

Jeff Kuskin wrote:

> Correct? If not, how do I call an LLVM JIT-generated fastcc function
> from a function statically compiled by GCC?

Well, you can always generate a little wrapper function with C calling
convention which just calls the fastcc function.

You can do a quick bit of assembly code to make sure that the arguments are in the right registers for the call.

-eric

> Two related questions.
>
> (2) Why does the x86-64 JIT backend generate a "ret $0x8" instruction
> to return from a fastcc function that is (a) marked as fastcc
> (CallingConv::Fast); but (b) takes no arguments and returns 'void'?
>
> fastcc generated code ends with this:
>
>      c20800 ret $0x8
>
> I assume the "ret 0x8" is meant to be the "callee pops args" portion
> of the fastcc convention, but in this case the function has no
> arguments (nor a return value), so why should 8 bytes be popped from
> the stack on return?

If I remember correctly, this has to do with stack alignment and tail
calls. Note that to support tail calls between functions that have an
arbitrary number of arguments, the stack pointer of the caller of the
tail-calling function is modified.
E.g., if foo(i64) tail calls bar(), the stack pointer of foo's caller
would be adjusted by 8 bytes, which could result in a misaligned stack
(assuming a platform alignment of 16) on entry to the function bar.
Hence, when tailcallopt is enabled, the size occupied by arguments is
rounded up so that such a misalignment can't happen.
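
To make the rounding concrete, here is a minimal sketch in C of the
arithmetic described above. It assumes the rule is "argument bytes plus
the return-address slot must add up to a multiple of the stack
alignment". That rule and the helper name padded_arg_bytes are
illustrative assumptions, not LLVM's actual code.

```c
/* Hypothetical helper (not an LLVM API): pad the argument area so that
 * the argument bytes plus the return-address slot pushed by "call"
 * together form a multiple of the stack alignment.
 *
 *   arg_bytes   - bytes of stack actually used by arguments
 *   stack_align - required stack alignment (16 on x86-64)
 *   slot_size   - size of the return address (8 on x86-64, 4 on x86-32)
 *
 * Returns the number of bytes the callee would pop with "ret $N".
 */
unsigned padded_arg_bytes(unsigned arg_bytes, unsigned stack_align,
                          unsigned slot_size)
{
    /* Arguments plus the return address. */
    unsigned total = arg_bytes + slot_size;

    /* Round that total up to the next multiple of the alignment. */
    unsigned rounded = (total + stack_align - 1) / stack_align * stack_align;

    /* The return address itself is popped by "ret", so exclude it. */
    return rounded - slot_size;
}
```

Under that assumption, padded_arg_bytes(0, 16, 8) is 8: with no
arguments at all, the 8-byte return address alone leaves the stack
8 bytes away from a 16-byte boundary, so 8 bytes of padding are
reserved. That would be consistent with the "ret $0x8" seen for a
function that takes no arguments.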

> Two related questions.
>
> This is with LLVM 2.4 doing a JIT compile to x86-64. (I generate LLVM
> IR using an IRBuilder instance, compile/optimize, and then call
> getPointerToFunction() to get a "native" function pointer.)
>
> (1) My reading of various mailing list messages seems to indicate
> that a function marked as using the "fastcc" calling convention
> ("CallingConv::Fast") cannot be called directly from GCC-generated
> code (n.b. -- standalone gcc, not llvm-gcc) because the fastcc calling
> convention is, in general, incompatible with GCC (which I assume uses
> the "CallingConv::C" calling convention).
>
> Correct?

Yes.

> If not, how do I call an LLVM JIT-generated fastcc function
> from a function statically compiled by GCC?

You also JIT compile a shim function that is exposed with the C calling
convention but contains a fastcc call to your internal function. Note that
you may also need to rejig argument passing with things like sret.

Albert Graef wrote:

> Jeff Kuskin wrote:
>
> > Correct? If not, how do I call an LLVM JIT-generated fastcc function
> > from a function statically compiled by GCC?
>
> Well, you can always generate a little wrapper function with C calling
> convention which just calls the fastcc function.

I use the fastcall convention all the time.
LLVM-jitted code calling GCC-compiled code and vice versa.

This works for x86 (32 bit):

void* llvm_jit_compile();

typedef __attribute__((fastcall)) int (*func_ptr)(int p1, int p2);

int g(void) {
     func_ptr f = (func_ptr)llvm_jit_compile();
     return f(1, 2);
}

Mark

> I use the fastcall convention all the time.
> LLVM-jitted code calling GCC-compiled code and vice versa.

fastcall != fastcc. There are more or less definite rules for the
fastcall CC. fastcc, by contrast, has no such rules. The only rule is
"as fast as possible". This means that such functions cannot be exposed
'to the public'. The CC details for such functions can change at any
time (as they already have 3 or 4 times).

I am trying this now. Thanks to all for the suggestions.

-- Jeff

From: Arnold Schwaighofer <arnold.schwaighofer@gmail.com>
Subject: Re: [LLVMdev] fastcc, tail calls, and gcc
To: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Date: Thursday, February 12, 2009, 6:56 PM
> Two related questions.
>
> (2) Why does the x86-64 JIT backend generate a "ret $0x8" instruction
> to return from a fastcc function that is (a) marked as fastcc
> (CallingConv::Fast); but (b) takes no arguments and returns 'void'?
>
> fastcc generated code ends with this:
>
>      c20800 ret $0x8
>
> I assume the "ret 0x8" is meant to be the "callee pops args" portion
> of the fastcc convention, but in this case the function has no
> arguments (nor a return value), so why should 8 bytes be popped from
> the stack on return?

If I remember correctly, this has to do with stack alignment and tail
calls. Note that to support tail calls between functions that have an
arbitrary number of arguments, the stack pointer of the caller of the
tail-calling function is modified.
E.g., if foo(i64) tail calls bar(), the stack pointer of foo's caller
would be adjusted by 8 bytes, which could result in a misaligned stack
(assuming a platform alignment of 16) on entry to the function bar.
Hence, when tailcallopt is enabled, the size occupied by arguments is
rounded up so that such a misalignment can't happen.

Hmmm. I think I understand, but I don't see how the "ret 8" is
correct for a function that has no arguments. In this case,
it seems to me that the "size occupied by the arguments" is zero,
and should remain zero even after rounding up. Perhaps I
misunderstand. Let me try to generate an actual testcase
with real C and x86 asm code.

Thanks.

-- Jeff