Invalid call generated on 64-bit Linux when calling native C function from IR

Hi,

When I generate LLVM IR that calls back into native C functions, the JIT compiler generates invalid code on 64-bit Linux. The same code works fine on 32-bit Linux, 32-bit OS X and 64-bit OS X. A reproduction case is attached to this mail. It is a simple modification of the “How to use jit” example that adds a call to a native function.

I am currently using EE->addGlobalMapping to register the native functions. This appears to be necessary because the native functions will be part of a .so/.dylib, so the JIT will not find them using dlsym on Linux, and we would prefer to strip all symbols.

I’m using LLVM 2.4, which I compiled using EXTRA_OPTIONS="-m64/-m32 -fPIC".

Below you can see the code generated by the JIT on different platforms. On 64-bit Linux the ‘call’ instruction uses a completely wrong address. This invalid address appears to be related to the address of the call instruction itself (i.e. it is always a ‘close’ address). To my knowledge the call instruction’s operand is a signed 32-bit displacement relative to the following instruction, so the target needs to be within 2^31 bytes (2 GB) of the call. It looks like the code generator should emit an indirect call through a function pointer in this situation and fails to do so.
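To make the "call through a function pointer" idea concrete, here is a minimal C++ sketch. It is not the attached repro: `addone` is a hypothetical stand-in whose body is assumed from its name, and `call_via_absolute_address` is an invented helper. On x86-64 a compiler would typically lower such a call to an indirect call through a register holding the full 64-bit address, so the distance between caller and callee no longer matters.

```cpp
#include <cstdint>

// Hypothetical stand-in for the native callback; the real body lives in the
// attached llvmtest.cpp and is assumed here to add one to its argument.
int addone(int x) { return x + 1; }

// Invented helper: calls a function known only by its absolute address.
// Because the full pointer-sized address is used, no 32-bit relative
// displacement is involved.
int call_via_absolute_address(std::uintptr_t addr, int arg) {
    auto fn = reinterpret_cast<int (*)(int)>(addr);
    return fn(arg);
}
```

Usage would look like `call_via_absolute_address(reinterpret_cast<std::uintptr_t>(&addone), 20)`, mirroring the `mov $0x14,%edi` argument in the listings.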

Am I doing something wrong in my code or is this an LLVM bug?

Jan

Linux 64-bit:
(gdb) print addone_addr
$1 = (void *) 0x406018
(gdb) x/10i foo_addr
0x2b7184072030: sub $0x8,%rsp
0x2b7184072034: mov $0x14,%edi
0x2b7184072039: callq 0x2b7200406018 <-- absolutely not ok
0x2b718407203e: add $0x8,%rsp
0x2b7184072042: retq
(gdb) x/10i nfoo_addr
0x40603a <nativefoo>: push %rbp
0x40603b <nativefoo+1>: mov %rsp,%rbp
0x40603e <nativefoo+4>: mov $0x1e,%edi
0x406043 <nativefoo+9>: callq 0x406018 <-- ok
0x406048 <nativefoo+14>: leaveq
0x406049 <nativefoo+15>: retq
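The bad target in the listing above is consistent with a displacement that was truncated to 32 bits. The sketch below redoes that arithmetic; it assumes the call is the 5-byte `call rel32` form (which matches the gap between 0x2b7184072039 and 0x2b718407203e), and the helper names are invented for illustration.

```cpp
#include <cstdint>

// Length of `call rel32` on x86/x86-64: one opcode byte plus a 4-byte displacement.
constexpr std::uint64_t kCallLen = 5;

// True if `target` is reachable from a call at `call_addr` with a signed
// 32-bit displacement measured from the next instruction.
bool fits_in_rel32(std::uint64_t call_addr, std::uint64_t target) {
    const std::int64_t diff =
        static_cast<std::int64_t>(target - (call_addr + kCallLen));
    return diff >= INT32_MIN && diff <= INT32_MAX;
}

// Address the CPU actually calls if the displacement is blindly truncated
// to 32 bits and then sign-extended again by the processor.
std::uint64_t decoded_call_target(std::uint64_t call_addr, std::uint64_t target) {
    const std::uint64_t next = call_addr + kCallLen;
    const std::uint32_t bits = static_cast<std::uint32_t>(target - next);
    const std::int64_t disp = (bits & 0x80000000u)
        ? static_cast<std::int64_t>(bits) - (std::int64_t{1} << 32)
        : static_cast<std::int64_t>(bits);
    return next + static_cast<std::uint64_t>(disp);
}
```

With the values from the listing, `decoded_call_target(0x2b7184072039, 0x406018)` gives 0x2b7200406018, exactly the address gdb shows, and `fits_in_rel32` returns false for that pair. The JIT code and the native function are simply more than 2 GB apart.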

OS X 64-bit:
(gdb) print addone_addr
$1 = (void *) 0x100000d56
(gdb) x/10i foo_addr
0x102080030: sub $0x8,%rsp
0x102080034: mov $0x14,%edi
0x102080039: callq 0x100000d56 <-- ok
0x10208003e: add $0x8,%rsp
0x102080042: retq
(gdb) x/10i nfoo_addr
0x100000d78 <nativefoo>: push %rbp
0x100000d79 <nativefoo+1>: mov %rsp,%rbp
0x100000d7c <nativefoo+4>: mov $0x1e,%edi
0x100000d81 <nativefoo+9>: callq 0x100000d56 <-- ok
0x100000d86 <nativefoo+14>: leaveq
0x100000d87 <nativefoo+15>: retq

Linux 32-bit:
(gdb) print addone_addr
$1 = (void *) 0x805aa74
(gdb) x/10i foo_addr
0xc26020: sub $0x4,%esp
0xc26023: movl $0x14,(%esp)
0xc2602a: call 0x805aa74 <-- ok
0xc2602f: add $0x4,%esp
0xc26032: ret
(gdb) x/10i nfoo_addr
0x805aa8c <nativefoo>: push %ebp
0x805aa8d <nativefoo+1>: mov %esp,%ebp
0x805aa8f <nativefoo+3>: push $0x1e
0x805aa91 <nativefoo+5>: call 0x805aa74 <-- ok
0x805aa96 <nativefoo+10>: add $0x4,%esp
0x805aa99 <nativefoo+13>: leave
0x805aa9a <nativefoo+14>: ret

OS X 32-bit:
(gdb) print addone_addr
$1 = (void *) 0x1e62
(gdb) x/10i foo_addr
0x2080020: sub $0xc,%esp
0x2080023: movl $0x14,(%esp)
0x208002a: call 0x1e62 <-- ok
0x208002f: add $0xc,%esp
0x2080032: ret
(gdb) x/10i nfoo_addr
0x1e80 <nativefoo>: push %ebp
0x1e81 <nativefoo+1>: mov %esp,%ebp
0x1e83 <nativefoo+3>: sub $0x18,%esp
0x1e86 <nativefoo+6>: movl $0x1e,(%esp)
0x1e8d <nativefoo+13>: call 0x1e62 <-- ok
0x1e92 <nativefoo+18>: leave
0x1e93 <nativefoo+19>: ret

llvmtest.cpp (5.6 KB)

makefile (882 Bytes)

Jan Rehders <wurstgebaeck@googlemail.com> writes:

[snip]

I am currently using EE->addGlobalMapping to register the native functions.
This appears to be necessary because the native functions will be part of a
.so/.dylib so the jit will not find them using dlsym on linux and we would
prefer to strip all symbols.

I'm using LLVM 2.4, which I compiled using EXTRA_OPTIONS="-m64/-m32 -fPIC".

This seems related to the following bug, which was fixed in svn the last time I checked:

http://llvm.org/bugs/show_bug.cgi?id=2920

[snip]