LLVM2.2 x64 JIT trouble on VStudio build

Hola LLVMers,

I’m debugging some strangeness that I’m seeing on x64 Windows with LLVM 2.2. I had to change the code so that it would engage the x64 target machine on Windows builds, but I’ve otherwise left LLVM 2.2 alone. The basic idea is that I’ve got a function bar which is compiled by Visual Studio, and I’m creating another function foo via the LLVM JIT which calls into bar. This has been working for me for a long time on Win32 and also under Xcode. I’ve included the code that reproduces the situation at the bottom. Some questions (which may be really brain dead) are:

  1. Why isn’t the stack getting set up in foo prior to the call down into bar?

  2. Why is the call to bar a pointer to a jump? I.e., why didn’t it resolve the address in foo?

  3. What are some good places for me to look in order to drill down further on what’s happening? I’ve tried switching calling conventions and have watched it create machine instructions for adjusting the stack up and down, but they seem to have been removed by the time the code actually executes.

Any suggestions would be appreciated.

Thanks,

Chuck.

Call into function (foo)

0000000000980030 mov rax,140001591h
000000000098003A call rax    <- this is calling bar via a jump table
000000000098003C ret

Leads to

0000000140001591 jmp bar (1400064E0h)

Leads to

void bar(int i)
{
00000001400064E0 mov dword ptr [rsp+8],ecx
00000001400064E4 push rdi
00000001400064E5 sub rsp,20h
00000001400064E9 mov rdi,rsp
00000001400064EC mov rcx,8
00000001400064F6 mov eax,0CCCCCCCCh
00000001400064FB rep stos dword ptr [rdi]
00000001400064FD mov ecx,dword ptr [rsp+30h]
printf("the int is %i\n",i);
0000000140006501 mov edx,dword ptr [i]
0000000140006505 lea rcx,[string "the int is %i\n" (140C1A240h)]
000000014000650C call qword ptr [__imp_printf (141145920h)]
}
0000000140006512 add rsp,20h
0000000140006516 pop rdi
0000000140006517 ret

At this point, we seem to be jumping back up but the stack is no longer in order, so

000000000098003C ret

Takes us into wonderland

0000000100000003 ???

But unfortunately not through the looking glass.
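
One possible reading of the trace above, offered as a sketch rather than a confirmed diagnosis: under the Microsoft x64 calling convention the caller reserves 32 bytes of home (shadow) space above the return address, and bar's first instruction, `mov dword ptr [rsp+8],ecx`, spills its argument into that space. If foo reserved nothing, then `[rsp+8]` on entry to bar is the slot holding foo's own return address, and the low 32 bits of that slot are replaced by the argument 3. The helper below models only that store. The name `spillArgIntoSlot` and the starting return address `0x0000000140001234` are hypothetical; the post does not show the real value, so the upper half was chosen to match the observed `0000000100000003`.

```cpp
#include <cstdint>

// Models `mov dword ptr [rsp+8], ecx` on a little-endian machine:
// only the low 4 bytes of the 8-byte stack slot are overwritten.
// `slot` is whatever 64-bit value lives at [rsp+8] on entry to the callee.
constexpr std::uint64_t spillArgIntoSlot(std::uint64_t slot, std::uint32_t ecx) {
    return (slot & 0xFFFFFFFF00000000ULL) | ecx;
}

// Hypothetical return address of foo's caller (an assumption; not in the post).
constexpr std::uint64_t kAssumedReturnAddress = 0x0000000140001234ULL;

// What foo's `ret` would pop after bar(3) has spilled its argument.
constexpr std::uint64_t kClobbered = spillArgIntoSlot(kAssumedReturnAddress, 3);
```

With these assumed inputs `kClobbered` is `0x0000000100000003`, the same value the debugger shows after the final `ret`.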

Here’s the modification of the Fibonacci program which got me the above:

#include "llvm/Module.h"
#include "llvm/DerivedTypes.h"
#include "llvm/Constants.h"
#include "llvm/Instructions.h"
#include "llvm/ModuleProvider.h"
#include "llvm/Analysis/Verifier.h"
#include "llvm/ExecutionEngine/JIT.h"
#include "llvm/ExecutionEngine/Interpreter.h"
#include "llvm/ExecutionEngine/GenericValue.h"
#include "llvm/System/DynamicLibrary.h"
#include "llvm/CallingConv.h"
#include <iostream>
#include <vector>
#include <stdio.h>

using namespace llvm;

// Native function, compiled by Visual Studio, that the JITed code calls.
void bar(int i)
{
    printf("the int is %i\n", i);
}

// Declare "void bar(i32)" in the module so that foo can call it.
Function* createBarFunction(Module* M)
{
    Function* pBarF = cast<Function>(M->getOrInsertFunction("bar", Type::VoidTy, Type::Int32Ty, NULL));
    return pBarF;
}

// Build "void foo(i32 i)", whose body just calls bar(i) and returns.
Function* createFooFunction(Module* M)
{
    Function* pBarF = createBarFunction(M),
            * pFooF;

    pFooF = cast<Function>(M->getOrInsertFunction("foo", Type::VoidTy, Type::Int32Ty, NULL));

    BasicBlock* pBody = new BasicBlock("body", pFooF);

    Argument* pArg = pFooF->arg_begin();
    pArg->setName("i");

    std::vector<Value*> barArgs;
    barArgs.push_back(pArg);

    new CallInst(pBarF, barArgs.begin(), barArgs.end(), "", pBody);
    new ReturnInst(NULL, pBody);

    return pFooF;
}

int main(int argc, char **argv) {
    // Create a module to put our functions into.
    Module *M = new Module("test");
    M->setDataLayout("e-p:64:64:64-i1:8:8:8-i8:8:8:8-i32:32:32:32-f32:32:32:32");

    Function* pFooF = createFooFunction(M);
    M->print(std::cout);

    // Now we are going to create the JIT.
    ExistingModuleProvider *MP = new ExistingModuleProvider(M);
    ExecutionEngine *EE = ExecutionEngine::create(MP, false);

    // Make the native bar visible to the JIT's symbol lookup.
    sys::DynamicLibrary::AddSymbol("bar", (void*) bar);

    // Ask the JIT for a pointer to every function in the module.
    llvm::Module::FunctionListType& funcList = MP->getModule()->getFunctionList();
    for (llvm::Module::FunctionListType::iterator i = funcList.begin(); i != funcList.end(); ++i)
    {
        EE->getPointerToFunction(i);
    }

    EE->recompileAndRelinkFunction(pFooF);

    // Call foo(3) through the JIT.
    std::vector<GenericValue> Args(1);
    Args[0].IntVal = APInt(32, 3);
    GenericValue GV = EE->runFunction(pFooF, Args);

    return 0;
}

Hi Chuck,

It’s hard to tell what’s wrong without a way to reproduce it, since it’s on Windows. Can you dump out the IR at various places to help debug this? You can start by dumping out the machine instructions and then work backwards if necessary.

Evan

  1. Why isn’t the stack getting set up in foo prior to the call down into bar?

What is the triplet of the target? x86_64-win32?

  2. Why is the call to bar a pointer to a jump? I.e., why didn’t it resolve the address in foo?

Not sure. I can’t reproduce this. Can you step through the code in X86ISelLowering.cpp::LowerCALL()? Around

// If the callee is a GlobalAddress node (quite common, every direct call is)
// turn it into a TargetGlobalAddress node so that legalize doesn’t hack it.

Evan

Hey Evan,

At the point of the instructions you suggested I step through, X86ISelLowering has this state:

this                      0x00000000005fe728 {VarArgsFrameIndex=-842150451 RegSaveFrameIndex=-842150451 VarArgsGPOffset=3452816845 …}   llvm::X86TargetLowering * const
  llvm::TargetLowering    {TM={…} TD=0x00000000008edac0 IsLittleEndian=true …}   llvm::TargetLowering
  VarArgsFrameIndex       -842150451    int
  RegSaveFrameIndex       -842150451    int
  VarArgsGPOffset         3452816845    unsigned int
  VarArgsFPOffset         3452816845    unsigned int
  BytesToPopOnReturn      0             int
  BytesCallerReserves     0             int
  Subtarget               0x00000000008eda90 {AsmFlavor=Intel PICStyle=None X86SSELevel=SSE2 …}   const llvm::X86Subtarget *
    llvm::TargetSubtarget   {…}           llvm::TargetSubtarget
    AsmFlavor               Intel         llvm::X86Subtarget::AsmWriterFlavorTy
    PICStyle                None          llvm::PICStyle::Style
    X86SSELevel             SSE2          llvm::X86Subtarget::X86SSEEnum
    X863DNowLevel           -842150451    llvm::X86Subtarget::X863DNowEnum
    HasX86_64               true          bool
    DarwinVers              0             unsigned char
    stackAlignment          8             unsigned int
    MaxInlineSizeThreshold  128           unsigned int
    Is64Bit                 true          bool
    HasLow4GUserAddress     true          bool
    TargetType              isWindows     llvm::X86Subtarget::

if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) {
  // We should use extra load for direct calls to dllimported functions in
  // non-JIT mode.
  // it gets into here
  if ((IsTailCall || !Is64Bit ||                                // both of these are false
       getTargetMachine().getCodeModel() != CodeModel::Large)   // this is false
      && !Subtarget->GVRequiresExtraLoad(G->getGlobal(),        // this is short-circuited away
                                         getTargetMachine(), true))
    Callee = DAG.getTargetGlobalAddress(G->getGlobal(), getPointerTy()); // this is passed over because the test is false

// since it made it through the if (Global…., it skips down to

// Returns a chain & a flag for retval copy to use.
SDVTList NodeTys = DAG.getVTList(MVT::Other, MVT::Flag);

Thanks!

Chuck.

Ah, ok. The x86-64 JIT assumes the large code model, i.e. it cannot assume that the displacement to a global fits in the 32-bit field of a direct call, so it uses an indirect call. I think that’s fine.
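
To illustrate the point about the 32-bit call field with the addresses from the disassembly earlier in the thread, here is a minimal sketch. It assumes the usual 5-byte `call rel32` / `jmp rel32` encoding, where the displacement is measured from the end of the instruction; the helper name `fitsInRel32` is made up for this example and is not an LLVM function.

```cpp
#include <cstdint>

// True when a 5-byte `call rel32` (or `jmp rel32`) placed at `site` can
// reach `target`, i.e. when the displacement from the end of the
// instruction fits in a signed 32-bit field.
constexpr bool fitsInRel32(std::uint64_t site, std::uint64_t target) {
    return static_cast<std::int64_t>(target - (site + 5)) >= INT32_MIN &&
           static_cast<std::int64_t>(target - (site + 5)) <= INT32_MAX;
}

// JITed call site in foo -> jump-table entry in the executable image.
constexpr bool kJitToImage = fitsInRel32(0x98003AULL, 0x140001591ULL);

// Jump-table entry -> bar, both inside the executable image.
constexpr bool kImageToImage = fitsInRel32(0x140001591ULL, 0x1400064E0ULL);
```

Here `kJitToImage` is false, because the JIT buffer near 0x980000 is more than 2 GB away from the image near 0x140000000, while `kImageToImage` is true, which is consistent with the linker's `jmp bar` being a direct jump and the JITed call going through `rax`.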

I think the problem you are running into has to do with the JIT function stubs. In lazy compilation mode (which is the default), functions are compiled on demand when they are first called, so a function call is actually emitted as a call to a target-specific function stub. See X86CompilationCallback() in X86JITInfo.cpp. These are the functions that save registers and then call into the JIT to lazily compile the call destination. I am not sure whether this has been tested on x86-64 Windows. Anton, do you know?
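
For readers unfamiliar with lazy stubs, the idea can be modeled in portable C++. This is a conceptual sketch only: the real stubs are small pieces of machine code that route through X86CompilationCallback(), and every name below (`barSlot`, `lazyStub`, `realBar`, `compileCount`, `lastArg`) is hypothetical and chosen for illustration.

```cpp
// Conceptual model of a lazy-compilation stub; not LLVM's implementation.
typedef void (*Fn)(int);

int compileCount = 0;   // how many times the pretend JIT has run
int lastArg = 0;        // last argument seen by the real callee

void lazyStub(int i);   // forward declaration so the slot can point at it

// Call sites go through this slot instead of calling the callee directly.
Fn barSlot = lazyStub;

// Stands in for the compiled body of the callee.
void realBar(int i) { lastArg = i; }

// The first call lands here: "compile" the callee, patch the slot so later
// calls bypass the stub, then forward the original call.
void lazyStub(int i) {
    ++compileCount;
    barSlot = realBar;
    barSlot(i);
}

// Stands in for a JIT-compiled caller.
void foo(int i) { barSlot(i); }
```

Calling `foo` twice runs the stub only once; the second call goes straight to `realBar`. In the real JIT the stub also has to save and restore argument registers and keep the stack laid out the way the platform's calling convention expects, which is the part that may not have been exercised on x86-64 Windows.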

Meanwhile, can you try to disable lazy compilation? See ExecutionEngine.h::DisableLazyCompilation().

Evan