Direct call in IR is emitted as an indirect call in asm

Hello,

I have a frontend that generates IR, which is then jitted with LLJIT. The frontend provides functions from a library that I want to call from the IR. I have access to these functions through their function pointers and I use them like this, with ptr and ftype being the function pointer and the FunctionType respectively:

// Convert the function pointer into a FunctionCallee
Function *tmp = llvm::Function::Create(
    ftype,
    Function::ExternalLinkage,
    "",
    module.get()
);
Value *i_ptr = builder->getInt64((uintptr_t)ptr);
Value *llvm_ptr = builder->CreateIntToPtr(i_ptr, tmp->getType());
FunctionCallee callee(ftype, llvm_ptr);

// Now make the call
builder->CreateCall(callee, args, "my_call");

For a function returning a pointer and taking an int as argument, the following IR is produced:

%my_call = call ptr inttoptr (i64 139848999881808 to ptr)(i32 42)

Notice that the address of the function is already known when the IR is emitted! However, the produced asm (x86_64) is an indirect call to the function:

// Call to function 1
movabs $0x7ffff138cf97,%rbp
call   *%rbp

// Call to function 2
lea    -0x7e0902(%rbp),%r13
call   *%r13

Is it possible to produce direct calls instead of indirect calls, given that the address of the function is known at compile time? Or maybe I am missing something about function pointers?

I imagine that referencing a function by its name will do symbol resolution and produce a direct call but the names are mangled so I cannot use them.

Any answer will be appreciated,
Thanks.

I think you are trying to make the compiler do the linker’s job. The JIT linker should be able to resolve relocations and link new code against existing code. I think you should understand how RuntimeDyld works. As far as I know, the JIT can resolve symbols from a loaded .so file. So you can load a shared library, then just compile code regularly with direct calls to the library, and the JIT linker will do the rest.

I imagine that referencing a function by its name will do symbol resolution and produce a direct call but the names are mangled so I cannot use them.

This is how it should work in general (omitting patchpoints or speculations). I don’t understand, why can’t you use mangled names?

I don’t understand, why can’t you use mangled names?

AFAIK, I have to reference functions by their complete symbol name. But the names in my library are mangled (it is compiled as C++), so I cannot use the plain “extern C” name to reference a function. And from what I read, there is no easy way to write a function const char* mangle(const char* name) that would return the mangled name given the raw name.
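For what it is worth, the opposite direction is easy, which suggests one possible workaround: list the library’s symbols (for example with nm), demangle each one, and pick the mangled name whose readable signature matches. Below is a minimal sketch of the demangling step, assuming a toolchain that follows the Itanium C++ ABI (g++ or clang++ on Linux). The helper name demangle is made up for this example, and _Z16make_reservationi is simply what make_reservation(int) mangles to under that ABI, not a symbol taken from the actual library.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper (not part of LLVM): turns an Itanium-ABI mangled symbol
// into its readable signature. Returns an empty string if demangling fails.
std::string demangle(const char *mangled) {
    int status = 0;
    // __cxa_demangle allocates the result with malloc, so release it with free.
    std::unique_ptr<char, void (*)(void *)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    if (status != 0 || !text) {
        return std::string();
    }
    return std::string(text.get());
}
```

With this, demangle("_Z16make_reservationi") gives make_reservation(int), so a table from readable signature to mangled name can be built once and used when declaring the function in the IR.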

Mangling aside, I tried to test my assumption that referencing functions by name (make_reservation here) would produce a direct call. Even though the IR looks better, the produced asm is unchanged.

%my_call = call ptr @make_reservation(i32 42)
...
declare ptr @make_reservation(i32)

Should I do something particular with the ObjectLinkingLayer to get a direct call, or is it something I cannot really control?

So you want to create a call to a C++ library function whose name wasn’t mangled by the frontend, right? I can see that some C++ names are hardcoded in LLVM: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Analysis/TargetLibraryInfo.def . I don’t know whether there are better alternatives.

BTW, don’t you have access to the mangled name when retrieving the function’s real address?

IIRC, JIT code is compiled with mcmodel=large by default. A direct call instruction encodes its target as a relative address inside the instruction, which limits how far it can reach. If the relative address is too large, the absolute address has to be put in a register first. You may try setting mcmodel=small, but the right code model really depends on the execution environment.
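To make the range limit concrete: on x86-64 the direct near call (opcode E8) is 5 bytes long and carries a signed 32-bit displacement measured from the address of the next instruction, so it can only reach targets within roughly ±2 GiB. A minimal sketch of that check follows. The function name fits_direct_call is made up for this example, and any call-site address passed to it is hypothetical, since the thread does not say where the jitted code was placed.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true if a 5-byte x86-64 `call rel32` placed at
// call_site can reach target, i.e. if the displacement from the end of the
// instruction fits in a signed 32-bit integer.
bool fits_direct_call(std::uint64_t call_site, std::uint64_t target) {
    const std::uint64_t next_insn = call_site + 5;  // rel32 is relative to the next instruction
    const std::int64_t disp = static_cast<std::int64_t>(target - next_insn);
    return disp >= std::numeric_limits<std::int32_t>::min() &&
           disp <= std::numeric_limits<std::int32_t>::max();
}
```

For instance, with the library function at 0x7ffff138cf97 (the address from the movabs above), a call site at 0x7ffff0000000 would be in range, while a call site at 0x10000000 would not, and the code generator has to fall back to loading the absolute address into a register.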

Function calls within the IR are no longer a problem, since my original problem was with how calls are emitted into machine code (I probably should have posted in the Code Generation category). But for completeness: the library functions are mangled by the g++ frontend, with some exceptions. My language parser then injects the function pointer for IR generation.

It looks like that is what I need, but I am not sure it is feasible. From the LLVM documentation:

In order to keep RuntimeDyld’s implementation simple MCJIT imposed some restrictions on compiled code:

  1. It had to use the Large code model, and often restricted available relocation models in order to limit the kinds of relocations that had to be supported.

I don’t quite understand the internals of the JIT, or the differences between RuntimeDyld, which is used for MCJIT, and JITLink, which is used for ORC. Can you point me to where I can set the mcmodel for ORC with LLJIT?

I hope you’ve already found it. I don’t have much experience with LLJIT. I see there are bindings for it. I think it is possible. It is definitely possible with “plain” ORC.

LLVMTargetMachineRef LLVMCreateTargetMachine(LLVMTargetRef T,
        const char *Triple, const char *CPU, const char *Features,
        LLVMCodeGenOptLevel Level, LLVMRelocMode Reloc,
        LLVMCodeModel CodeModel);

LLVMOrcJITTargetMachineBuilderRef
LLVMOrcJITTargetMachineBuilderCreateFromTargetMachine(LLVMTargetMachineRef TM) {
  auto *TemplateTM = unwrap(TM);
    
  auto JTMB =
      std::make_unique<JITTargetMachineBuilder>(TemplateTM->getTargetTriple());
    
  (*JTMB)
      .setCPU(TemplateTM->getTargetCPU().str())
      .setRelocationModel(TemplateTM->getRelocationModel())
      .setCodeModel(TemplateTM->getCodeModel())
      .setCodeGenOptLevel(TemplateTM->getOptLevel())
      .setFeatures(TemplateTM->getTargetFeatureString())
      .setOptions(TemplateTM->Options);

  LLVMDisposeTargetMachine(TM);

  return wrap(JTMB.release());
}   

Thanks for the code,
I’ll try it as soon as I have time.

I think this is the answer to my question. The library I am linking to is loaded at a completely different place in memory than the jitted code, so a direct call is not possible. This has nothing to do with the way I declare the function in the IR.