standalone llvm

Is it possible to get llvm to generate native machine code
without using gcc and friends? Do I use lli?

I'd like to directly create executable code that I can
stick in memory somewhere and jump into (call).

(I'm looking to use llvm in a BSD licensed project).

Simon.

> Is it possible to get llvm to generate native machine code
> without using gcc and friends? Do I use lli?

Use llc; llc --help lists all the options. It compiles LLVM bytecode files.

> Is it possible to get llvm to generate native machine code
> without using gcc and friends? Do I use lli?

LLVM only needs llvm-gcc to translate from C/C++ to LLVM IR. If you already have code in LLVM IR form (e.g. because you're generating it on the fly or you have your own front-end) you don't need llvm-gcc.

> I'd like to directly create executable code that I can
> stick in memory somewhere and jump into (call).

Take a look at the llvm/examples directory. There are several small programs that create LLVM IR on the fly and JIT compile it.

-Chris

It seems this does not yet work on X86. Are there some restrictions I could
adhere to in order to get something to work ?

Simon.

$ llc -filetype=dynlib -f -o=helloworld.so helloworld.bc
llc: target 'X86' does not support generation of this file type!

$ llc -filetype=obj -f -mcpu=i386 -march=x86 -o=helloworld.o helloworld.bc
llc: ELFWriter.cpp:81: virtual void llvm::ELFCodeEmitter::addRelocation(const llvm::MachineRelocation&): Assertion `0 && "relo not handled yet!"' failed.
llc((anonymous namespace)::PrintStackTrace()+0x1a)[0x8ac5736]
llc((anonymous namespace)::SignalHandler(int)+0xed)[0x8ac59cb]
[0xffffe420]
/lib/tls/libc.so.6(abort+0x1d2)[0xb7dc7fa2]
/lib/tls/libc.so.6(__assert_fail+0x10f)[0xb7dc02df]
llc(llvm::ELFCodeEmitter::getConstantPoolEntryAddress(unsigned int)+0x0)[0x88ef650]
llc((anonymous namespace)::Emitter::emitGlobalAddressForPtr(llvm::GlobalValue*, int)+0x77)[0x866ac5d]
llc((anonymous namespace)::Emitter::emitInstruction(llvm::MachineInstr const&)+0xc0b)[0x866c1b1]
llc((anonymous namespace)::Emitter::emitBasicBlock(llvm::MachineBasicBlock const&)+0xac)[0x866a9f8]
llc((anonymous namespace)::Emitter::runOnMachineFunction(llvm::MachineFunction&)+0x115)[0x866a859]
llc(llvm::MachineFunctionPass::runOnFunction(llvm::Function&)+0x29)[0x85f3421]
llc(llvm::FunctionPassManagerT::runPass(llvm::FunctionPass*, llvm::Function*)+0x1f)[0x89e39b5]
llc(llvm::PassManagerT<llvm::FTraits>::runPasses(llvm::Function*, std::map<llvm::Pass*, std::vector<llvm::Pass*, std::allocator<llvm::Pass*> >, std::less<llvm::Pass*>, std::allocator<std::pair<llvm::Pass* const, std::vector<llvm::Pass*, std::allocator<llvm::Pass*> > > > >&)+0x13f)[0x89e52b7]
llc(llvm::PassManagerT<llvm::FTraits>::runOnUnit(llvm::Function*)+0x17f)[0x89e4d11]
llc(llvm::FunctionPassManagerT::runOnFunction(llvm::Function&)+0x25)[0x89e5583]
llc(llvm::FunctionPass::runOnModule(llvm::Module&)+0xa7)[0x8989c63]
llc(llvm::ModulePassManager::runPass(llvm::ModulePass*, llvm::Module*)+0x1f)[0x89e3b95]
llc(llvm::PassManagerT<llvm::MTraits>::runPasses(llvm::Module*, std::map<llvm::Pass*, std::vector<llvm::Pass*, std::allocator<llvm::Pass*> >, std::less<llvm::Pass*>, std::allocator<std::pair<llvm::Pass* const, std::vector<llvm::Pass*, std::allocator<llvm::Pass*> > > > >&)+0x13f)[0x89e5e1b]
llc(llvm::PassManagerT<llvm::MTraits>::runOnUnit(llvm::Module*)+0x17f)[0x89e5875]
llc(llvm::ModulePassManager::runOnModule(llvm::Module&)+0x25)[0x8988cc9]
llc(llvm::PassManager::run(llvm::Module&)+0x23)[0x898906d]
llc(main+0xd8c)[0x84f50b0]
/lib/tls/libc.so.6(__libc_start_main+0xf4)[0xb7db3974]
llc[0x84f4281]
Aborted

> $ llc -filetype=dynlib -f -o=helloworld.so helloworld.bc
> $ llc -filetype=obj -f -mcpu=i386 -march=x86 -o=helloworld.o helloworld.bc

These two filetypes aren't supported yet. You have to produce a .s file, then use a system assembler to produce a .o file:

$ llc -f -o helloworld.s helloworld.bc
$ as -o helloworld.o helloworld.s
$ ls -l helloworld.o

-Chris

I'm trying to take assembly and create machine code I can execute.
How close am I ?

Simon.

int main() {
  Module *M = NULL;

  char *AsmString = "; ModuleID = 'test'\n\
\n\
implementation ; Functions:\n\
\n\
int %add1(int %AnArg) {\n\
EntryBlock:\n\
        %addresult = add int 1, %AnArg ; <int> [#uses=1]\n\
        ret int %addresult\n\
}\n\
";

  M = ParseAssemblyString(AsmString, NULL);

  ExistingModuleProvider* MP = new ExistingModuleProvider(M);
  ExecutionEngine* EE = ExecutionEngine::create(MP, false);

  std::cout << "We just constructed this LLVM module:\n\n" << *M;

  Function *F = M->getNamedFunction("add1");

  assert(F!=NULL);

  int (*add1)(int);

  add1 = (int (*)(int))EE->getPointerToFunction(F);

  std::cout << "Got:" << add1(55) << "\n"; // <----------- Bombs here <---------------

  return 0;
}

Simon Burton <simon@arrowtheory.com> writes:

> I'm trying to take assembly and create machine code I can execute.
> How close am I?

Your test case is not complete. Besides, which version of llvm are you
using? What are the commands for compiling and linking your test case?
How does it bomb?

Do you #include "llvm/ExecutionEngine/JIT.h" ?

Hi Oscar,

I'm using llvm CVS, and it compiles and links OK. Yes, I include JIT.h.
The program segfaults when it gets to calling the function pointer.

From the Makefile:

llvmjit: llvmjit.o
  g++ llvmjit.o /home//users//simonb//lib/LLVMAsmParser.o /home//users//simonb//lib/LLVMInterpreter.o `llvm-config --ldflags` `llvm-config --libs jit` -lpthread -ldl -o llvmjit

llvmjit.o: llvmjit.cpp
  g++ `llvm-config --cxxflags` -c llvmjit.cpp

Complete source (I added a call to verifyModule):

#include "llvm/Module.h"
#include "llvm/Constants.h"
#include "llvm/Type.h"
#include "llvm/Instructions.h"
#include "llvm/ModuleProvider.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/ExecutionEngine/JIT.h"
#include "llvm/ExecutionEngine/GenericValue.h"

#include "llvm/Assembly/Parser.h"
#include "llvm/Analysis/Verifier.h"

#include <iostream>
using namespace llvm;

int main() {
  Module *M = NULL;

  char *AsmString = "; ModuleID = 'test'\n\
\n\
implementation ; Functions:\n\
\n\
int %add1(int %AnArg) {\n\
EntryBlock:\n\
        %addresult = add int 1, %AnArg ; <int> [#uses=1]\n\
        ret int %addresult\n\
}\n\
";

  M = ParseAssemblyString(AsmString, NULL);

  std::cout << "verifyModule: " << verifyModule( *M ) << "\n";

  ExistingModuleProvider* MP = new ExistingModuleProvider(M);
  ExecutionEngine* EE = ExecutionEngine::create(MP, false);

  std::cout << "We just constructed this LLVM module:\n\n" << *M;

  Function *F = M->getNamedFunction("add1");

  assert(F!=NULL);

  int (*add1)(int);

  add1 = (int (*)(int))EE->getPointerToFunction(F);

  std::cout << "Got:" << add1(55) << "\n";

  return 0;
}

Simon,

With a fresh CVS checkout, I've tried your test case on Windows/VC++
and it works OK. Too bad that I don't have access to a Linux machine
right now. I'd like to see what's wrong with your test case.

What do you get from running the test case under gdb and inspecting the
value of add1 just before the function invocation?

There are several possibilities here: either add1 is assigned a NULL
pointer, or LLVM was unable to use the JIT and generates bytecode
instead of native code, or invalid native code was generated
(unlikely).
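
The first of those possibilities can be ruled out in the program itself, before the call. The following is only a sketch and is not from the thread: lookupAdd1 is a hypothetical stand-in for EE->getPointerToFunction(F), add1Native stands in for the JIT-compiled function, and callChecked is an illustrative helper name. A null check says nothing about the other two cases (no native code, or invalid native code); those need a debugger.

```cpp
#include <cassert>
#include <cstdio>

typedef int (*IntFn)(int);

// Hypothetical stand-in for the JIT-compiled function.
static int add1Native(int x) { return x + 1; }

// Hypothetical stand-in for EE->getPointerToFunction(F): returns a raw
// address, or a null pointer when `available` is false.
void *lookupAdd1(bool available) {
  return available ? reinterpret_cast<void *>(&add1Native) : nullptr;
}

// Calls through `raw` only if it is non-null. Returns true and stores the
// result in *out on success; returns false and leaves *out alone otherwise.
bool callChecked(void *raw, int arg, int *out) {
  assert(out != nullptr);
  if (raw == nullptr) {
    std::fprintf(stderr, "lookup returned a null pointer; not calling\n");
    return false;
  }
  IntFn fn = reinterpret_cast<IntFn>(raw);
  *out = fn(arg);
  return true;
}
```

In the test case, the raw pointer would come from getPointerToFunction and the argument would be 55; a false return would point at the lookup rather than at the generated code.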

What happens when you execute your function the same way the Fibonacci
example does? (See examples/Fibonacci/fibonacci.cpp line 112).

> What do you get from running the test case under gdb and inspecting the
> value of add1 just before the function invocation?
>
> There are several possibilities here: either add1 is assigned a NULL
> pointer, or LLVM was unable to use the JIT and generates bytecode
> instead of native code, or invalid native code was generated
> (unlikely).

Well, it's not NULL:

(gdb) print add1
$1 = (int (*)(int)) 0x83e43b8
(gdb) print ((char*)add1)
$2 = 0x83e43b8 "h\2245\b\002"

> What happens when you execute your function the same way the Fibonacci
> example does? (See examples/Fibonacci/fibonacci.cpp line 112).

That works OK.

Simon.

Simon Burton <simon@arrowtheory.com> writes:

> Well, it's not NULL:
>
> (gdb) print add1
> $1 = (int (*)(int)) 0x83e43b8
> (gdb) print ((char*)add1)
> $2 = 0x83e43b8 "h\2245\b\002"

Disassembling that address would reveal if llvm really created binary
code for your function.
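
Inside gdb, x/8i add1 disassembles the first few instructions at that address. When comparing against a disassembly, a hex view of the bytes is easier to read than the char* print, which shows them as a C string and stops at the first zero byte. The sketch below is not from the thread: hexBytes is a hypothetical helper in standard C++ that formats a byte range as hex. Fed the bytes gdb printed above (h = 0x68, \224 = 0x94, 5 = 0x35, \b = 0x08, \002 = 0x02), it returns "68 94 35 08 02".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format the first `count` bytes starting at `addr` as space-separated,
// two-digit lowercase hex values. (Illustrative helper, not an LLVM API.)
std::string hexBytes(const void *addr, std::size_t count) {
  assert(addr != nullptr || count == 0);
  const unsigned char *p = static_cast<const unsigned char *>(addr);
  std::string out;
  char buf[4];  // two hex digits plus the terminating NUL fit easily
  for (std::size_t i = 0; i < count; ++i) {
    std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(p[i]));
    if (i != 0) out += ' ';
    out += buf;
  }
  return out;
}
```

In the test program, something like hexBytes(reinterpret_cast<const void *>(add1), 8), printed just before the call, would show the start of whatever getPointerToFunction returned.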

>> What happens when you execute your function the same way the Fibonacci
>> example does? (See examples/Fibonacci/fibonacci.cpp line 112).
>
> That works OK.

This indicates that the JIT is not working and your code is being
interpreted. I'm not familiar with the "llvm-config --libs jit" you are
using, but I would try adding LLVMJIT.o to your link command the same way
you do with LLVMInterpreter.o.

Oh, and try renaming your llvmjit.cpp to something else, just in case. (Is
the linker case-sensitive?)

Actually, if I _remove_ those other library link arguments and just leave the
ones provided by "llvm-config --libs", it works fine (yay!). Except it's a huge
executable: 168 MB. And the linker warns about some symbols:
/usr/bin/ld: `.gnu.linkonce.t._ZNK4llvm14TargetLowering12getValueTypeEPKNS_4TypeE' referenced in section `.rodata' of /home//users//simonb//lib/LLVMPowerPC.o: defined in discarded section `.gnu.linkonce.t._ZNK4llvm14TargetLowering12getValueTypeEPKNS_4TypeE' of /home//users//simonb//lib/LLVMPowerPC.o

etc.

If instead I use "llvm-config --libs jit asmparser" it compiles but segfaults in the usual
place. Here are the libs it uses in this case:
$ llvm-config --libnames jit asmparser
LLVMAsmParser.o LLVMJIT.o LLVMExecutionEngine.o LLVMCodeGen.o LLVMSelectionDAG.o libLLVMAnalysis.a libLLVMTarget.a libLLVMTransformUtils.a libLLVMipa.a libLLVMAnalysis.a libLLVMTarget.a libLLVMTransformUtils.a libLLVMipa.a LLVMCore.o libLLVMSupport.a libLLVMSystem.a LLVMbzip2.o

Somewhat better is when I use "--libs engine asmparser". It works, and generates
an exe around 77 MB. And no complaints when linking.

Thanks for your help, Oscar.

Simon.

Note that if you're interested in reducing library size, the easiest thing to do is to use a release build instead of a debug build (or just strip your debug libraries). Most of the space is consumed by debug info.

-Chris