JIT pass runtime struct on to subroutines


For a research project at my university, I'm working on incorporating JIT compilation into Prolog, whose implementation is basically an interpreted virtual machine.
The VM uses logical units of functionality called 'predicates' which are composed of bytecode that represents the machine instructions the VM supports.
At execution time, the VM loops indefinitely over these bytecodes and, using a giant switch statement, executes the functionality for each instruction.

So far, this is all basic VM stuff. I wanted to speed this up by using JIT-compilation on some often-used predicates.
The way I went about this was to first change the code so it uses subroutines.
This means I separated the code for each machine instruction and turned it into a function.
These functions take a pointer to a struct representing the active machine as a parameter and any number of additional parameters that represent the arguments of that particular machine instruction.
The runtime execution then goes as follows: infinitely get the bytecode and related arguments, then execute the appropriate function with the current machine and the fetched arguments as parameters.
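
As a rough illustration of that loop, here is a minimal C++ sketch. The Machine struct, the three opcodes, and the instr_* function names are all made up for the example and are not the real VM's; the only point is the shape: one function per instruction, with the machine pointer as the first parameter and the fetched operands after it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical machine state; the real VM struct is not shown in this post.
struct Machine {
    std::int64_t acc = 0;  // toy accumulator register
    bool halted = false;   // set by the halt instruction
};

// Hypothetical opcodes for a toy instruction set.
enum Opcode : std::int64_t { OP_LOAD = 0, OP_ADD = 1, OP_HALT = 2 };

// One function per machine instruction: machine pointer first, operands after.
void instr_load(Machine *m, std::int64_t value) { m->acc = value; }
void instr_add(Machine *m, std::int64_t value) { m->acc += value; }
void instr_halt(Machine *m) { m->halted = true; }

// Fetch-decode-dispatch loop: read an opcode and its operands from the
// bytecode stream, then call the matching instruction function.
void run(Machine *m, const std::vector<std::int64_t> &code) {
    std::size_t pc = 0;
    while (!m->halted && pc < code.size()) {
        switch (code[pc]) {
        case OP_LOAD: instr_load(m, code.at(pc + 1)); pc += 2; break;
        case OP_ADD:  instr_add(m, code.at(pc + 1));  pc += 2; break;
        case OP_HALT: instr_halt(m);                  pc += 1; break;
        default:      assert(false && "unknown opcode"); return;
        }
    }
}
```

Calling run(&machine, code) with the bytecode {OP_LOAD, 40, OP_ADD, 2, OP_HALT} would execute the three instruction functions in order against the same machine struct.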

The next step I had in mind would be to construct LLVM-functions at runtime when I wanted to JIT, and then add these (instruction-)function calls to the LLVM function.
For this to be possible, I turned all the machine-instruction functions into a Module by compiling to LLVM IR code.
At runtime I'd create a new llvm::Function for the predicate I wanted to JIT-compile and then loop over every instruction associated with that predicate, adding their respective calls to the predicate-function.

However, I'm having trouble figuring out how to do certain things.
Below is some code showing what I've got so far. Some parts are pseudo-code, but those are not the parts I'm having problems with.

Hi Adriaan,

if I understand correctly what you want to do, then you might be using llvm::Argument in the wrong way.
In order to build a CallInst you should not refer in any way to the arguments (in terms of llvm::Argument) of the Callee (in your case the machine_instr).
The arguments of a function that you get via arg_begin() and arg_end() are used from WITHIN the function.
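
To make that distinction concrete, here is a hand-written C++ equivalent of what I assume the generated predicate function should do. The Machine struct, the two instr_* functions, and predicate_foo are hypothetical stand-ins, not your real code. The thing to notice is that the machine pointer handed to each callee is the predicate function's own parameter, so in IR terms the first call operand would come from the arguments of the function being built, not from those of machine_instr.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical machine state, standing in for the real VM struct.
struct Machine {
    std::vector<std::string> trace;  // records which instructions ran
};

// Hypothetical instruction functions (the callees of the predicate function).
void instr_put(Machine *m, int reg) {
    m->trace.push_back("put " + std::to_string(reg));
}
void instr_proceed(Machine *m) {
    m->trace.push_back("proceed");
}

// Hand-written equivalent of a JIT-generated predicate function.
// `machine` is this function's own parameter; it is simply forwarded
// as the first argument of every instruction call.
void predicate_foo(Machine *machine) {
    instr_put(machine, 3);
    instr_proceed(machine);
}
```

The callees' own formal parameters (m and reg above) never appear at the call sites; only values that are available inside predicate_foo do.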

So in your case, you would replace parts of your code as follows:

// create the function for the predicate. This should take one parameter: the machine-struct that will be provided at runtime when the function is called
llvm::Function *llvm_function = llvm::cast<llvm::Function>(
      llvm_module->getOrInsertFunction(*name, llvm::Type::getVoidTy(&llvm_context), llvm::PointerType::getUnqual(llvm::StructType::create(&llvm_context)), (llvm::Type *)0));
// prepare for the construction of this function
llvm::BasicBlock *llvm_basic_block = llvm::BasicBlock::Create(&llvm_context, "EntryBlock", llvm_function);
llvm::IRBuilder<> llvm_builder(llvm_basic_block);

// pseudo: I'll loop over all the instructions in a predicate. This sets instruction_function_name to the appropriate value in each iteration and makes sure I can iterate over the instruction's parameters
foreach (instruction in predicate) {
     // Fetch the instruction from the already initialised execution engine. I can do this because I compiled all instruction-functions to an LLVM Module and loaded it.
     llvm::Function *machine_instr = llvm_execution_engine->FindFunctionNamed(instruction_function_name);
     // Prepare for setting the parameters of a call instruction
     // (original line) llvm::Argument *llvm_arg = machine_instr->arg_begin();
     // Replacement: collect the operands of the call as llvm::Value* instead.
     llvm::SmallVector<llvm::Value*, 4> llvm_arg;

     // Add a pointer to the machine-struct as a first parameter to every instruction-function call.
     // This should be the same pointer that is passed at runtime to the predicate-function (llvm_function)
     llvm_arg++ = ??;


     // pseudo: for any additional parameters of the machine instruction
     foreach (param in instruction_parameters) {
         // add any additional parameters of the machine instruction to the function call
         // (original line) llvm_arg++ = param;
         // Replacement: append the operand to the vector.
         llvm_arg.push_back(param); // Whatever param here is...