LLVM JIT questions

<ccing llvmdev>

I am experimenting with the LLVM JIT as a future code generator for the PyPy JIT. The basics are working, which I am extremely happy with!


Maybe you could answer a few questions? (I'm away until Monday)

* Is there a way to show the generated code?

Yes, pass -print-machineinstrs to LLVM's command line option parser. This produces a dump of the machine IR right before the JIT outputs code. The dump doesn't use assembly syntax, but it maps 1-1 onto it and is easy enough to understand.

I'm assuming you're using the JIT library, not 'lli', so you're not using the command line option parsing natively. As such, something like this should work when your app starts up:

static const char * const Args[] = { "", "-print-machineinstrs", 0 };
cl::ParseCommandLineOptions(2, Args);

This also lets you pass other options to LLVM, which could be useful later.

Alternatively, you can just set llvm::PrintMachineCode to 1 (defined in llvm/Target/TargetOptions.h).
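If you end up passing more than one option, a small helper can build the argv-style array for you. This is only a sketch in plain C++: makeArgv is a hypothetical name, not part of LLVM, and the exact parameter types that cl::ParseCommandLineOptions expects differ between LLVM releases, so check your headers before passing the result in.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): builds a null-terminated,
// argv-style array whose entries point into `options`. The caller must
// keep `options` alive and unmodified while the pointers are in use.
std::vector<const char *> makeArgv(const std::vector<std::string> &options) {
  std::vector<const char *> argv;
  argv.reserve(options.size() + 2);
  argv.push_back("");               // argv[0]: program name, left empty
  for (const std::string &opt : options)
    argv.push_back(opt.c_str());    // one entry per option string
  argv.push_back(nullptr);          // conventional terminator
  return argv;
}
```

The argument count to pass alongside the array is argv.size() - 1, since the terminator is not counted.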

* Is there a way to delete a function and the generated code?
  (have not really looked enough into the documentation to find an answer myself for this, sorry if it's an obvious one)
  [probably will not use this anyway]

Right now you can delete the LLVM function, but not the generated code. This is an often-requested feature that will hopefully be implemented soon.

To do this, use something like this:

JIT->freeMachineCodeForFunction(Fn); // currently a noop
Fn->eraseFromParent(); // Deletes the LLVM IR for Fn.

This will only work if nothing uses Fn, of course :)

* We use ParseAssemblyString() to add code to a module (this is done by the same code that generates the >100Mb pypy.ll file). Now I wish to replace some of that code with another version, after the function has already been run (PyPy might turn out to JIT-generate blocks instead of functions). The easiest (and maybe even somewhat logical) way would be if I could call ParseAssemblyString again, but that throws an exception (Redefinition of function 'add1'!). What is the best way to do what I want?

As you probably know, ParseAssemblyString isn't the most efficient way to do this. That said, if you really want to use it, and if you know what functions you're parsing, you should be able to get away with something like this (pseudo code):

1. Function *F = M->getNamedFunction("add1");
2. F->setName(""); // now it won't conflict
3. ParseAssemblyString(.., M);
4. Function *F2 = M->getNamedFunction("add1");
5. F->replaceAllUsesWith(F2); // Everything using the old one uses the new one
6. F->eraseFromParent(); // also, take it out of the JIT if it was in it.
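To make the order of those steps concrete, here is a toy model in plain C++. It does not use LLVM at all: ToyModule, ToyFunction, define and redefine are hypothetical stand-ins for the module, the function, the parser call and the whole sequence, and a function "body" is reduced to an int.

```cpp
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins only; none of these types are LLVM classes.
struct ToyFunction { std::string name; int body; };

struct ToyModule {
  std::vector<std::unique_ptr<ToyFunction>> functions;
  std::vector<ToyFunction *> callSites;  // every "use" of a function

  ToyFunction *getNamedFunction(const std::string &name) {
    for (auto &f : functions)
      if (f->name == name) return f.get();
    return nullptr;
  }
  // Stand-in for parsing one definition: rejects a redefinition.
  bool define(const std::string &name, int body) {
    if (getNamedFunction(name)) return false;
    functions.push_back(
        std::unique_ptr<ToyFunction>(new ToyFunction{name, body}));
    return true;
  }
  void replaceAllUsesWith(ToyFunction *from, ToyFunction *to) {
    for (ToyFunction *&use : callSites)
      if (use == from) use = to;
  }
  void erase(ToyFunction *f) {
    for (auto it = functions.begin(); it != functions.end(); ++it)
      if (it->get() == f) { functions.erase(it); return; }
  }
};

// Steps 1-6 in order: rename, re-parse, look up, redirect uses, erase.
bool redefine(ToyModule &m, const std::string &name, int newBody) {
  ToyFunction *oldF = m.getNamedFunction(name);        // step 1
  if (!oldF) return m.define(name, newBody);
  oldF->name = "";                                     // step 2
  if (!m.define(name, newBody)) return false;          // step 3
  ToyFunction *newF = m.getNamedFunction(name);        // step 4
  m.replaceAllUsesWith(oldF, newF);                    // step 5
  m.erase(oldF);                                       // step 6
  return true;
}
```

The rename in step 2 is what lets the second definition go through; without it, define fails the same way the parser does.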

The way I was thinking of is to generate into an empty module: call ParseAssemblyString(), use module.getNamedFunction() to copy all the code over to the other module (probably not working at all), and then call recompileAndRelinkFunction() on the old function. So, in effect, making it self-modifying code.

This would certainly work. I'll note that the above stuff won't work if you have things that have already codegen'd a call to add1: as you say, you need to use recompileAndRelinkFunction.

To support this, replace steps 5 and 6 above with something like this:

5. F->deleteBody() // Remove body of F.
6. Splice body of F2 into F, remapping arguments.
7. F2->eraseFromParent();

If you want more details for #6, I can provide them.
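As an aside on why already-codegen'd callers are the problem: a caller that has the callee's address baked in keeps running the old code, while a caller that goes through something patchable picks up the new version. The sketch below shows this with ordinary function pointers rather than the JIT. All names are hypothetical, and newBody deliberately returns a different value so the effect is visible.

```cpp
using Fn = int (*)(int);

int oldBody(int x) { return x + 1; }
int newBody(int x) { return x + 100; }  // deliberately different result

// Models a caller compiled with the callee's address baked in.
struct DirectCaller {
  Fn target;
  int call(int x) const { return target(x); }
};

// Models a caller that reads the address from a patchable slot per call.
struct IndirectCaller {
  const Fn *slot;
  int call(int x) const { return (*slot)(x); }
};
```

For example, with Fn slot = oldBody, a DirectCaller built from slot and an IndirectCaller built from &slot both call oldBody. After slot = newBody, only the IndirectCaller sees the new version. As I understand it, recompileAndRelinkFunction gets a similar effect by redirecting the old entry point to the new code, and keeping the same Function object (steps 5-7) is what makes that possible.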