Redefining function

Hi everybody.

I’ve just started learning about LLVM and didn’t get too far studying the core.
I couldn’t find the solution to my problem (if it has one) in the mailing list or the source code. The problem is: how can I redefine a function that’s been called already by some other function?

Suppose I have 3 files, all compiled to bytecode through llvm-gcc (I think it could be clang instead).

File1.c:
void do_print() { print(); }

File2.c:
void print() { printf("File2.c\n"); }

File3.c:
void print() { printf("File3.c\n"); }

Then I have the main file, compiled to an executable, like this:
int main() {
  // initialize and get context (not shown)
  Module *file1(LoadFile("file1.bc", Context));
  Module *file2(LoadFile("file2.bc", Context));
  Module *file3(LoadFile("file3.bc", Context));

  Linker::LinkModules(file1, file2, NULL);

  EngineBuilder builder(file1);
  ExecutionEngine *EE = builder.create();

  EE->runStaticConstructorsDestructors(false);

  Function *func = EE->FindFunctionNamed("do_print");
  EE->runFunction(func, std::vector<GenericValue>());

  // swap the definition of the function "print" from the one in File2.c to File3.c
  swap(file1, file2, file3);

  EE->runFunction(func, std::vector<GenericValue>());

  EE->runStaticConstructorsDestructors(true);

  return 0;
}

I can get everything before the swap working (if I comment out the rest, the output is OK). I’ve tried to write the “swap” function many times, but I can’t get it to work.
The expected output is:
File2.c
File3.c

If someone knows how to do this, or knows it’s impossible, I would be very thankful. I don’t even know if this is the best way to do it.

Miranda

Hi Conrado,

> I couldn't find the solution to my problem (if it has one) in the mailing list or the source code. The problem is: how can I redefine a function that's been called already by some other function?

why do you want to do this?

> Suppose I have 3 files, all compiled to bytecode through llvm-gcc (I think it could be clang instead).
>
> File1.c:
> void do_print() { print(); }
>
> File2.c:
> void print() { printf("File2.c\n"); }
>
> File3.c:
> void print() { printf("File3.c\n"); }

The solution in C is to give the version in File2 a weak linkage type,
for example using gcc's "weak" attribute. You then link all three files
together, and the weak print in File2 will be magically replaced with the
non-weak print in File3.

>   //swap the definition of the function "print" from the one in File2.c to File3.c
>   swap (file1, file2, file3);

If all the functions are in the same module, then you can use
FunctionA->replaceAllUsesWith(FunctionB) if they have the same
type.

Ciao,

Duncan.

Hi Duncan,

>> I couldn’t find the solution to my problem (if it has one) in the mailing list or the source code. The problem is: how can I redefine a function that’s been called already by some other function?
>
> why do you want to do this?

To implement something that is common in Lisp. Suppose I have a program that is running and can’t be stopped, or the cost of stopping it is prohibitive. If I find a better way to run an algorithm, I’d like to update the running program without stopping it.

>> Suppose I have 3 files, all compiled to bytecode through llvm-gcc (I think it could be clang instead).
>>
>> File1.c:
>> void do_print() { print(); }
>>
>> File2.c:
>> void print() { printf("File2.c\n"); }
>>
>> File3.c:
>> void print() { printf("File3.c\n"); }
>
> The solution in C is to give the version in File2 a weak linkage type,
> for example using gcc’s “weak” attribute. You then link all three files
> together, and the weak print in File2 will be magically replaced with the
> non-weak print in File3.

I had never heard of it before. After a quick read, it sounds OK. Keeping the rest of the file unchanged, I changed the attribute and this section:

func = EE->FindFunctionNamed("do_print");
EE->runFunction(func, std::vector<GenericValue>());
Linker::LinkModules(main_file, print2, &ErrorMessage);
EE->runFunction(func, std::vector<GenericValue>());

And now I get this error before the second runFunction:

While deleting: void ()* %print
An asserting value handle still pointed to this value!
UNREACHABLE executed at /home/miranda/llvm-2.6/lib/VMCore/Value.cpp:492!

I suppose that’s because “do_print” was already called. By the way, it seems the weak attribute is only supported for ELF and a.out targets, so it may not be the best solution.

>> //swap the definition of the function "print" from the one in File2.c to File3.c
>> swap (file1, file2, file3);
>
> If all the functions are in the same module, then you can use
> FunctionA->replaceAllUsesWith(FunctionB) if they have the same
> type.

Sorry, I hadn’t seen that function before. But when I tried it (pastebin code: http://pastebin.com/m2485ae4f), it still doesn’t print what it’s supposed to. It only calls the first function, printing:
File2.c
File2.c

Maybe that only works when the functions haven’t been called before. Am I using it the wrong way, or do I have to do something first?

Thanks,

Miranda

Conrado Miranda wrote:

> To implement something that is common in Lisp. Suppose I have a program
> that is running and can't be stopped, or the cost of stopping it is
> prohibitive. If I find a better way to run an algorithm, I'd like to
> update the running program without stopping it.

The way I do this in Pure is to always call global functions in an
indirect fashion, using an internal global variable which holds the
current function pointer. When a function definition gets updated, the
Pure interpreter just jits the new function, changes the global variable
accordingly, and frees the old code.

Compared to Duncan's suggestion, this has the advantage that you only
have to recompile the function which was changed. AFAICT, if you use
replaceAllUsesWith, then the changes ripple through so that you might
end up re-jiting most of your program.
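
A minimal sketch of this kind of indirection in plain C++ (not Pure's actual implementation) might look like the following. All names here (print_file2, print_file3, current_print, redefine_print, run_scenario) are hypothetical, the functions return their text instead of printing it so the result is easy to check, and a plain pointer assignment stands in for JIT-compiling the new definition.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the two definitions of print() in File2.c and File3.c.
std::string print_file2() { return "File2.c\n"; }
std::string print_file3() { return "File3.c\n"; }

using PrintFn = std::string (*)();

// The global variable that holds the current definition of print().
PrintFn current_print = print_file2;

// do_print() never calls print() directly; it always goes through the pointer.
std::string do_print() { return current_print(); }

// Redefining print() is just a store to the global variable.
// Callers such as do_print() do not need to be recompiled.
void redefine_print(PrintFn replacement) {
    assert(replacement != nullptr);
    current_print = replacement;
}

// The scenario from the thread: call, redefine, call again.
std::string run_scenario() {
    current_print = print_file2;   // start from the File2.c definition
    std::string out = do_print();
    redefine_print(print_file3);
    out += do_print();
    return out;
}
```

Here run_scenario() yields the File2.c line followed by the File3.c line. The cost is one extra load per call, and this sketch is not thread-safe as written.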

Albert

Albert Graef wrote:

> The way I do this in Pure is to always call global functions in an
> indirect fashion, using an internal global variable which holds the
> current function pointer. When a function definition gets updated, the
> Pure interpreter just jits the new function, changes the global variable
> accordingly, and frees the old code.
>
> Compared to Duncan’s suggestion, this has the advantage that you only
> have to recompile the function which was changed. AFAICT, if you use
> replaceAllUsesWith, then the changes ripple through so that you might
> end up re-jiting most of your program.

I thought of that before, but I was trying to do it more elegantly and transparently to the program (which is being written in C/C++). Maybe I’ll go back to that.

Thank you both for the quick replies.

Miranda

PS:
In case it helps: I got the svn version and, while running the program, got this:
The JIT doesn’t know how to handle a RAUW on a value it has emitted.
UNREACHABLE executed at /home/conrado/engines/llvm/lib/ExecutionEngine/JIT/JITEmitter.cpp:1542!

I looked at the function and it’s a dummy function. I’m looking forward to seeing that corrected.

The problem here is reasonably complicated. With the JIT, you have two
different worlds that aren't automatically in sync: the IR in your
program, and the machine code generated for one version of that IR.

runFunction(F) is a wrapper around getPointerToFunction(F), which
returns the address of some machine code implementing the function.
runFunction() does _not_ free this machine code when it returns, so
subsequent runFunction() calls don't need to re-compile it, but they
also get the original definition even if the function has changed. The
JIT will automatically destroy the machine code when F is destroyed,
or you can destroy it manually with freeMachineCodeForFunction().
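
The caching behavior described above can be illustrated with a toy model. This is only a sketch: ToyJit, run, free_code and stale_then_fresh are hypothetical names, not LLVM APIs, and "compiling" here just means taking a snapshot of a string. It is meant to show why a second run can still see the old definition until the cached code is freed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: none of these names are LLVM APIs.
class ToyJit {
public:
    // "IR": maps a function name to the text its current definition would print.
    std::map<std::string, std::string> ir;

    // Loosely analogous to runFunction(): compile on first use, then reuse the cache.
    std::string run(const std::string &name) {
        assert(ir.count(name) == 1 && "unknown function");
        auto it = compiled_.find(name);
        if (it == compiled_.end()) {
            // "Compiling" takes a snapshot of the IR as it is right now.
            it = compiled_.emplace(name, ir.at(name)).first;
        }
        return it->second;
    }

    // Loosely analogous to freeMachineCodeForFunction(): drop the cached code.
    void free_code(const std::string &name) { compiled_.erase(name); }

private:
    std::map<std::string, std::string> compiled_;  // the "machine code" cache
};

std::string stale_then_fresh() {
    ToyJit jit;
    jit.ir["do_print"] = "File2.c\n";
    std::string out = jit.run("do_print");  // compiles and caches
    jit.ir["do_print"] = "File3.c\n";       // the IR changes...
    out += jit.run("do_print");             // ...but the cached code is reused
    jit.free_code("do_print");
    out += jit.run("do_print");             // recompiled from the new IR
    return out;
}
```

In this model the second run still returns the File2.c text, and only the run after free_code() picks up the File3.c definition.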

If you have an existing IR function A which calls function B, and
you've emitted A to machine code, then you have a machine code call to
B in there. Now you want A to call C instead. Without the above
assert, it would be relatively easy to change the IR to call C: call
B->replaceAllUsesWith(C). However, you still have the machine code for
A, which calls B, and there could be a thread concurrently executing
A, making it unsafe to modify A's code. So what should the JIT do when
it sees you replacing B with C?
1. It could do nothing. Then it would be your responsibility to wait
for all threads to finish running A, free its machine code, and then
recompile it with the new call. (You can do the recompile without
freeing the old code by calling
ExecutionEngine::recompileAndRelinkFunction(A), but that'll
permanently leak the old code.) If you destroy B while a thread is
still in A, its machine code gets freed, leaving you with a latent
crash.
2. It could compile C, and either replace B's machine code with a
jump to C, or replace all calls to B with calls to C. Aside from not
having the infrastructure to do this, it's not thread-safe:
http://llvm.org/PR5184.
3. ???

You'd have an extra option if machine code lifetimes weren't tied to
llvm::Function lifetimes, but I haven't spent the time to get that
working.

Since I didn't have a use for RAUW on a compiled function, I resisted
the temptation to guess at the right behavior and put in that assert.
If you think you know what the right behavior is, feel free to file a
bug asking for it.

You can work around this by using freeMachineCodeForFunction yourself
on the whole call tree, then using RAUW to replace the functions, and
then re-compiling them.

Or you can take Albert's advice to make all calls through function
pointers. This will be a bit slower, but should Just Work.

Hi Jeffrey,

> 2. It could compile C, and either replace B's machine code with a
> jump to C, or replace all calls to B with calls to C. Aside from not
> having the infrastructure to do this, it's not thread-safe:
> http://llvm.org/PR5184.

if all calls were via a handle (i.e. load the function pointer out of
some memory location then jump to it), then you could compile C,
atomically replace the pointer-to-B with the pointer-to-C in the memory
location, and later free B using some kind of user-space read-copy-update
type logic. This could be managed transparently by the JIT (i.e. in the
IR you would have direct calls, that are implemented by a jump to a place
that loads the function pointer then calls it). If you don't want to use
handles, then there are also various possibilities for thread-safe code
patching (eg: the linux kernel does this kind of thing in various places),
but this is of course more complicated. That said, if the IR optimizers
have inlined your original function everywhere, then replacing the function
later won't have any effect...
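
As a rough sketch of the handle idea in plain C++, assuming every call site loads the target from a single atomic slot: the names handle, call_through_handle, publish, version_b and version_c are hypothetical. The sketch only covers the atomic pointer swap. It does not implement the read-copy-update style reclamation of the old code, which is left to the caller.

```cpp
#include <atomic>
#include <cassert>
#include <string>

using Fn = std::string (*)();

// Hypothetical stand-ins for the old function B and its replacement C.
std::string version_b() { return "B"; }
std::string version_c() { return "C"; }

// The handle: a memory location holding the current function pointer.
std::atomic<Fn> handle{version_b};

// Every call loads the pointer from the handle, then jumps to it.
std::string call_through_handle() {
    Fn target = handle.load(std::memory_order_acquire);
    return target();
}

// Atomically replaces the pointer and returns the old one.
// A thread may still be executing the old function, so the caller must
// wait for such threads before freeing the old code (not shown here).
Fn publish(Fn replacement) {
    assert(replacement != nullptr);
    return handle.exchange(replacement, std::memory_order_acq_rel);
}
```

A caller would invoke publish(version_c) once the new code is ready; later calls through the handle then reach version_c, while calls already in flight finish in version_b.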

Ciao,

Duncan.

Great! It just worked. I was a bit worried about using pointers to call functions because it’s a little too overwhelming in a big project, I think.

Just for the record, if the function code isn’t freed with freeMachineCodeForFunction, I get a segmentation fault during recompileAndRelinkFunction with this stack dump:
Running pass 'X86 Machine Code Emitter' on function '@do_print'

I know no one should do this, but it’s good to know LLVM doesn’t allow you to leak (or it’s just a good side effect of something else).

Although this method can stop the whole program for quite some time, it doesn’t require a reboot (which can be costly) and doesn’t have the constant cost of pointers (it allows me to choose when I can afford the cost of the change).

Thanks for the explanation. The code works just as wanted now.

Yep. I think the JIT should provide thread-safe code patching as a
service. The bug describes how to do it on x86 chips, but not PPC or
ARM. Because it requires the call sites to be aligned in particular
ways, or, on unsupported processors, to become indirect calls, I think
we'll want to attach an attribute to the call/invoke instruction
marking it as patchable. I'm definitely not going to get time to do
this before 2.7 though.

Well, it's not supposed to segfault. At worst, it should give you an
assertion error when you do something wrong (when it's compiled with
asserts, of course). Could you either file a bug, or send me the exact
code you were using with the command line you used to compile it
against svn head?

Thanks,
Jeffrey

Just updated the source and now I get the unreachable error again.

The JIT doesn’t know how to handle a RAUW on a value it has emitted.
UNREACHABLE executed at /home/conrado/engines/llvm/lib/ExecutionEngine/JIT/JITEmitter.cpp:1542!

I think that it’s not helpful now, but I can post the program, if you want me to.

Hm, I wonder if the error message for llvm_unreachable should change.
I think I remember a couple people focusing incorrectly on the
UNREACHABLE instead of the actual error message above it (which means
it's our fault, not theirs).

Miranda, this is pointing at the same problem you had before. You have
a function JIT-compiled, and you're trying to RAUW
(ReplaceAllUsesWith) it. You'll need to call
freeMachineCodeForFunction(x) before you call
x->replaceAllUsesWith(y).

And yes, putting your code (and a gdb-produced stack trace when
there's a crash) somewhere we can see it (for example, pastebin.com)
will always help us debug problems.

> Hm, I wonder if the error message for llvm_unreachable should change.
> I think I remember a couple people focusing incorrectly on the
> UNREACHABLE instead of the actual error message above it (which means
> it’s our fault, not theirs).

I think so. It really seems like it’s the programmer’s fault.

> Miranda, this is pointing at the same problem you had before. You have
> a function JIT-compiled, and you’re trying to RAUW
> (ReplaceAllUsesWith) it. You’ll need to call
> freeMachineCodeForFunction(x) before you call
> x->replaceAllUsesWith(y).

Actually, I didn’t call freeMachineCodeForFunction(x) (I know it’s a leak; I only wanted to show that it’s possible not to call it).

> And yes, putting your code (and a gdb-produced stack trace when
> there’s a crash) somewhere we can see it (for example, pastebin.com)
> will always help us debug problems.

The headers are a bit of a mess, but it compiles OK.
Main file: http://miranda.pastebin.com/m10213c9c
File with function A (following your convention): http://miranda.pastebin.com/m2dbe2e64
File with function B: http://miranda.pastebin.com/m2dbe2e64
File with function C: http://miranda.pastebin.com/m32f19219
gdb trace: http://miranda.pastebin.com/m5c794b0d

When I call recompileAndRelinkFunction without calling freeMachineCodeForFunction for the same function, it crashes.

Just ask if more info is needed.