Function call replacement

Hi all,

I am trying to replace calls to a function, say foo() with calls to my_foo(). In the IR file, there is no call to foo(). However, foo() will be called during execution. I can’t do it the conventional way (i.e. locate the call instruction, create a call site, create a call to my_foo(), then replace all uses with the return value of my_foo()), because there is no call instruction to locate. So I tried swapping the names of foo() and my_foo(). However, this only results in the original function being called under its new name. I must have missed a step, because the reference to the original function still exists after the renaming. The question is: what is the missing step? Thank you so much for reading my description. I hope my question is clear.

Best,
Jason

Hi all,

I am trying to replace calls to a function, say foo() with calls to my_foo(). In the IR file, there is no call to foo(). However, foo() will be called during execution.

Can’t say I’m following here. There is no call to foo(), but it is called during execution? Is it called indirectly (e.g. void (*x)() = foo; x();)? Is the call in another module, i.e. not in this IR file?

One example can be a constructor of a class, say class Foo, in C++. When you have a vector of Foo instances, the constructor will be called by the library function of vector. There are many such functions for which you can’t know whether they are actually executed until runtime.

Besides that, I have done some research today. This has to be done by modifying the class’s vtable. I have manually modified the IR file and it works. But I just can’t find an API function of LLVM to modify the vtable of the class.

One example can be a constructor of a class, say class Foo, in C++. When you have a vector of Foo instances, the constructor will be called by the library function of vector.

Even in this example, you’d find the call to Foo’s constructor in the module somewhere. The “library function of vector” would be instantiated in this same translation unit and emitted into the LLVM IR module.

If the call is really coming from outside the module, there’s not much you can do to affect it.

There are many such functions for which you can’t know whether they are actually executed until runtime.

Besides that, I have done some research today. This has to be done by modifying the class’s vtable.

That only applies if the function is virtual and you have access to the vtable (which may itself be in another module/translation unit).

I have manually modified the IR file and it works. But I just can’t find an API function of LLVM to modify the vtable of the class.

LLVM IR doesn’t know much about vtables - it’s “just” another array of function pointers as far as LLVM is concerned. But you could modify that array if you like.
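To illustrate the "array of function pointers" view outside of LLVM, here is a minimal plain C++ sketch. It is not LLVM API code: the names vtable, call_slot and redirect are hypothetical and exist only for this example, and foo/my_foo are stand-ins for the functions discussed in this thread. It shows that overwriting a slot in such an array redirects calls even though no direct call to foo() appears anywhere.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the functions discussed in the thread.
int foo() { return 1; }
int my_foo() { return 2; }

using Fn = int (*)();

// A hand-rolled "vtable": nothing more than an array of function pointers.
std::array<Fn, 1> vtable = {&foo};

// Indirect call through a slot; there is no direct call to foo() here.
int call_slot(std::size_t slot) {
  assert(slot < vtable.size());
  return vtable[slot]();
}

// Redirect every slot that currently points at `from` so it points at `to`.
// Returns the number of slots that were changed.
std::size_t redirect(Fn from, Fn to) {
  std::size_t changed = 0;
  for (Fn &entry : vtable) {
    if (entry == from) {
      entry = to;
      ++changed;
    }
  }
  return changed;
}
```

After redirect(&foo, &my_foo), call_slot(0) reaches my_foo() instead of foo(). Doing the same thing in LLVM IR means rewriting the constant initializer of the vtable global rather than an array in memory.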

Thanks for the prompt reply! There is this function: _ZN7threads11WorkerGroup3RunEv, demangled as threads::WorkerGroup::Run(), in one of the benchmarks from PARSEC. I assume this is similar to the run method of a thread class in Java, that is designed to be invoked by some library function during runtime.

I do notice that for the above-mentioned function, use_empty() returns false. However, creating a call site and then trying to locate the call instruction is futile. That being said, I am not sure if we can use the user_iterator returned by user_begin() to do something.

Can you give me a hint about how to locate and change that array of function pointers (namely, the vtable)?

Thanks a lot,
Jason

OK. I managed to figure it out. For people who might have the same need, here is the code:

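// foo and new_foo are Function*. Rewrites constant casts of foo
// (e.g. the bitcast entries stored in a vtable initializer).
for (User *u : foo->users()) {
  // Skip anything that is not a constant cast, such as a direct call.
  auto *ce = dyn_cast<ConstantExpr>(u);
  if (!ce || !ce->isCast())
    continue;
  // Point everything that held the old cast at a cast of new_foo instead.
  ce->replaceAllUsesWith(ConstantExpr::getBitCast(new_foo, ce->getType()));
}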

Looking back, it’s that simple. But it took me quite a while to figure out the right way to do it. I tried several other approaches, but all of them failed eventually. One of them was iterating over all vtable global variables, but I couldn’t figure out how to iterate over the function pointers and then get the cast-from type of the elements. Besides, that seems less efficient than the approach above. Still, it was an interesting exploration.