I am trying to use CloneFunction to clone every IR function in a program, give the cloned versions a prefix and call the clones from the original functions (redirect the calls).
Surprisingly, I see that after the LTO optimizations, the number of machine basic blocks in the two functions differs in some cases.
Is this reasonable at all, given that the two functions must be exactly the same at the IR level?
Or could it happen because of the interprocedural optimizations?
Have you verified that the old and new functions are identical at the LLVM IR level right before code generation in libLTO? You can easily modify libLTO to dump out the LLVM IR right before code generation.

It might happen because of inter-procedural optimizations. If the old functions are no longer called, then inter-procedural constant propagation and other inter-procedural optimizations may change the new functions but not the old functions.

I don’t know if the code generators do any inter-procedural analysis; my impression was that they did not. However, someone else needs to answer that question.

Regards,
John Criswell
Thank you for your reply.
The LLVM IRs are definitely the same since the cloning pass is the final IR pass in LTO. The calls to the clones always stay in the program as well.
The problem is that this happens only in large functions (with a couple of hundred basic blocks) and it happens very rarely (on average in one function out of every thousand). So I wasn’t sure whether I would be able to figure out the problem by dumping the MachineFunction. I can, however, figure out which codegen pass causes this.
I also believe that the code generators don’t do any inter-procedural analysis, unless some metadata from the IR level instructs them to do so.
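To narrow down which codegen pass introduces the difference, one option is to record the machine basic block count of both functions after every pass and look for the first pass where the counts disagree. The sketch below only shows that last step. It assumes the per-pass counts have already been collected somehow (for example from per-pass MachineFunction dumps) into an ordered list; the function name `first_divergent_pass` and the pass labels in the usage note are hypothetical.

```python
def first_divergent_pass(snapshots, original, clone):
    """Return the first pass after which the two block counts differ.

    snapshots is an ordered list of (pass_name, counts) pairs, where counts
    maps a function name to its machine basic block count after that pass.
    Passes that did not record both functions are skipped.
    """
    for pass_name, counts in snapshots:
        if original in counts and clone in counts:
            if counts[original] != counts[clone]:
                return pass_name
    return None  # The counts never diverged in the recorded passes.
```

For instance, with `snapshots = [("pass-a", {"foo": 10, "clone_foo": 10}), ("pass-b", {"foo": 9, "clone_foo": 10})]`, the call `first_divergent_pass(snapshots, "foo", "clone_foo")` returns `"pass-b"`, which is then the pass worth inspecting for that function pair.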