Hi lx, Philip,
I've seen an instcombine which helps with this situation. It fires when
the function types on both sides of the bitcast have the same number of
operands and compatible types. It then adds bitcasts on the arguments and
removes the one on the called function.
It does indeed: InstCombiner::transformConstExprCastCall.
I don't have IR to hand, but it would be worth passing your IR through
instcombine to see if that helps you.
The following should be a workable example of what we would hope to
transform:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i32 @g(i32* %a) {
entry:
  %call = tail call i32 bitcast (i32 (i64)* @f to i32 (i32*)*)(i32* %a)
  ret i32 %call
}

declare i32 @f(i64)

define i32 @h(i64 %a) {
entry:
  %call = tail call i32 bitcast (i32 (i32*)* @g to i32 (i64)*)(i64 %a)
  ret i32 %call
}
The idea of improving the inliner is also great, but you may find that
it's needed for cases other than this one if I'm right about the
instcombine.
Sadly, the combine fails because InstCombine queries
CastInst::isBitCastable to decide whether the argument type can be cast
to the parameter type. An i32* and an i64 are not bitcastable, though;
they are only ptrtoint/inttoptr castable.
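To make the difference between the two predicates concrete, here is a minimal toy model in C++. It is only a sketch, not the LLVM implementation: ToyType, toyIsBitCastable, toyIsBitOrNoopPointerCastable and kPointerBits are hypothetical names invented for illustration, pointers are assumed to be 64 bits wide (as on x86_64), and address spaces and vector types are ignored.

```cpp
#include <cassert>

// Toy stand-ins for LLVM types; these are NOT the real llvm::Type API.
enum class Kind { Integer, Pointer };

struct ToyType {
    Kind kind;
    unsigned bits;  // integer width; ignored for pointers
};

// Assumption: 64-bit pointers, as on the x86_64 target in the example IR.
constexpr unsigned kPointerBits = 64;

// A bitcast never crosses the pointer/integer boundary.
bool toyIsBitCastable(ToyType src, ToyType dst) {
    if (src.kind != dst.kind)
        return false;
    if (src.kind == Kind::Pointer)
        return true;  // pointer to pointer (same address space assumed)
    return src.bits == dst.bits;  // integer to integer of equal width
}

// Additionally accept pointer<->integer pairs when the integer is exactly
// pointer-sized, so the ptrtoint/inttoptr neither truncates nor extends.
bool toyIsBitOrNoopPointerCastable(ToyType src, ToyType dst) {
    if (toyIsBitCastable(src, dst))
        return true;
    if (src.kind == dst.kind)
        return false;  // same kind but not bitcastable, e.g. i32 vs i64
    unsigned intBits = (src.kind == Kind::Integer) ? src.bits : dst.bits;
    return intBits == kPointerBits;
}
```

With these definitions the i32*/i64 pair from @g is rejected by the first predicate and accepted by the second, while an i32*/i32 pair is rejected by both.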
The following patch opens up the optimization:
--- a/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -1456,7 +1456,7 @@ bool InstCombiner::transformConstExprCastCall(CallSite CS) {
Type *ParamTy = FT->getParamType(i);
Type *ActTy = (*AI)->getType();
- if (!CastInst::isBitCastable(ActTy, ParamTy))
+ if (!CastInst::isBitOrNoopPointerCastable(ActTy, ParamTy, DL))
return false; // Cannot transform this parameter value.
if (AttrBuilder(CallerPAL.getParamAttributes(i + 1), i + 1).
@@ -1551,7 +1551,7 @@ bool InstCombiner::transformConstExprCastCall(CallSite CS) {
if ((*AI)->getType() == ParamTy) {
Args.push_back(*AI);
} else {
- Args.push_back(Builder->CreateBitCast(*AI, ParamTy));
+ Args.push_back(Builder->CreateBitOrPointerCast(*AI, ParamTy));
}
// Add any parameter attributes.
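For readers unfamiliar with CreateBitOrPointerCast, the second hunk relies on the builder picking the cast opcode from the source and destination types. The following is a rough sketch of that selection in plain C++, not the IRBuilder code itself: Kind and toyCastOpcode are hypothetical names, and vector types are ignored.

```cpp
#include <cassert>
#include <string>

// Toy classification of a type; NOT the real llvm::Type API.
enum class Kind { Integer, Pointer };

// Returns the name of the cast a bit-or-pointer cast would emit for a
// source/destination pair that has already passed the castability check.
std::string toyCastOpcode(Kind src, Kind dst) {
    if (src == Kind::Pointer && dst == Kind::Integer)
        return "ptrtoint";
    if (src == Kind::Integer && dst == Kind::Pointer)
        return "inttoptr";
    return "bitcast";  // same kind on both sides
}
```

For the call in @g the argument is a pointer and the parameter an integer, so this model selects ptrtoint.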
Running opt -instcombine -inline -instcombine with this patch results in:
define i32 @g(i32* %a) {
entry:
  %0 = ptrtoint i32* %a to i64
  %call = tail call i32 @f(i64 %0)
  ret i32 %call
}

declare i32 @f(i64)

define i32 @h(i64 %a) {
entry:
  %call.i = tail call i32 @f(i64 %a)
  ret i32 %call.i
}