[RFC] Splitting init.trampoline into init.trampoline and adjust.trampoline

Hi!

Attached set of patches splits llvm.init.trampoline into an "init"
phase and an "adjust" phase, as discussed on the "Go on dragonegg"
thread.

Thanks!

0001-Split-intrinsics-and-DAG-nodes.patch (8.6 KB)

0002-Architecture-specific-code.patch (11.2 KB)

0003-Fix-the-instcombine-pass.patch (3.75 KB)

0004-Modify-test-cases.patch (10.6 KB)

Hi Sanjoy,

Attached set of patches splits llvm.init.trampoline into an "init"
phase and an "adjust" phase, as discussed on the "Go on dragonegg"
thread.

thanks for doing this. The patches look good, though the decomposition into
individual patches is not that great (since things won't always work, in fact
not even compile I think, with not all patches applied). Some minor comments
below.

--- a/include/llvm/CodeGen/ISDOpcodes.h
+++ b/include/llvm/CodeGen/ISDOpcodes.h
@@ -566,14 +566,18 @@ namespace ISD {
     // HANDLENODE node - Used as a handle for various purposes.
     HANDLENODE,

- // TRAMPOLINE - This corresponds to the init_trampoline intrinsic.
- // It takes as input a token chain, the pointer to the trampoline,
- // the pointer to the nested function, the pointer to pass for the
- // 'nest' parameter, a SRCVALUE for the trampoline and another for
- // the nested function (allowing targets to access the original
- // Function*). It produces the result of the intrinsic and a token
- // chain as output.
- TRAMPOLINE,
+ // INIT_TRAMPOLINE - This corresponds to the init_trampoline intrinsic. It
+ // takes as input a token chain, the pointer to the trampoline, the pointer
+ // to the nested function, the pointer to pass for the 'nest' parameter, a
+ // SRCVALUE for the trampoline and another for the nested function (allowing
+ // targets to access the original Function*). It produces a token chain as
+ // output.
+ INIT_TRAMPOLINE,
+
+ // ADJUST_TRAMPOLINE - This corresponds to the adjust_trampoline intrinsic.
+ // It takes a pointer to the trampoline and produces a new pointer to the
+ // intrinsic with platform-specific adjustments applied.

"a new pointer to the intrinsic" -> "a new pointer"
After all, it doesn't produce a pointer to an intrinsic :slight_smile:

+ ADJUST_TRAMPOLINE,

--- a/include/llvm/Intrinsics.td
+++ b/include/llvm/Intrinsics.td
@@ -344,10 +344,12 @@ def int_annotation : Intrinsic<[llvm_anyint_ty],

//===------------------------ Trampoline Intrinsics -----------------------===//
//
-def int_init_trampoline : Intrinsic<[llvm_ptr_ty],
+def int_init_trampoline : Intrinsic<[],
                                     [llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
- [IntrReadWriteArgMem]>,
- GCCBuiltin<"__builtin_init_trampoline">;
+ [IntrReadWriteArgMem]>;
+
+def int_adjust_trampoline : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty],
+ [IntrReadMem]>;

Since these both now map exactly to the corresponding GCC builtins, there should
be a GCCBuiltin entry for each of them. Note that the previous version of
init.trampoline mapped to the llvm-gcc builtin, which had been modified from the
mainline GCC intrinsic. The new intrinsic won't map to llvm-gcc's any more, but
it will map to dragonegg's (i.e. GCC's); since llvm-gcc is deprecated I think
that it is OK.

--- a/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -4947,12 +4947,16 @@ SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I, unsigned Intrinsic) {
     Ops[4] = DAG.getSrcValue(I.getArgOperand(0));
     Ops[5] = DAG.getSrcValue(F);

- Res = DAG.getNode(ISD::TRAMPOLINE, dl,
- DAG.getVTList(TLI.getPointerTy(), MVT::Other),
- Ops, 6);
+ Res = DAG.getNode(ISD::INIT_TRAMPOLINE, dl, MVT::Other, Ops, 6);

- setValue(&I, Res);
- DAG.setRoot(Res.getValue(1));
+ DAG.setRoot(Res);
+ return 0;
+ }
+ case Intrinsic::adjust_trampoline: {
+ setValue(&I, DAG.getNode(ISD::ADJUST_TRAMPOLINE, dl,
+ TLI.getPointerTy(),
+ getValue(I.getArgOperand(0)),
+ getValue(I.getArgOperand(1))));

Why is this taking two arguments? I don't think "getValue(I.getArgOperand(1))"
should be there.

     return 0;
   }
   case Intrinsic::gcroot:

--- a/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -211,6 +211,8 @@ PPCTargetLowering::PPCTargetLowering(PPCTargetMachine &TM)
   setOperationAction(ISD::TRAP, MVT::Other, Legal);

   // TRAMPOLINE is custom lowered.
+ setOperationAction(ISD::INIT_TRAMPOLINE, MVT::Other, Custom);
+ setOperationAction(ISD::ADJUST_TRAMPOLINE, MVT::Other, Custom);

   // VASTART needs to be custom lowered to use the VarArgsFrameIndex
   setOperationAction(ISD::VASTART , MVT::Other, Custom);
@@ -1369,8 +1371,13 @@ SDValue PPCTargetLowering::LowerVAARG(SDValue Op, SelectionDAG &DAG,
   return DAG.getLoad(VT, dl, InChain, Result, MachinePointerInfo(), false, false, 0);
}

-SDValue PPCTargetLowering::LowerTRAMPOLINE(SDValue Op,
- SelectionDAG &DAG) const {
+SDValue PPCTargetLowering::LowerADJUST_TRAMPOLINE(SDValue Op,
+ SelectionDAG &DAG) const {
+ return Op.getOperand(0);
+}
+
+SDValue PPCTargetLowering::LowerINIT_TRAMPOLINE(SDValue Op,
+ SelectionDAG &DAG) const {
   SDValue Chain = Op.getOperand(0);
   SDValue Trmp = Op.getOperand(1); // trampoline
   SDValue FPtr = Op.getOperand(2); // nested function
@@ -1399,16 +1406,13 @@ SDValue PPCTargetLowering::LowerTRAMPOLINE(SDValue Op,

   // Lower to a call to __trampoline_setup(Trmp, TrampSize, FPtr, ctx_reg)
   std::pair<SDValue, SDValue> CallResult =
- LowerCallTo(Chain, Op.getValueType().getTypeForEVT(*DAG.getContext()),
+ LowerCallTo(Chain, Type::getVoidTy(*DAG.getContext()),

I don't see why you are changing the return type here. It is the return type of
"__trampoline_setup" (whatever that is) and that didn't change.

                 false, false, false, false, 0, CallingConv::C, false,
                 /*isReturnValueUsed=*/true,
                 DAG.getExternalSymbol("__trampoline_setup", PtrVT),
                 Args, DAG, dl);

- SDValue Ops[] =
- { CallResult.first, CallResult.second };
-
- return DAG.getMergeValues(Ops, 2, dl);
+ return CallResult.second;
}

--- a/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -821,6 +821,42 @@ Instruction *InstCombiner::tryOptimizeCall(CallInst *CI, const TargetData *TD) {
   return Simplifier.NewInstruction;
}

+// Given a call to llvm.adjust.trampoline, find the corresponding call to
+// llvm.init.trampoline, conservatively.
+static IntrinsicInst *FindInitTrampoline(Value *Callee) {
+ BitCastInst *BCInst = dyn_cast<BitCastInst>(Callee);
+ if (BCInst == NULL)
+ return NULL;

It may be better to just do:
   Callee = Callee->stripPointerCasts();
Then you would use Callee rather than BCOp.

+
+ Value *BCOp = BCInst->getOperand(0);
+ IntrinsicInst *AdjustTramp = cast<IntrinsicInst>(BCOp);

This will crash if it isn't an intrinsic. Use dyn_cast not cast.

+ if (AdjustTramp == NULL)
+ return NULL;

You forgot to check that this intrinsic is the adjust.trampoline intrinsic.
Also, the LLVM style is to use 0 rather than NULL in this kind of situation.

+ Value *TrampMem = AdjustTramp->getOperand(0);

Maybe you should append a ->stripPointerCasts() to the line defining TrampMem,
since it may have had to be bitcast to match the type of the adjust pointer
argument. While the init.trampoline takes the same type as adjust.trampoline,
it might have it's own bitcast. That said, if it did probably GVN would have
unified them.

+ // Now look at the uses of TrampMem and see if we can find the InitTramp
+ // instruction. The conservative thing to check would be if the
+ // TrampMem has exactly two uses, the adjust.trampoline and the
+ // init.trampoline calls.

What if the adjust comes before the init, and the result of the adjust is called
before the init occurs (i.e. the callsite is before the init)?

+
+ if (TrampMem->hasNUses(2)) {

If you stripPointerCasts when defining TrampMem then this is no longer correct.
Also, I think you should do
   if (!TrampMem->hasNUses(2))
     return 0;

+ Value::use_iterator I = TrampMem->use_begin();
+ if (*I == AdjustTramp)
+ I++;
+
+ // Now I is at the (only) use that is not adjust.tramp. Check if this is a
+ // call to init.intrinsic
+
+ if (IntrinsicInst *In = dyn_cast<IntrinsicInst>(*I)) {
+ if (In->getIntrinsicID() == Intrinsic::init_trampoline)
+ return In;
+ }

There was no point in having the curly brackets { }.

+ }
+
+ return NULL;

-> return 0;

+}
+
// visitCallSite - Improvements for call and invoke instructions.
//
Instruction *InstCombiner::visitCallSite(CallSite CS) {
@@ -880,10 +916,8 @@ Instruction *InstCombiner::visitCallSite(CallSite CS) {
     return EraseInstFromFunction(*CS.getInstruction());
   }

- if (BitCastInst *BC = dyn_cast<BitCastInst>(Callee))
- if (IntrinsicInst *In = dyn_cast<IntrinsicInst>(BC->getOperand(0)))
- if (In->getIntrinsicID() == Intrinsic::init_trampoline)
- return transformCallThroughTrampoline(CS);
+ if (FindInitTrampoline(Callee))
+ return transformCallThroughTrampoline(CS);

Done like this you have to call FindInitTrampoline twice, once here and once in
transformCallThroughTrampoline. How about passing the FindInitTrampoline result
as an extra argument to transformCallThroughTrampoline.

Ciao, Duncan.

Hi!

Thanks for the review. Have attached new set of patches.

What if the adjust comes before the init, and the result of the adjust
is called
before the init occurs (i.e. the callsite is before the init)?

Could not understand this. :slight_smile: Perhaps you could give me some example
IR where this happens?

I've not added any new documentation yet. I'm not sure about what the
documented semantics of llvm.adjust.trampoline should be since LLVM
just returns the first argument there. How restrictive do I make it?

Thanks!

0001-New-trampoline-intrinsics-for-LLVM.patch (16.4 KB)

0002-Fix-the-InstCombine-pass.patch (6.04 KB)

0003-AutoUpgrade-for-llvm.init.trampoline.patch (3.21 KB)

0004-Test-cases.patch (7.76 KB)