RFC: PowerPC tail call optimization patch

Hello Dale,

this is an updated version of the tail call optimization patch for
powerpc. could you have a look at it?

i added code to support ppc64 (untested, will try to get access to
ppc64 on a friend's machine).
incorporated evan's formatting suggestions. :wink:

will run another round of testing (llvm-test) on my powerpc g4/800
when i get the okay to commit. testing on this machine takes forever
:(.

okay to commit?

regards arnold

r49787-ppc-1.patch (63.4 KB)

More nitpicks:

+ if (!isVarArg && !isPPC64) {
+ // Non-varargs Altivec parameters go after all the non-Altivec
+ // parameters; handle those later so we know how much padding we need.
+ nAltivecParamsAtEnd++;
+ continue;
+ } else {
+ // Varargs and 64-bit Altivec parameters are padded to 16 byte boundary.
+ NumBytes = ((NumBytes+15)/16)*16;
+ }

=>

+ if (!isVarArg && !isPPC64) {
+ // Non-varargs Altivec parameters go after all the non-Altivec
+ // parameters; handle those later so we know how much padding we need.
+ nAltivecParamsAtEnd++;
+ continue;
+ }
+ // Varargs and 64-bit Altivec parameters are padded to 16 byte boundary.
+ NumBytes = ((NumBytes+15)/16)*16;

No need for else here. :slight_smile:

+ int SPDiff = 0;

More nitpicks:
...
No need for else here. :slight_smile:

Done

SPDiff = (int)CallerMinReservedArea - (int)ParamSize;

Just change last statement to
int SPDiff = (int)...

Done

+bool
+PPCTargetLowering::IsEligibleForTailCallOptimization(SDOperand Call,
+ SDOperand Ret,
+ SelectionDAG&
DAG) const {
+ // Variable argument functions
+ // are not supported.

Why two separate lines?

there once was some text ... done.

So many blank lines... :slight_smile:

done :slight_smile:

you wouldn't have by any chance an evan awk script that checks for my
stupid fomatting mistakes :slight_smile:

+ // Check whether CALL node immediatly preceeds the RET node and
whether the
+ // return uses the result of the node or is a void return.
+ unsigned NumOps = Ret.getNumOperands();
+ if ((NumOps == 1 &&
+ (Ret.getOperand(0) == SDOperand(Call.Val,1) ||
+ Ret.getOperand(0) == SDOperand(Call.Val,0))) ||
+ (NumOps > 1 &&
+ Ret.getOperand(0) == SDOperand(Call.Val,Call.Val-
  >getNumValues()-1) &&
+ Ret.getOperand(1) == SDOperand(Call.Val,0))) {
...

This function seems to be very similar to
X86TargetLowering::IsEligibleForTailCallOptimization. Is it possible
to make it into a common utility function (or at least the target
independent part)?

The target independent part can be refactored. The whole function not
because e.g ppc does not support yet fully support PIC.

Also
+static bool IsPossiblyOverwrittenArgumentOfTailCall(SDOperand Op,
+ MachineFrameInfo
* MFI) {

I wonder if these can be moved to SelectionDAGISel? So it's handled
during the sdisel phase of call lowering?

I am not sure what you mean by 'these' here. The function above
(IsPossiblyOverwritten...) is used in
LowerCALL()
  forall arguments
    CalculateTailCallArgDest()
to determine whether an argument needs to be evacuated to a virtual
register to prevent being overwritten.

I am not sure what you are suggesting ? Storing this information
(ispossiblyoverwritten) in the argument nodes?
as ARG_FLAGSSDNode as a new ArgFlagsTy?

You are using 'these'. I believe: the other functions e.g
CalculateTailCallArgDest should not be moved to SelectionDAGISel
because they involve target knowledge (where to place the argument)
and would involve moving the whole argument lowering to
SelectionDAGISel.

      if (isPPC64 && Arg.getValueType() == MVT::i32) {
        // FIXME: Should this use ANY_EXTEND if neither sext nor zext?
@@ -1946,7 +2285,13 @@ SDOperand
PPCTargetLowering::LowerCALL(SDOperand Op, SelectionDAG &DAG,
        if (GPR_idx != NumGPRs) {
          RegsToPass.push_back(std::make_pair(GPR[GPR_idx++], Arg));
        } else {
- MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff, NULL,
0));
+ if (!isTailCall)
+ MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff,
NULL, 0));
+ // Calculate and remember argument location.
+ else
+ CalculateTailCallArgDest(DAG, MF, isPPC64, Arg, SPDiff,
ArgOffset,
+ TailCallArguments);
+
          inMem = true;
        }
        if (inMem || isMachoABI) {
@@ -1994,7 +2339,13 @@ SDOperand
PPCTargetLowering::LowerCALL(SDOperand Op, SelectionDAG &DAG,
            }
          }
        } else {
- MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff, NULL,
0));
+ if (!isTailCall)
+ MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff,
NULL, 0));
+ // Calculate and remember argument location.
+ else
+ CalculateTailCallArgDest(DAG, MF, isPPC64, Arg, SPDiff,
ArgOffset,
+ TailCallArguments);
+
...
- // We are emitting Altivec params in order.
- PtrOff = DAG.getNode(ISD::ADD, PtrVT, StackPtr,
- DAG.getConstant(ArgOffset, PtrVT));
- SDOperand Store = DAG.getStore(Chain, Arg, PtrOff, NULL, 0);
- MemOpChains.push_back(Store);
+ if (!isTailCall) {
+ // We are emitting Altivec params in order.
+ PtrOff = DAG.getNode(ISD::ADD, PtrVT, StackPtr,
+ DAG.getConstant(ArgOffset, PtrVT));
+ SDOperand Store = DAG.getStore(Chain, Arg, PtrOff, NULL, 0);
+ MemOpChains.push_back(Store);
+ } else {
+ CalculateTailCallArgDest(DAG, MF, isPPC64, Arg, SPDiff,
ArgOffset,
+ TailCallArguments);
+ }

There seem to be quite a bit of duplicated code. Can you refactor?

Will try to refactor into one function performing above functionality
that is called instead.

+/// GetPossiblePreceedingTailCall - Get preceeding X86ISD::TAILCALL
node if it
+/// exists skip possible ISD:TokenFactor.

Comment bug.

:slight_smile: Done

More nitpicks:
...
No need for else here. :slight_smile:

Done

SPDiff = (int)CallerMinReservedArea - (int)ParamSize;

Just change last statement to
int SPDiff = (int)...

Done

+bool
+PPCTargetLowering::IsEligibleForTailCallOptimization(SDOperand Call,
+ SDOperand Ret,
+ SelectionDAG&
DAG) const {
+ // Variable argument functions
+ // are not supported.

Why two separate lines?

there once was some text ... done.

So many blank lines... :slight_smile:

done :slight_smile:

you wouldn't have by any chance an evan awk script that checks for my
stupid fomatting mistakes :slight_smile:

:slight_smile:

+ // Check whether CALL node immediatly preceeds the RET node and
whether the
+ // return uses the result of the node or is a void return.
+ unsigned NumOps = Ret.getNumOperands();
+ if ((NumOps == 1 &&
+ (Ret.getOperand(0) == SDOperand(Call.Val,1) ||
+ Ret.getOperand(0) == SDOperand(Call.Val,0))) ||
+ (NumOps > 1 &&
+ Ret.getOperand(0) == SDOperand(Call.Val,Call.Val-

getNumValues()-1) &&

+ Ret.getOperand(1) == SDOperand(Call.Val,0))) {
...

This function seems to be very similar to
X86TargetLowering::IsEligibleForTailCallOptimization. Is it possible
to make it into a common utility function (or at least the target
independent part)?

The target independent part can be refactored. The whole function not
because e.g ppc does not support yet fully support PIC.

That's fine. Please break it into two parts and move the target independent part out.

Also
+static bool IsPossiblyOverwrittenArgumentOfTailCall(SDOperand Op,
+ MachineFrameInfo
* MFI) {

I wonder if these can be moved to SelectionDAGISel? So it's handled
during the sdisel phase of call lowering?

I am not sure what you mean by 'these' here. The function above

IsEligibleForTailCallOptimization, IsPossiblyOverwrittenArgumentOfTailCall, etc. Basically anything that's shared between all the targets.

(IsPossiblyOverwritten...) is used in
LowerCALL()
forall arguments
   CalculateTailCallArgDest()
to determine whether an argument needs to be evacuated to a virtual
register to prevent being overwritten.

I am not sure what you are suggesting ? Storing this information
(ispossiblyoverwritten) in the argument nodes?
as ARG_FLAGSSDNode as a new ArgFlagsTy?

You are using 'these'. I believe: the other functions e.g
CalculateTailCallArgDest should not be moved to SelectionDAGISel
because they involve target knowledge (where to place the argument)
and would involve moving the whole argument lowering to
SelectionDAGISel.

I am not sure. Just thinking aloud. It would be nice if all of the common code can be handled by SelectionDAGISel. For example, it seems possible for SDIsel to determine which operands can be overwritten and issued the copies there?

     if (isPPC64 && Arg.getValueType() == MVT::i32) {
       // FIXME: Should this use ANY_EXTEND if neither sext nor zext?
@@ -1946,7 +2285,13 @@ SDOperand
PPCTargetLowering::LowerCALL(SDOperand Op, SelectionDAG &DAG,
       if (GPR_idx != NumGPRs) {
         RegsToPass.push_back(std::make_pair(GPR[GPR_idx++], Arg));
       } else {
- MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff, NULL,
0));
+ if (!isTailCall)
+ MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff,
NULL, 0));
+ // Calculate and remember argument location.
+ else
+ CalculateTailCallArgDest(DAG, MF, isPPC64, Arg, SPDiff,
ArgOffset,
+ TailCallArguments);
+
         inMem = true;
       }
       if (inMem || isMachoABI) {
@@ -1994,7 +2339,13 @@ SDOperand
PPCTargetLowering::LowerCALL(SDOperand Op, SelectionDAG &DAG,
           }
         }
       } else {
- MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff, NULL,
0));
+ if (!isTailCall)
+ MemOpChains.push_back(DAG.getStore(Chain, Arg, PtrOff,
NULL, 0));
+ // Calculate and remember argument location.
+ else
+ CalculateTailCallArgDest(DAG, MF, isPPC64, Arg, SPDiff,
ArgOffset,
+ TailCallArguments);
+
...
- // We are emitting Altivec params in order.
- PtrOff = DAG.getNode(ISD::ADD, PtrVT, StackPtr,
- DAG.getConstant(ArgOffset, PtrVT));
- SDOperand Store = DAG.getStore(Chain, Arg, PtrOff, NULL, 0);
- MemOpChains.push_back(Store);
+ if (!isTailCall) {
+ // We are emitting Altivec params in order.
+ PtrOff = DAG.getNode(ISD::ADD, PtrVT, StackPtr,
+ DAG.getConstant(ArgOffset, PtrVT));
+ SDOperand Store = DAG.getStore(Chain, Arg, PtrOff, NULL, 0);
+ MemOpChains.push_back(Store);
+ } else {
+ CalculateTailCallArgDest(DAG, MF, isPPC64, Arg, SPDiff,
ArgOffset,
+ TailCallArguments);
+ }

There seem to be quite a bit of duplicated code. Can you refactor?

Will try to refactor into one function performing above functionality
that is called instead.

Thanks.

Evan

Another round :slight_smile:

I'll post the patch to llvm-commits cause i guess people might get
annoyed by my series of patches :).

.

+PPCTargetLowering::IsEligibleForTailCallOptimization(SDOperand Call,
...

That's fine. Please break it into two parts and move the target
independent part out.

Done - see CheckTailCallReturnConstraints() in TargetLowering. The
rest of the function is still target dependent. PowerPC currently can
not tail call optimize byval functions, no pic/got code for non-local
calls. On x86 no support for pic/got code on x86-64 for non-local
calls.

I am not sure. Just thinking aloud. It would be nice if all of the
common code can be handled by SelectionDAGISel. For example, it seems
possible for SDIsel to determine which operands can be overwritten and
issued the copies there?

Yes the common code between x86/ppc could be moved to
SelectionDAGISel. See CheckDAGForTailCallsAndFixThem().

There seem to be quite a bit of duplicated code. Can you refactor?

Created LowerMemOpCallTo.

-arnold

The patch is in
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20080421/061548.html
.

Okay to commit :slight_smile:

Thanks! Code looks much cleaner.

The patch is in
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20080421/061548.html
.

Okay to commit :slight_smile:

Some comments. :slight_smile:

+static bool IsPossiblyOverwrittenArgumentOfTailCall(SDOperand Op,
+ MachineFrameInfo * MFI) {
...

+ (FrameIdxNode = dyn_cast<FrameIndexSDNode>(Op.getOperand(1))) &&
+ (MFI->isFixedObjectIndex((FrameIdx = FrameIdxNode- >getIndex()))) &&
+ (MFI->getObjectOffset(FrameIdx) >= 0))

I suggest to further refactor this. Perhaps add a utility static function in SelectionDAG called isFixedFrameObject. Then you can *not* embed the FrameIdx assignment in the test. :slight_smile:

Obligatory format nitpick below. :slight_smile:

  static void CheckDAGForTailCallsAndFixThem(SelectionDAG &DAG,
...
+ if (isMarkedTailCall)
+ if (Ret==NULL ||
+ !TLI.IsEligibleForTailCallOptimization(OpCall, OpRet, DAG)) {

How about?
     if (!isMarkedTailCall)
       continue;
     if (Ret == NULL ||
         !TLI.IsEligibleForTailCallOptimization(OpCall, OpRet, DAG)) {

This gets rid of one level of nesting.

Please commit after fixing these (and testing). Thanks.

Evan