Destination register needs to be valid after callee saved register restore when tail calling

Hello, Arnold.

Is there a way to indicate that the register the tail call
instruction uses as destination needs to be valid after the callee
saved registers have been restored? (some X86InstrInfo.td foo magic
maybe ?)

That's the wrong way to do it. In this case you either violate
the ABI for the callee, or you're restricted to doing tail call lowering only
for internal functions, which makes the whole thing impractical. Only
call-clobbered registers can be used to store the pointer to the callee (I'd
suggest %ecx on x86/32, btw).

Or do i have to insert code into PEI::saveCalleeSavedRegisters to
detect that there is a tail called function that uses a callee saved
register and move it to another (EAX).

You shouldn't use call-saved registers at all. Only call-clobbered. It
seems that you can use a trick similar to eh_return lowering (that
case is somewhat special, because %eax and %edx should be preserved there
too). You can see it in TOT for the x86 target (also, the prologue/epilogue
emission code changed on TOT, so you might want to check whether your code
works with it).

PS: Feel free to contact me in case of any related questions. In fact, I
planned to start tail call lowering within the next 2-3 weeks :slight_smile:

Hi Anton and Dale
first thanks for your answers.

Hello, Arnold.

Is there a way to indicate that the register the tail call
instruction uses as destination needs to be valid after the callee
saved registers have been restored? (some X86InstrInfo.td foo magic
maybe ?)

That's the wrong way to do it. In this case you either violate
the ABI for the callee, or you're restricted to doing tail call lowering only
for internal functions, which makes the whole thing impractical. Only
call-clobbered registers can be used to store the pointer to the callee (I'd
suggest %ecx on x86/32, btw).

Yes, I cannot use the callee-saved registers for calling; I realized that. I probably did not express myself clearly (sorry for that).
It is not me who loads the target address into the callee-saved register. The register allocator decides to load the function pointer into esi because it assumes that is safe. Normally it is: for a regular function call, the code that restores the registers is inserted after the call, so esi is still valid.
With that sentence I was trying to ask whether there is a way to persuade the code generator to use another register to load (or move) the function pointer into, right before the callee-saved register restore. But thinking a little further, that's nonsense.
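
To make the ordering problem concrete, here is a minimal sketch in plain C++ (not LLVM code). It is a toy model of two registers and assumes the simplest possible function: the prologue saves %esi, the body keeps the function pointer in %esi, and the epilogue restores %esi before the final jump. The names `Regs` and `tailJumpTarget` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of two x86-32 registers; this is an illustration, not LLVM code.
struct Regs {
    std::uint32_t esi;  // callee-saved
    std::uint32_t ecx;  // call-clobbered
};

// Simulates a function that ends in an indirect tail call and returns the
// address the final `jmp *reg` would branch to.
std::uint32_t tailJumpTarget(std::uint32_t callerEsi, std::uint32_t fnPtr,
                             bool copyToEcx) {
    Regs r{callerEsi, 0};
    std::uint32_t savedEsi = r.esi;    // prologue: save %esi
    r.esi = fnPtr;                     // body: function pointer lives in %esi
    if (copyToEcx) r.ecx = r.esi;      // workaround: copy to %ecx before the restore
    r.esi = savedEsi;                  // epilogue: restore %esi (runs before the jump)
    assert(r.esi == callerEsi);        // the callee-saved contract is honoured
    return copyToEcx ? r.ecx : r.esi;  // jmp *%ecx or jmp *%esi
}
```

With `copyToEcx` false, the jump goes to whatever the caller had in %esi rather than to the function pointer; with it true, the target survives the restore because %ecx is not touched by the epilogue.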

something like
let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1, noResults = 1,
    ifDestRegisterisCalleeSavedEmitAMoveToECXAndJumpToThat = 1 in
  def TAILJMPr : I<0xFF, MRM4r, (ops GR32:$dst),
                   "jmp {*}$dst  # TAIL CALL jmpr", []>;

Inserting a pseudo before your tail call that defines all the
callee-saved registers should work. See FP_REG_KILL.

Dale's trick seems to work, with the downside that all registers are saved (and in llvm-2.0 the epilogue inserter ignores the isTerminator instruction; note to myself: should move to trunk soon!).

Another option would be to do the move from the register holding the function pointer to a register like ECX myself (as you say).

Or do i have to insert code into PEI::saveCalleeSavedRegisters to
detect that there is a tail called function that uses a callee saved
register and move it to another (EAX).

You shouldn't use call-saved registers at all. Only call-clobbered. It
seems that you can use a trick similar to eh_return lowering (that
case is somewhat special, because %eax and %edx should be preserved there
too). You can see it in TOT for the x86 target (also, the prologue/epilogue
emission code changed on TOT, so you might want to check whether your code
works with it).

Sorry i was under the misconception that eax is call-clobbered too since it contains the function result.

TOT means trunk of today (funny because in german, my native language it means death)?

So what i will be trying then is to emit a copytoreg from the virtual register holding the
function pointer to ecx before the tailcall node.

So where i approximately had this before (assuming that RetNode.getOp(1) is not a TargetGlobalAddress or the like)

SDOperand OpsTailCall[] = {AdjStackChain, RetNode.getOperand(1), RetNode.getOperand(2)};
RetNode = DCI.DAG.getNode(X86ISD::TAILCALL, TCVTs, OpsTailCall, 3);

would then be replaced by

Chain = DAG.getCopyToReg(AdjStackChain, X86::ECX, RetNode.getOperand(1));
SDOperand OpsTailCall[] = {Chain, DAG.getRegister(X86::ECX, getPointerTy()), RetNode.getOperand(2)};
RetNode = DCI.DAG.getNode(X86ISD::TAILCALL, TCVTs, OpsTailCall, 3);

The downside here is that ECX is no longer free for passing function arguments (I am using the x86_fastcall semantics at the moment, with the first
two arguments stored in ecx and edx).

does that sound sane?

yes i will try against the trunk soon when i am in a masochistic deathly mood ;). maybe tonight.
and sorry if i am bothering you with questions whose answer should be obvious. i am really a total newbie greenhorn :slight_smile:

regards arnold

If you would like to give the RA more latitude in choosing the register around other allocations you could create a register class corresponding to registers which are call-clobbered, and therefore safe to use to hold the function address for tail calls. Then instead of a normal CopyToReg you would use a pseudo-move instruction similar to MOV32to32_ to ensure that the function address is in a safe register.

Right, and that's a big downside.

Many targets are going to have this problem, maybe we should be looking for something generic. I don't think hardcoding a particular register is the right way to do this.
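
As a rough illustration of choosing from a set of safe registers instead of hardcoding one, here is a small self-contained C++ sketch (C++17, not LLVM code). It assumes the usual x86-32 conventions, in which EAX, ECX and EDX are call-clobbered, and it takes the argument registers of the calling convention as input. The function name `pickTailCallReg` and the string register names are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Picks the first call-clobbered register that is not already carrying an
// argument, or nothing if every candidate is taken.
std::optional<std::string> pickTailCallReg(const std::vector<std::string>& argRegs) {
    // Assumption: call-clobbered general-purpose registers on x86-32.
    static const std::vector<std::string> callClobbered = {"ECX", "EDX", "EAX"};
    for (const auto& reg : callClobbered) {
        if (std::find(argRegs.begin(), argRegs.end(), reg) == argRegs.end())
            return reg;
    }
    return std::nullopt;
}
```

For the fastcall-style setup described above, where ECX and EDX carry the first two arguments, calling `pickTailCallReg({"ECX", "EDX"})` yields EAX; with no register arguments it yields ECX. A real implementation would express the candidate set as a register class and leave the choice to the register allocator, as suggested above.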


Hmm. Seems like something else is wrong. The register allocator is free to use a callee-saved register to store the function ptr provided it's not live. It's likely your lowering code isn't setting things up properly.

Anyway, I am pretty sure a target-specific DAG combiner isn't the right way to implement tail calls. We wanted something more general. Since there seems to be interest in this, perhaps this is a good time to formulate a plan. I'll talk to Chris about this.

Thx

Evan

refrain i will from the dark side of following the path of dagcombine,
follow the wise advice of llvm master anton and custom lower the tail call i shall,
follow the force of custom lowering the 'tail call return' i will until a greater plan is decided by the high council

thanks for your help,
regards young llvm padawan arnold :wink: