Purpose of glue nodes in call lowering

Unfortunately our backend is not yet upstream, even as an experimental target, so I will use AArch64's LowerCall as a representative example.

  // Build a sequence of copy-to-reg nodes chained together with token chain
  // and flag operands which copy the outgoing args into the appropriate regs.
  SDValue InFlag;
  for (auto &RegToPass : RegsToPass) {
    Chain = DAG.getCopyToReg(Chain, DL, RegToPass.first,
                             RegToPass.second, InFlag);
    InFlag = Chain.getValue(1);
  }

I have two questions about this. Primarily, what is the purpose of threading InFlag through the copyToReg nodes? It seems to be a common choice across multiple backends.

My understanding is that glue nodes are used to “stick” other nodes together, increasing the probability that they end up near each other at scheduling. I think that would make them optional in the sense that codegen would be correct in their absence. If so, I would like to remove them from our backend as the scheduler will probably do a reasonable job without hints.

The second query is whether each copy needs to be dependent on the previous argument. I expected each copy to take the same chain node, with the chain nodes returned by getCopyToReg merged using a TokenFactor. The goal would be to allow the register allocator / scheduler to rearrange the instructions used to pass arguments.

If each copy can indeed be passed the same chain node to indicate that they are independent of one another then I should be able to get better codegen out of the register allocator.

Thanks!

Jon

Hi Jon,

> My understanding is that glue nodes are used to "stick" other nodes together, increasing the probability that they end up near each other at scheduling. I think that would make them optional in the sense that codegen would be correct in their absence.

I think the main purpose of the Glue is to represent the data
dependency between the register copies and the call instruction
itself. Most other nodes will end up using virtual registers and so
it'd be harmless if they were scheduled in the middle, but that's not
really something I'd want to rely on.

> The second query is whether each copy needs to be dependent on the previous argument. I expected each copy to take the same chain node, with the chain nodes returned by getCopyToReg merged using a TokenFactor.

That probably is valid, though I wouldn't be surprised if the Glue
overrode any attempt to exploit that.

> The goal would be to allow the register allocator / scheduler to rearrange the instructions used to pass arguments.

I'd expect them to be pretty free anyway.

Cheers.

Tim.