Semantics of COPY instructions (Specifically, in LICM)

COPY instructions, as a generic opcode, have plenty of special behaviors.

In LICM, there exists a function that determines if an instruction has a high latency for a particular def. The for loop over all uses of that def starts with:

if (UseMI.isCopyLike())
  continue;

I’m curious if such an assumption is based on a standard semantic definition that COPY must be low-latency.

Take, for example, the Texas Instruments C6000 DSP. It has two sides: A and B. If a COPY moves a B-side register to an A-side register, it takes an extra cycle over a COPY from A to A or B to B. I’d imagine there are plenty of other targets where a COPY, by intuition, would not be ‘free’.
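To make the C6000 example concrete, here is a minimal standalone sketch of the latency rule described above. It is a toy model, not LLVM or TI code: `Side`, `copyLatency`, and the default same-side cost of one cycle are all assumptions made purely for illustration.

```cpp
#include <cstdint>

// Hypothetical model of a two-sided register file (illustrative names only).
enum class Side : std::uint8_t { A, B };

// Latency of a register-to-register copy in this toy model.
// Assumption: a same-side copy costs SameSideCycles (default 1, chosen for
// illustration), and a cross-side copy costs exactly one cycle more.
constexpr unsigned copyLatency(Side Src, Side Dst,
                               unsigned SameSideCycles = 1) {
  return Src == Dst ? SameSideCycles : SameSideCycles + 1;
}

// Compile-time checks of the rule stated above.
static_assert(copyLatency(Side::A, Side::A) == copyLatency(Side::B, Side::B),
              "same-side copies cost the same on both sides");
static_assert(copyLatency(Side::B, Side::A) ==
                  copyLatency(Side::A, Side::A) + 1,
              "cross-side copy takes one extra cycle");
```

The point is only that latency depends on the source and destination sides, which is information a blanket `isCopyLike()` check never looks at.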

I could, potentially, transform cross-register file COPY instructions into another instruction if necessary. Does that sound reasonable?

Semantics don’t say anything about performance characteristics. For the scheduler info you can set different latencies for cross register class copies. There are also existing TRI hooks to check for cross register class copies.

I apologize for the strange choice of words. The semantics are quite simple: Register source moved to register destination. That having been said, it looks like LLVM wants to treat copies as being effectively free always. This is indicated by its presence in isTransient, and the uses of isCopy/isCopyLike, specifically in places like:

/// Return true if the instruction is marked "cheap" or the operand latency
/// between its def and a use is one or less.
bool MachineLICMBase::IsCheapInstruction(MachineInstr &MI) const {
  if (TII->isAsCheapAsAMove(MI) || MI.isCopyLike())
    return true;
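For comparison, the following standalone sketch contrasts the early return quoted above with a hypothetical bank-aware variant. Nothing here is LLVM API: `ToyInstr`, `isCrossBankCopy`, `isCheapAsQuoted`, and `isCheapBankAware` are invented for illustration, and only the quoted early-return condition is modelled, not the rest of the function.

```cpp
// Hypothetical stand-in for a machine instruction (not llvm::MachineInstr).
struct ToyInstr {
  bool IsCopyLike;      // models MI.isCopyLike()
  bool AsCheapAsAMove;  // models TII->isAsCheapAsAMove(MI)
  unsigned SrcBank;     // register bank of the source (assumed known)
  unsigned DstBank;     // register bank of the destination (assumed known)
};

// True when a copy moves a value between two different register banks.
constexpr bool isCrossBankCopy(const ToyInstr &MI) {
  return MI.IsCopyLike && MI.SrcBank != MI.DstBank;
}

// Models only the early return in the quoted excerpt: every copy-like
// instruction is treated as cheap, regardless of the banks involved.
constexpr bool isCheapAsQuoted(const ToyInstr &MI) {
  return MI.AsCheapAsAMove || MI.IsCopyLike;
}

// Hypothetical variant: a cross-bank copy is never treated as cheap;
// everything else follows the quoted condition.
constexpr bool isCheapBankAware(const ToyInstr &MI) {
  return !isCrossBankCopy(MI) && (MI.AsCheapAsAMove || MI.IsCopyLike);
}
```

In this model a B-to-A copy is cheap under the quoted condition but not under the bank-aware one, which is the distinction being asked about.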

Could you possibly expand on “set[ting] different latencies for cross register class copies”? I see that there are interesting constructs for InstRW in other targets’ TableGen files. Our target currently uses what I believe is the legacy schedule info variant (InstrItinClass, InstrItinData, etc), and I don’t know if it’s possible to do the same.

The hook I think you’re referencing in TRI is getCrossCopyRegClass, which is only used in post-ISel scheduling, and doesn’t seem to have any bearing on the scheduling, just correctness.

There is a ‘getCopyCost()’, but it only handles intra-class copies.