Hello,
I'm working on integrating the MachinePipeliner.cpp pass into our VLIW
backend, and so far we've managed to get it working with some nice
speedups.
Unlike Hexagon, however, our backend doesn't generate hardware loop
instructions, so all our loops are built from induction variables,
comparisons and branches. When it came to implementing
reduceLoopCount for our TargetInstrInfo, we found that analyzeLoop
didn't give us enough information to reduce the loops.
Currently the signatures look like this:

    bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,
                     MachineInstr *&CmpInst)

    unsigned TargetInstrInfo::reduceLoopCount(MachineBasicBlock &MBB,
                     MachineInstr *IndVar,
                     MachineInstr &Cmp,
                     SmallVectorImpl<MachineOperand> &Cond,
                     SmallVectorImpl<MachineInstr *> &PrevInsts,
                     unsigned Iter,
                     unsigned MaxIter) const

Since the condition operands for branching in our architecture are
found on the branch instruction and not the comparison instruction, we
weren't able to populate Cond in reduceLoopCount.
Furthermore, some of our loops branch conditionally to exit the loop
whilst others branch conditionally to continue it, so we sometimes
needed to invert these condition codes. (MachinePipeliner.cpp inserts
branches on the assumption that Cond holds the operands for *exiting*
the loop.)
In the end we had to change the signatures to pass around a bit more
information:

    bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,
                     MachineInstr *&CmpInst, MachineInstr *&BranchInst,
                     bool *BranchExits)

    unsigned reduceLoopCount(MachineBasicBlock &MBB,
                     MachineInstr *IndVar,
                     MachineInstr &Cmp,
                     MachineInstr &Exit,
                     bool BranchExits,
                     SmallVectorImpl<MachineOperand> &Cond,
                     SmallVectorImpl<MachineInstr *> &PrevInsts,
                     unsigned Iter,
                     unsigned MaxIter)

BranchInst (passed to reduceLoopCount as Exit) gives us the operands we
need to pass back in Cond. BranchExits is set to true when taking the
branch exits the loop; when it is false, i.e. the branch continues the
loop, we invert the condition before returning it in Cond.
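
To make the inversion concrete, here is a rough, standalone sketch of the
idea. It is not LLVM code: CondCode, ToyOperand, ToyBranch, invert and
populateExitCond are all hypothetical names made up for illustration, and a
real target would work on MachineInstr and MachineOperand instead. It only
shows how the branch's operands plus a BranchExits flag are enough to produce
a Cond that always describes *exiting* the loop.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are NOT LLVM types.
enum class CondCode { EQ, NE, LT, GE };

struct ToyOperand {
  bool IsCondCode; // true: Value is a CondCode, false: Value is a register
  unsigned Value;
};

struct ToyBranch {
  CondCode TakenWhen; // condition under which the branch is taken
  unsigned FlagReg;   // register written by the comparison
};

// Hypothetical helper: the logical negation of a condition code.
CondCode invert(CondCode CC) {
  switch (CC) {
  case CondCode::EQ: return CondCode::NE;
  case CondCode::NE: return CondCode::EQ;
  case CondCode::LT: return CondCode::GE;
  case CondCode::GE: return CondCode::LT;
  }
  return CC; // not reached for the enumerators above
}

// Fill Cond with the operands for *exiting* the loop. If the branch is the
// one that continues the loop (BranchExits == false), its condition code is
// inverted first; the register operand is copied unchanged.
void populateExitCond(const ToyBranch &Br, bool BranchExits,
                      std::vector<ToyOperand> &Cond) {
  assert(Cond.empty() && "expected an empty Cond vector");
  CondCode ExitCC = BranchExits ? Br.TakenWhen : invert(Br.TakenWhen);
  Cond.push_back({true, static_cast<unsigned>(ExitCC)});
  Cond.push_back({false, Br.FlagReg});
}
```

With this toy model, a loop that ends in "branch back to the header if LT"
would be described with BranchExits = false and yields an exit condition of
GE on the same register, while a loop that ends in "branch out if EQ" keeps
EQ as is.
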
Would these changes be desirable upstream? As far as I'm aware, Hexagon
doesn't use the IndVar instruction and just passes the hardware loop
instruction along through CmpInst, so adapting Hexagon to the new API
was trivial.

Luke Lau