I was looking over the experimental MIPS backend and noticed that its delay slot pass just inserts nops into the delay slots. I assume it should be possible to do a bit better than this. Is there an existing pass that "fills" delay slots, or would I have to write one if I wanted slightly more optimal code? Does anyone have any references?
Thanks in advance.
I don't think there's an existing optimizing delay slot filler. The
LLVM SPARC implementation also has a delay slot filler, but it's
almost identical to the MIPS one.
It's not that difficult to move an instruction; you'll get the idea
pretty quickly looking at MachineBasicBlock.h. Of course, finding an
instruction which can be moved is the tricky part; as a first step,
you could try moving the instruction immediately before a direct
unconditional branch into the delay slot.
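
To make that first step concrete, here is a rough sketch on a toy
instruction list. It deliberately does not use the LLVM API: Inst,
Block and fillDelaySlots are made-up names for illustration, and a
real pass would work on MachineBasicBlock and MachineInstr instead.
The assumption is that a direct unconditional branch reads no
registers and is always taken, so the instruction just before it can
sit in its delay slot without changing behavior, provided that
instruction is not already in an earlier branch's delay slot.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not an LLVM type).
struct Inst {
    std::string text;     // assembly text, for display only
    bool isUncondBranch;  // direct unconditional branch, e.g. "j L1"
    bool hasDelaySlot;    // any branch/jump/call followed by a delay slot
};

// Toy stand-in for a basic block before delay slots have been filled.
using Block = std::vector<Inst>;

// Emit each instruction in order. For an instruction with a delay slot:
//  - if it is a direct unconditional branch and the previously emitted
//    instruction is an ordinary one (not itself in a delay slot), move
//    that instruction into the slot;
//  - otherwise fall back to a nop, as the existing pass does.
Block fillDelaySlots(const Block &in) {
    Block out;
    bool lastFillsSlot = false;  // true if out.back() occupies a delay slot
    for (const Inst &inst : in) {
        if (!inst.hasDelaySlot) {
            out.push_back(inst);
            lastFillsSlot = false;
            continue;
        }
        bool canMove = inst.isUncondBranch && !out.empty() && !lastFillsSlot;
        if (canMove) {
            Inst filler = out.back();  // instruction right before the branch
            out.pop_back();
            out.push_back(inst);
            out.push_back(filler);     // now executes in the delay slot
        } else {
            out.push_back(inst);
            out.push_back(Inst{"nop", false, false});
        }
        lastFillsSlot = true;
    }
    return out;
}
```

With this sketch, the sequence "addu $2, $4, $5" followed by "j L1"
becomes "j L1" followed by "addu $2, $4, $5", while a conditional
branch still gets a nop. Anything beyond this case (conditional
branches, calls, indirect jumps) needs a real dependence check
between the candidate instruction and the branch.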