Question on instruction itineraries

Hi everyone

I’m fairly new to LLVM and I’ve been searching around but couldn’t find info on this subject.
I started working on a target for a new CPU and I’m realizing my initial simple understanding of instruction itineraries may be completely off.
I’m trying to model a CPU that has a latency of 2 cycles for multiplications and is fully pipelined (so it can start a new one after one cycle).
First of all, is there a document that describes the instruction itinerary model in some detail?

For example, looking at the MBlaze target’s MBlazeSchedule.td, I can see something like:

MBlazeSchedule.td
...
def IIImul : InstrItinClass;
...
InstrItinData<IIImul , [InstrStage<17, [IMULDIV]>]>,

Does that mean muls are expected to have a latency of 17 clocks? The Mips target has something similar.
In the MBlaze case I can see the result being used the very next cycle:

mul r3, r6, r5
addik r3, r3, 4

Similarly, for my target (where I’m specifying 2 instead of 17 above) and for Mips I get the same result. The same goes for loads, where I’m also specifying a larger latency.

What would be the right way to specify, for instance, a latency of 2 with a 1-clock initiation interval?

Thanks
Miguel

For example, looking at the MBlaze target's MBlazeSchedule.td, I can see
something like:
MBlazeSchedule.td
...
def IIImul : InstrItinClass;
...
InstrItinData<IIImul , [InstrStage<17, [IMULDIV]>]>,

Does that mean muls are expected to have a latency of 17 clocks? The Mips
target has something similar.

Yes.

In the MBlaze case I can see the result being used the very next cycle:
mul r3, r6, r5
addik r3, r3, 4
Similarly, for my target (where I'm specifying 2 instead of 17 above) and for
Mips I get the same result. The same goes for loads, where I'm also
specifying a larger latency.

Specifying a schedule doesn't really do anything if there isn't
anything which can be scheduled between the two instructions.

What would be the right way to specify, for instance, a latency of 2 with a
1-clock initiation interval?

If you need to insert NOPs between certain instructions, you should
use a separate pass to do that. See MipsDelaySlotFiller.cpp for an
example of such a pass.

-Eli
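
To make the "separate pass" suggestion concrete, here is a minimal standalone sketch of the idea. It does not use any LLVM API and is not how MipsDelaySlotFiller.cpp is written. The Inst struct, the insertNops function, and the in-order, single-issue, one-instruction-per-cycle machine are all assumptions made for illustration. A real pass would walk the target's machine instructions instead.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy instruction, not an LLVM type.
struct Inst {
    std::string op;
    int dst;               // destination register, -1 if none
    std::vector<int> srcs; // registers read when the instruction issues
    int latency;           // cycles after issue until dst may be read
};

// Walk the instruction list in order and insert NOPs so that no
// instruction issues before all of its source registers are ready.
// Assumes an in-order machine that issues one instruction per cycle
// and never stalls on its own.
std::vector<Inst> insertNops(const std::vector<Inst>& in) {
    std::vector<Inst> out;
    std::map<int, int> readyAt; // register -> first cycle it may be read
    int cycle = 0;
    for (const Inst& i : in) {
        // Earliest cycle at which every source operand is available.
        int earliest = cycle;
        for (int r : i.srcs) {
            auto it = readyAt.find(r);
            if (it != readyAt.end())
                earliest = std::max(earliest, it->second);
        }
        // Pad with NOPs until that cycle is reached.
        for (; cycle < earliest; ++cycle)
            out.push_back({"nop", -1, {}, 1});
        out.push_back(i);
        if (i.dst >= 0)
            readyAt[i.dst] = cycle + i.latency;
        ++cycle;
    }
    return out;
}
```

With the sequence from the question (mul r3, r6, r5 with latency 2, then addik r3, r3, 4), this sketch places one NOP between the two instructions. Two back-to-back independent muls get no NOP, which matches a fully pipelined unit with a 1-clock initiation interval.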

Thanks Eli. Somehow I was assuming the scheduler would insert NOPs to enforce latencies.
The CPU I’m dealing with doesn’t automatically stall, i.e. latency must be ensured by the program.
As an alternative to a pass, is it feasible to modify the scheduler to do so (optionally), or would that be too complicated?
If it is possible, where would be the right place to look?

Thanks so much
Miguel

Try grep'ing for NoopHazard. Apparently there is some infrastructure,
and it's sort of used by the PPC backend, but I'm not sure how well it
works.

-Eli
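
For readers wondering what a NoopHazard-style mechanism might look like, here is a toy model of the idea: the scheduler asks a hazard recognizer whether the next instruction can issue, and emits a no-op when the answer is "not yet". Only the name NoopHazard comes from the thread. ToyInst, ToyHazardRecognizer, countNoops, and all method names are hypothetical and are not the LLVM interface, so check the actual sources before relying on any of this.

```cpp
#include <map>
#include <vector>

// Hypothetical result of a hazard query; only the NoopHazard name
// is taken from the thread.
enum class HazardType { NoHazard, NoopHazard };

// Hypothetical toy instruction, not an LLVM type.
struct ToyInst {
    int dst;               // destination register, -1 if none
    std::vector<int> srcs; // registers read at issue
    int latency;           // cycles after issue until dst may be read
};

// Tracks when each register becomes readable on a machine that
// does not stall by itself.
class ToyHazardRecognizer {
    std::map<int, int> readyAt; // register -> first readable cycle
    int cycle = 0;

public:
    // NoopHazard means: issuing now would read a result too early.
    HazardType getHazardType(const ToyInst& i) const {
        for (int r : i.srcs) {
            auto it = readyAt.find(r);
            if (it != readyAt.end() && it->second > cycle)
                return HazardType::NoopHazard;
        }
        return HazardType::NoHazard;
    }
    void emitInstruction(const ToyInst& i) {
        if (i.dst >= 0)
            readyAt[i.dst] = cycle + i.latency;
        ++cycle;
    }
    void emitNoop() { ++cycle; }
};

// Issue instructions in order and count the no-ops that are needed.
int countNoops(const std::vector<ToyInst>& prog) {
    ToyHazardRecognizer hr;
    int noops = 0;
    for (const ToyInst& i : prog) {
        while (hr.getHazardType(i) == HazardType::NoopHazard) {
            hr.emitNoop();
            ++noops;
        }
        hr.emitInstruction(i);
    }
    return noops;
}
```

In this model the mul/addik pair from the question needs one no-op when the mul latency is 2. A real scheduler would first try to fill the slot with an independent instruction and fall back to a no-op only when nothing else is ready.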