Instruction Scheduling Itineraries

Hi Andy,

Could you describe how this would be done? In the current ARM itineraries
(Cortex-A9, for example), the superscalar issue stage is modelled as taking 1
cycle. If it took 2 cycles instead, then as far as I can tell the hazard
analyser would stall, because both FUs would be acquired.

I would like to model both issue width and pipeline depth. To save me from
explaining a possibly incorrect assumption again, could you briefly say how
you expect that to be modelled, so that I can respond to that? Take, for
example, a simple M-wide, N-deep pipeline.



Hi James,

I'll try to describe a bit of how the itinerary works. It's not intuitive.

The itinerary has two lists: a list of pipeline stages and a list of
operand latencies. The latency of an instruction is captured by the
latency of its "definition" operands, so latency does not need to be
modeled in the pipeline stages at all.

A 2 wide, 1 deep pipeline (2x1) would be:

[InstrStage<1, [Pipe0, Pipe1]>]

A 2 wide, 4 deep pipeline (2x4) would be:

[InstrStage<1, [Pipe0, Pipe1]>]

Surprise. There is no difference in the pipeline description, because
the units are fully pipelined and we don't need to express latency
here. (I'm only showing the pipeline stages here, not the operand
latency list.)
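
To see why depth only shows up in the operand latencies, here is a rough
sketch in Python. It is a toy in-order issue model, not LLVM code:
issue_cycles, deps, width and latency are names made up for the
illustration, and it assumes the result of an N-deep pipe can be read N
cycles after issue.

```python
def issue_cycles(deps, width, latency):
    """Toy in-order issue model; returns the issue cycle of each instruction.

    deps[i]  -- indices of earlier instructions whose results instruction i reads
    width    -- number of fully pipelined pipes, as in InstrStage<1, [Pipe0, Pipe1]>
    latency  -- cycles from issue until the result can be read (operand latency)
    """
    cycles = []
    pipes_used = {}  # cycle -> pipes already claimed in that cycle
    cycle = 0
    for sources in deps:
        # Data hazard: wait until every producer's result is available.
        ready = max((cycles[p] + latency for p in sources), default=0)
        cycle = max(cycle, ready)
        # Structural hazard: all pipes claimed this cycle, so try the next.
        while pipes_used.get(cycle, 0) >= width:
            cycle += 1
        pipes_used[cycle] = pipes_used.get(cycle, 0) + 1
        cycles.append(cycle)
    return cycles


independent = [[], [], [], []]   # four unrelated instructions
chain = [[], [0], [1], [2]]      # each instruction reads the previous result

print(issue_cycles(independent, width=2, latency=1))  # [0, 0, 1, 1]
print(issue_cycles(independent, width=2, latency=4))  # [0, 0, 1, 1]
print(issue_cycles(chain, width=2, latency=1))        # [0, 1, 2, 3]
print(issue_cycles(chain, width=2, latency=4))        # [0, 4, 8, 12]
```

Independent instructions issue identically on the 2x1 and 2x4 machines,
since the pipes are the only resource being reserved. Only the dependent
chain changes, and that change comes entirely from the latency argument.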

Let's say you want to treat each stage of a pipeline as a separate
type of unit:

stage0: Decode
stage1: Exec
stage2: Write

[InstrStage<1, [Decode0, Decode1], 0>,
 InstrStage<1, [Exec0, Exec1], 0>,
 InstrStage<1, [Write0, Write1], 0>]

Now when the first instruction is scheduled, it fills in the current
row of the reservation table with Decode0, Exec0, Write0. This is
counterintuitive because the instruction does not execute on all units
in the same cycle, but it results in a more compact reservation table
and still sufficiently models hazards.
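
The "same row" behaviour comes from the explicit 0 passed as the third
argument of each stage. A small Python sketch of that bookkeeping follows.
stage_offsets is a hypothetical helper, not an LLVM function, and it
assumes a negative third value means "advance by the stage's own cycle
count".

```python
def stage_offsets(stages):
    """Return the cycle offset, relative to issue, at which each stage starts.

    stages -- list of (cycles, units, timeinc); timeinc < 0 means "use cycles"
    """
    offsets = []
    start = 0
    for cycles, _units, timeinc in stages:
        offsets.append(start)
        # The next stage begins timeinc cycles after this one begins.
        start += cycles if timeinc < 0 else timeinc
    return offsets


explicit_zero = [
    (1, ["Decode0", "Decode1"], 0),
    (1, ["Exec0", "Exec1"], 0),
    (1, ["Write0", "Write1"], 0),
]
# The same stages with the third argument left at its default.
defaulted = [(cycles, units, -1) for cycles, units, _ in explicit_zero]

print(stage_offsets(explicit_zero))  # [0, 0, 0]
print(stage_offsets(defaulted))      # [0, 1, 2]
```

With the explicit 0, all three stages reserve their unit in the row for
the issue cycle. With the default, each stage would be staggered one row
further down the table.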

Things only get more complicated if you have functional units that are
not fully pipelined, or you have instructions that use the same functional
units at different pipeline stages.

If I have an instruction that consumes a functional unit for 2 cycles,
during which no other instruction may be issued to that unit, then I
need to do this:

[InstrStage<2, [NonPipelinedUnit]>]

If I have an instruction that splits into two dependent microops that
use the same type of functional unit, but at different times, then I need
to do this:

[InstrStage<1, [ALU0, ALU1], 1>,
 InstrStage<1, [ALU0, ALU1]>]
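
As a rough check of these last two cases, here is a toy reservation table
in Python. It is a simplified sketch, not the LLVM hazard recognizer:
try_issue and issue are hypothetical helpers, a unit is picked
independently for each cycle of a stage, and a negative third value again
stands for the default.

```python
def try_issue(table, stages, cycle):
    """Return the updated table if the stages fit starting at `cycle`, else None.

    table  -- dict mapping an absolute cycle to the set of busy unit names
    stages -- list of (cycles, units, timeinc); timeinc < 0 means "use cycles"
    """
    trial = {c: set(busy) for c, busy in table.items()}  # leave `table` untouched
    start = cycle
    for cycles, units, timeinc in stages:
        for c in range(start, start + cycles):
            free = [u for u in units if u not in trial.get(c, set())]
            if not free:
                return None  # hazard: every candidate unit is busy in cycle c
            trial.setdefault(c, set()).add(free[0])
        start += cycles if timeinc < 0 else timeinc
    return trial


def issue(table, stages, cycle=0):
    """Issue at the first hazard-free cycle >= `cycle`; return (cycle, new table)."""
    while True:
        updated = try_issue(table, stages, cycle)
        if updated is not None:
            return cycle, updated
        cycle += 1


non_pipelined = [(2, ["NonPipelinedUnit"], -1)]
two_uops = [(1, ["ALU0", "ALU1"], 1), (1, ["ALU0", "ALU1"], -1)]

table = {}
first, table = issue(table, non_pipelined)
second, table = issue(table, non_pipelined)
print(first, second)  # 0 2

table = {}
a, table = issue(table, two_uops)
b, table = issue(table, two_uops)
c, table = issue(table, two_uops)
print(a, b, c)  # 0 0 2
```

The second non-pipelined instruction waits until cycle 2 because the unit
is held for cycles 0 and 1. Two of the microop instructions can issue
together, each taking one ALU in cycle 0 and again in cycle 1, so the
third has to wait for cycle 2.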



// Instruction stage - These values represent a non-pipelined step in
// the execution of an instruction. Cycles represents the number of
// discrete time slots needed to complete the stage. Units represent
// the choice of functional units that can be used to complete the
// stage, e.g. IntUnit1, IntUnit2. NextCycles (the TimeInc field below)
// indicates how many cycles should elapse from the start of this stage
// to the start of the next stage in the itinerary.
//
// A stage is specified in one of two ways:
//
//   InstrStage<1, [FU_x, FU_y]>    - TimeInc defaults to Cycles
//   InstrStage<1, [FU_x, FU_y], 0> - TimeInc explicit

class InstrStage<int cycles, list<FuncUnit> units,
                 int timeinc = -1,
                 ReservationKind kind = Required> {
  int Cycles = cycles; // length of stage in machine cycles
  list<FuncUnit> Units = units; // choice of functional units
  int TimeInc = timeinc; // cycles till start of next stage
  int Kind = kind.Value; // kind of FU reservation
}