[VLIW Scheduler] Itineraries vs. per operand scheduling

Yatsina_Marina · February 4, 2018, 5:15pm

Hi,

What is the best way to model a scheduler for a VLIW in-order architecture?

I’ve looked at the Hexagon and R600 architectures and they are using itineraries. I wanted to understand the benefit in using itineraries over the per operand scheduling.

I also found this thread from almost 2 years ago:

http://lists.llvm.org/pipermail/llvm-dev/2016-April/098763.html

azru0512 · February 5, 2018, 12:55pm

Anyway, I also interest in knowing their difference.

Regards,

chenwj

atrick · February 8, 2018, 5:32am

Hi Marina,

Itineraries have stuck around mainly because no one was interested in porting old itineraries to a new machine description. At least initially, there also wasn't anyone interested in working on an in-order scheduling implementation based on the new machine model, even though the machine description itself was designed for in-order scheduling. There are a handful of targets now using the new model with MicroOpBufferSize = 0/1 (in-order mode). It's hard for me to say how widely supported it is, because so many backends using the MachineScheduler are out-of-tree.

The per-operand model should be much more compile time efficient, but that's often not a concern. It can be easier now to incrementally bootstrap new machine descriptions based on similar machines with the same architecture. The subtarget details no longer need to be tied to the TableGen instruction descriptions. The new model lets you define the structure of the subtarget scheduling description, so it can be self-documenting. The new model lets you use a plethora of tablegen tricks to handle all the variations on vector operations. I believe it's much more straightfoward to specify special constraints, like bypassing. It's fairly easy to extend certain opcodes with predicates (callbacks) that depend on immediate values or any other arbtrary info.

See Javed and Florian's tutorial on the "new" machine model:

Itineraries are not strictly more powerful. The original motivation for the new machine description is that there were important issue constraints that weren't being correctly modeled by the itineraries. The only advantage of itineraries that I'm aware of is that, for instructions that use multiple machine resources, you can control during which cycle those resources are used relative to each other (it specifies a full resource matrix for each op). The per-operand model makes the simplifying assumption that a given resource is simply used for N cycles. Normally, all instructions begin using a given resource on the same cycle relative to instruction issue, so that's not an innaccurate assumption. If you actually need to correct for that, and don't use itineraries then you need to make use of the the MachineScheduler's hooks to write a little custom scheduling code.

Whatever you do, there's no sense in using both kinds of machine descriptions. Support is only there for the purpose of bootstrapping using ARM A9, since that was the latest in-tree ARM cpu at the time. It's just useless complexity now.

-Andy

azru0512 · February 8, 2018, 2:49pm

Hi Krzysztof,

Krzysztof_Parzyszek · February 8, 2018, 7:40pm

We have a two different dimensions for each instruction: slot assignments, and operand timings. These two are unrelated to each other, and also each (or both) can change for any given instruction from one architecture version to the next.

The main concern for us was which of these mechanisms contains all the information that we need. We cannot express all the scheduling details by hand, and majority of it was auto-generated anyway. I don't know if the new model has all the required pieces of information, but we've been using itineraries for a while, and we stuck with them.

The short answer is "because it works", but it's not meant to imply that nothing else would.

-Krzysztof

atrick · February 9, 2018, 12:32am

We have a two different dimensions for each instruction: slot assignments, and operand timings. These two are unrelated to each other, and also each (or both) can change for any given instruction from one architecture version to the next.

The main concern for us was which of these mechanisms contains all the information that we need. We cannot express all the scheduling details by hand, and majority of it was auto-generated anyway. I don't know if the new model has all the required pieces of information, but we've been using itineraries for a while, and we stuck with them.

The short answer is "because it works", but it's not meant to imply that nothing else would.

-Krzysztof

That reminds me, in the new model, there are two different ways to define issues constraints:

ARMScheduleSwift has two ALU pipes. Some instructions can issue on either one, so we define a superset “P01” resource like this:
  def SwiftUnitP01 : ProcResource<2>; // ALU unit.
  def SwiftUnitP0 : ProcResource<1> { let Super = SwiftUnitP01; } // Mul unit.
  def SwiftUnitP1 : ProcResource<1> { let Super = SwiftUnitP01; } // Br unit.

Haswell allows an instruction to issue an any arbitrary set of ports, so it uses groups:

def HWPort1 : ProcResource<1>;
def HWPort2 : ProcResource<1>;
def HWPort3 : ProcResource<1>;
def HWPort4 : ProcResource<1>;
def HWPort5 : ProcResource<1>;
def HWPort6 : ProcResource<1>;
def HWPort7 : ProcResource<1>;

// Many micro-ops are capable of issuing on multiple ports.
def HWPort01 : ProcResGroup<[HWPort0, HWPort1]>;
def HWPort23 : ProcResGroup<[HWPort2, HWPort3]>;
def HWPort237 : ProcResGroup<[HWPort2, HWPort3, HWPort7]>;
…

IIRC: they’re modeled the same way and ProcRegGroup is just computing superset relations for you.
And an instruction on pipe/port “01” can issue on either dynamically, so a subsequent pipe/port “0” instruction could issue in the same cycle.

For something like slot assignments, I would be tempted to write an small independent state machine as part of the hazard checker, then use the machine model (either itinerary or per-operand) just for operand timings.

-Andy

Yatsina_Marina · February 11, 2018, 9:13am

Thank you so much for the detailed answer!

Our target only uses one resource at a time, so based on what you said, itineraries do not have any advantage over the new scheduling model.

The per operand model seems to be the way to go.

Thanks,

Marina

Topic		Replies	Views
How to get started with instruction scheduling? Advice needed. LLVM Dev List Archives	6	487	April 29, 2016
InstrItin and SchedWriteRes LLVM Dev List Archives	3	364	April 6, 2018
Is there a way to convert between SchedMachineModel and Itineraries? LLVM Dev List Archives	10	211	November 11, 2015
New machine model questions LLVM Dev List Archives	6	170	January 30, 2014
Question about per-operand machine model LLVM Dev List Archives	12	251	March 11, 2014

[VLIW Scheduler] Itineraries vs. per operand scheduling

Related topics