How to get started with instruction scheduling? Advice needed.

I need to add instruction scheduling for a new target which is a fairly simple in-order execution machine.

I’ve been watching this presentation from a 2014 LLVM dev meeting as it seems relevant:

“SchedMachineModel: Adding and Optimizing a Subtarget” http://llvm.org/devmtg/2014-10/Slides/Estes-MISchedulerTutorial.pdf

In this presentation the author says that there have been several ways to approach scheduling in LLVM over the years:

  • Pre 2008: SelectionDAGISel pass creates the ScheduleDAG from the SelectionDAG at the end of instruction selection

  • ScheduleDAG works on SelectionDAG Nodes (SDNodes)

  • Circa 2008: Post Register Allocation pass added for instruction scheduling (SchedulePostRATDList works on MachineInstrs)

  • Circa 2012: MIScheduler (ScheduleDAGMI) added as separate pass for pre-RA scheduling

  • Circa 2014: MIScheduler adapted to optionally replace PostRA Scheduler

In the presentation he defines a SchedMachineModel in the scheduling .td file, and apparently this approach needs no instruction itineraries.

So I’m wondering: what’s the current recommended way to approach this, and does it depend on the type of target (in-order, superscalar, out-of-order, VLIW…)?

Someone earlier started to define instruction itineraries for our target. Should I continue down this road or move over to the SchedMachineModel approach? Are there other recommended presentations/documents that I should be looking at?

Thanks.

Phil

So if I use the SchedMachineModel method, can I just skip itineraries?

Phil

I notice from looking at ARMScheduleA9.td that there seems to be a hybrid approach where they still have itineraries but also use SchedMachineModel:

// ===---------------------------------------------------------------------===//
// The following definitions describe the simpler per-operand machine model.
// This works with MachineScheduler and will eventually replace itineraries.

class A9WriteLMOpsListType<list<WriteSequence> writes> {
  list <WriteSequence> Writes = writes;
  SchedMachineModel SchedModel = ?;
}

// Cortex-A9 machine model for scheduling and other instruction cost heuristics.
def CortexA9Model : SchedMachineModel {
  let IssueWidth = 2;         // 2 micro-ops are dispatched per cycle.
  let MicroOpBufferSize = 56; // Based on available renamed registers.
  let LoadLatency = 2;        // Optimistic load latency assuming bypass.
                              // This is overridden by OperandCycles if the
                              // Itineraries are queried instead.
  let MispredictPenalty = 8;  // Based on estimate of pipeline depth.

  let Itineraries = CortexA9Itineraries;

  // FIXME: Many vector operations were never given an itinerary. We
  // haven't mapped these to the new model either.
  let CompleteModel = 0;
}

I’m guessing this is probably the way forward for my case since Itineraries seem to be already mostly defined.
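
As a rough sketch of what that hybrid could look like for a simple in-order target: MyTargetItineraries, MyTargetItinModel and the CPU name "mycpu" are hypothetical placeholders, and the numbers are assumptions rather than measured values. The fragment assumes it sits in a target .td file that already includes llvm/Target/Target.td.

```tablegen
// Sketch only: assumes MyTargetItineraries is an existing
// ProcessorItineraries def elsewhere in the target's .td files.
def MyTargetItinModel : SchedMachineModel {
  let IssueWidth = 1;                     // assumed single-issue pipeline
  let Itineraries = MyTargetItineraries;  // reuse the itineraries already written
  let CompleteModel = 0;                  // not every instruction is described yet
}

// Attach the model to a (hypothetical) CPU name with no extra features.
def : ProcessorModel<"mycpu", MyTargetItinModel, []>;
```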

Phil

Hi Phil.

You more or less answered your own question, but let me give you some more info. Maybe it is of use.

From what I understand the SchedMachineModel is the future, although it is not as powerful as itineraries at present. The mi-scheduler is mostly developed around out-of-order cores, I believe (I’d love to hear arguments to the contrary). Some of the constraints that can be found in in-order micro-architectures cannot be expressed in the per-operand scheduling model, and the heuristics of the pre-RA scheduling pass are probably a bit too focussed on register pressure for in-order cores (I have no numbers, just hearsay).

There is some documentation in comments at the start of include/llvm/Target/TargetSchedule.td that you might find useful. If you are going to look at an existing scheduling model, I suggest looking at an in-order core. A good example would be AArch64/AArch64SchedA53.td. If itineraries are present, they are used by the mi-scheduler next to the SchedMachineModel to detect hazards. I think that is the only place where the mi-scheduler uses itineraries.

There are some magic numbers you need for in-order operation. Most notably, MicroOpBufferSize should be set to 0 for full in-order behaviour. You also want to set CompleteModel to 0, as that prevents asserts due to instructions without scheduling information. There is a script that might help you visualise whether you have provided scheduling information in the SchedMachineModel for all instructions (utils/schedcover.py). It is very simplistic and takes as input the debug output of tablegen. There are some usage comments at the beginning.
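
A minimal sketch of those settings, assuming a single-issue core: the name MyTargetInOrderModel is hypothetical, and every number other than MicroOpBufferSize and CompleteModel is a made-up placeholder that would have to come from the actual pipeline.

```tablegen
// Hypothetical in-order model; only MicroOpBufferSize = 0 and
// CompleteModel = 0 come from the advice above, the rest are assumptions.
def MyTargetInOrderModel : SchedMachineModel {
  let IssueWidth = 1;         // assumed: one instruction issued per cycle
  let MicroOpBufferSize = 0;  // 0 = no micro-op buffering, full in-order behaviour
  let LoadLatency = 2;        // assumed load-to-use latency
  let MispredictPenalty = 4;  // assumed branch mispredict cost in cycles
  let CompleteModel = 0;      // don't assert on instructions lacking sched info
}
```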

Regards,
Christof

Christof,

Thanks for the reply. Comments below:

Hi Phil.

That schedcover.py script is only useful for the per-operand machine model and is rather new. I have never tried to debug itineraries, so I can’t help you with that. I also cannot compare the different scheduler passes, sorry.

You can get debug info from a debug build of llvm by using -debug or -debug-only=. For example, -debug-only=misched will give you only the machine scheduler info. If the machine scheduler uses itineraries, you’ll see a line “Using scoreboard hazard recognizer” in the debug output. For other scheduling passes, I don’t know.

See http://llvm.org/docs/ProgrammersManual.html#the-debug-macro-and-debug-option for how debug info is controlled per pass.

Regards,
Christof

Having itinerary data should be enough for an instruction to count as covered for the “CompleteModel” case. I’d highly recommend aiming for “CompleteModel = 1” in your targets, because it is easy to forget new instructions. It should also not be complicated to add empty scheduling information to a node as a temporary measure for cases where you have a reason not to provide scheduling information.
schedcover.py is indeed a nice tool for getting an overview of your scheduling information. If schedcover.py shows no empty cells, then “CompleteModel = 1” should work as well.
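
One way such temporary "empty" scheduling information might look in the per-operand model is sketched below. All names here (MyTargetInOrderModel, WriteALU, WriteTODO, MyUnitALU, FOO, BAR) are hypothetical, the model is assumed to be a SchedMachineModel defined elsewhere, and this is only one possible reading of "empty", not necessarily the exact form meant above.

```tablegen
// Hypothetical write types; a real target would define one per operand class.
def WriteALU  : SchedWrite;
def WriteTODO : SchedWrite;  // placeholder for instructions not modelled yet

let SchedModel = MyTargetInOrderModel in {
  // One assumed ALU pipeline.
  def MyUnitALU : ProcResource<1>;

  // Real entry: ALU results are available after 1 cycle (assumed).
  def : WriteRes<WriteALU, [MyUnitALU]> { let Latency = 1; }

  // Placeholder entry: consumes no resources, default latency.
  def : WriteRes<WriteTODO, []>;

  // Map the not-yet-modelled instructions (hypothetical FOO and BAR)
  // onto the placeholder so they still count as covered.
  def : InstRW<[WriteTODO], (instrs FOO, BAR)>;
}
```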

- Matthias