Hi, Andrew,
Thank you for answering my question.
What’s the status of misched? Is it experimental? I found it is disabled by default for all architectures (3.4svn). I also don’t understand the algorithm. Could you point me to more papers or other material about your approach? It seems that you want to balance register pressure and ILP in misched.
It has been used in production for a year. It’s currently enabled on trunk for PPC, R600, and Hexagon. If there are no objections, I’d like to move x86 and armv7 over ASAP. Leaving it disabled is becoming more of a maintenance burden.
Please see my llvm-dev list messages to Ghassan yesterday. MI Scheduler is a pass that just provides a place to do scheduling and a large toolbox to do it with. ScheduleDAGMI is a list scheduler driver, and the GenericScheduler strategy attempts to balance register pressure with latency. In my opinion, getting the right register pressure vs. latency balance is easy to do at a given point in time for a small benchmark suite, but very, very hard to do in general with a design that works across microarchitectures and is resilient to changes to incoming IR. GenericScheduler doesn’t magically solve this problem, but it should never do anything too terrible either.
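To make the driver/strategy split a bit more concrete, here is a toy sketch in Python. It is not LLVM code and does not reproduce GenericScheduler's actual heuristics. Every name in it (Node, pressure_delta, list_schedule, example_dag) is made up for illustration, and the "strategy" is reduced to one rule: follow the critical path until the live-register count reaches a limit, then prefer whatever lowers pressure.

```python
# Toy list scheduler. NOT LLVM code; all names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    latency: int
    preds: list = field(default_factory=list)  # names of nodes this one depends on
    pressure_delta: int = 0  # net change in live registers once scheduled


def heights(nodes):
    """Longest latency path from each node to the end of the DAG."""
    by_name = {n.name: n for n in nodes}
    succs = {n.name: [] for n in nodes}
    for n in nodes:
        for p in n.preds:
            succs[p].append(n.name)
    memo = {}

    def height(name):
        if name not in memo:
            memo[name] = by_name[name].latency + max(
                (height(s) for s in succs[name]), default=0)
        return memo[name]

    return {n.name: height(n.name) for n in nodes}


def list_schedule(nodes, pressure_limit):
    """Driver: maintains the ready list. The pick rule below is the 'strategy'."""
    h = heights(nodes)
    done, order, live = set(), [], 0
    remaining = list(nodes)
    while remaining:
        ready = [n for n in remaining if all(p in done for p in n.preds)]
        if live >= pressure_limit:
            # At the limit: take the node that lowers pressure the most.
            pick = min(ready, key=lambda n: (n.pressure_delta, -h[n.name]))
        else:
            # Otherwise follow the critical path.
            pick = max(ready, key=lambda n: (h[n.name], -n.pressure_delta))
        order.append(pick.name)
        done.add(pick.name)
        live += pick.pressure_delta
        remaining.remove(pick)
    return order


def example_dag():
    # Three loads feeding a chain of two adds.
    return [
        Node("ld1", 3, [], +1),
        Node("ld2", 3, [], +1),
        Node("ld3", 3, [], +1),
        Node("add1", 1, ["ld1", "ld2"], -1),
        Node("add2", 1, ["add1", "ld3"], -1),
    ]
```

With a generous limit, list_schedule(example_dag(), 10) issues all three loads first. With a limit of 2 it schedules add1 before ld3 so that no more than two values are live at once. The hard part described above is choosing that limit and the tie-breaks so they hold up across targets and incoming IR, which this sketch does not attempt.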
The old itineraries allow specifying which resources are used in each pipeline stage. It’s a full matrix.
In the new machine model, you only specify the resources and number of cycles. It can be implemented with simple counters. This works in practice because it’s almost always the case that different instructions begin using a given resource at the same time relative to when the instruction is executed. Even the VLIW implementation I’ve seen in trunk could have used the new model.
It’s efficient because the scheduler doesn’t need to manage a reservation table or build a state machine.
It’s more flexible because predicates allow instructions to be modeled differently based on opcode extensions or immediate values.
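As a rough illustration of "resources and number of cycles, implemented with simple counters", here is a toy Python sketch. It is not how LLVM implements the machine model. The resource names, instruction classes, and the CounterModel class are all invented for the example, and it assumes every resource is occupied starting at the issue cycle, which is the simplification described above.

```python
# Toy counter-based resource model. NOT LLVM's implementation.
# Resource names and instruction classes below are made up.

# Each instruction class lists (resource, cycles the resource stays busy).
WRITES = {
    "alu": [("ALU", 1)],
    "mul": [("ALU", 1), ("MUL", 3)],
    "load": [("LSU", 2)],
}


class CounterModel:
    def __init__(self):
        # resource -> first cycle at which it is free again
        self.next_free = {}

    def earliest_issue(self, kind, cycle):
        """First cycle >= `cycle` at which every needed resource is free."""
        return max([cycle] + [self.next_free.get(res, 0)
                              for res, _ in WRITES[kind]])

    def issue(self, kind, cycle):
        """Issue in order at the earliest legal cycle and bump the counters."""
        at = self.earliest_issue(kind, cycle)
        for res, busy in WRITES[kind]:
            self.next_free[res] = at + busy
        return at
```

There is one integer per resource and no per-stage table. For example, after issuing a "mul" at cycle 0, an "alu" can issue at cycle 1, but a second "mul" requested at cycle 2 is pushed to cycle 3, when the hypothetical MUL unit frees up.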
The postRA hazard that you’re talking about is the job of the dependence graph builder. That is the same for both post-RA and MI sched. The only difference is that when the DAG builder runs before regalloc, it also has to handle virtual registers.
The best way for me to explain how to define a machine model for an in-order processor would be to work with someone who is ready to migrate mips or a simple ppc, arm, or x86 (atom) implementation and improve the docs along the way.
We’re also lacking a model for AVX!
-Andy