In my experience bottom-up works better when optimizing for register pressure while top-down tends to work better when optimizing for latencies/stall reduction.
That’s a very interesting insight, do you mind sharing the intuition behind it?
Thanks,
Alex
In the broadest generalization of the scheduling problem, if your DAG is shaped like a tree, then the number of nodes available at the beginning and end of scheduling depends on the direction in which you schedule.
If you schedule the “pointy” end of the tree first (aka the root), then there aren’t many choices at first, and the leaves gradually become available toward the end of scheduling. If you schedule the leaves first, then the heuristic must choose from many available nodes at once.
When scheduling for resource utilization, it’s hard to correctly choose among many nodes that use the same resources. Scheduling the root first means less chance to get things wrong early on. In fact, if the DAG is strictly a tree, then bottom-up scheduling with Sethi-Ullman numbers easily minimizes registers. Of course, DAGs aren’t actually trees, so there’s no clear right or wrong direction to schedule, but bottom-up tends to be pretty effective at minimizing registers if that’s the main constraint.
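For concreteness, here’s a minimal sketch of Sethi-Ullman labeling on an expression tree. The ExprNode type and labelSethiUllman name are made up for illustration, not taken from any particular scheduler; a bottom-up list scheduler could use these labels as its priority (among ready nodes, prefer the one whose subtree needs the most registers), which reproduces the register-minimal evaluation order on a pure tree.

// Illustrative only: "ExprNode" and "labelSethiUllman" are invented names,
// not part of any real scheduler's API.
#include <algorithm>
#include <functional>
#include <vector>

struct ExprNode {
  std::vector<ExprNode*> operands;  // children in the expression tree
  int suNumber = 0;                 // registers needed to evaluate this subtree
};

// Classic (generalized) Sethi-Ullman labeling: a leaf needs one register;
// an interior node evaluates its most register-hungry operand first, and
// each operand evaluated later must also keep the earlier results live.
int labelSethiUllman(ExprNode *n) {
  if (n->operands.empty())
    return n->suNumber = 1;

  std::vector<int> labels;
  for (ExprNode *op : n->operands)
    labels.push_back(labelSethiUllman(op));
  std::sort(labels.begin(), labels.end(), std::greater<int>());

  // Operand i (in descending-label order) is evaluated while i earlier
  // results are held in registers, so it needs labels[i] + i registers.
  int need = 0;
  for (int i = 0; i < (int)labels.size(); ++i)
    need = std::max(need, labels[i] + i);
  return n->suNumber = need;
}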
When scheduling for latency, you just need the critical path from each node, which is computed before scheduling begins and doesn’t change. So having more nodes available at first doesn’t hurt.
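To make the “computed once, never changes” point concrete, here’s a sketch of the usual latency-weighted critical-path (height) computation; the DAGNode type and computeHeight name are invented for this example. Since a node’s height depends only on the DAG, a top-down scheduler can compute it up front and simply pick the ready node with the greatest height.

// Illustrative only: "DAGNode" and "computeHeight" are invented names.
#include <algorithm>
#include <vector>

struct DAGNode {
  std::vector<DAGNode*> succs;  // nodes that consume this node's result
  int latency = 1;              // cycles until this node's result is ready
  int height = -1;              // latency-weighted critical path to a DAG exit
};

// Longest latency-weighted path from this node to any exit of the DAG.
// It never needs to be updated while scheduling, so having a large ready
// list at the start doesn't make the choice any harder.
int computeHeight(DAGNode *n) {
  if (n->height >= 0)
    return n->height;  // already computed (memoized)
  int below = 0;
  for (DAGNode *s : n->succs)
    below = std::max(below, computeHeight(s));
  return n->height = n->latency + below;
}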