The VLIW-aware pre-allocation scheduler used by Hexagon is a mostly-generic alternative to the default MachineScheduler implementation that attempts to order instructions so as to maximize the number of instructions issued per cycle. It does this by tracking available resources (via the DFAPacketizer) and balancing register pressure. This is a departure from the default list scheduler, which has no resource tracking and does not attempt to model instructions executing in parallel, instead optimizing the straight-line path length of the DAG.
I intend to generalize these data structures by encapsulating Hexagon-specific behavior in virtual overloads of a generic API. There are only a couple of places where target-specific behavior is explicit; these can be identified quickly by uses of HexagonInstrInfo through the identifier ‘QII’. The algorithm itself is likely tuned specifically for Hexagon, but is still generally applicable.
The port is relatively straightforward, requiring only three overrides. Testing is less well defined at the moment. I’m currently running the full LLVM regression suite, but a change of this type is more likely to introduce performance regressions than correctness failures.
Are there any opinions on lifting this pass?
Are there any publicly available Hexagon benchmarks with which I can verify my changes? Alternatively, is anyone on the Hexagon development team available to test such a change internally?