Scheduler Integration Questions

Hello llvm-dev,

I’m doing some experimentation on instruction scheduling and would like to use LLVM as a testbed, by integrating our existing (compiler-agnostic) scheduler into it. I have tinkered enough with the LLVM code to know how to create and run a new scheduler, access the DAG and target info, etc. However, I’ve come upon some questions that I have been unable to answer so far, and I’m hoping you could help me with them.

We’re mainly targeting x86, and to a lesser extent SPARC. The target definitions for these two platforms seem to indicate that neither has any functional units or issue slots/rates defined. Likewise, when running some scheduling tests, all nodes in the DAGs have a latency of 1. Our assumption is that for x86 it was deemed unimportant to do proper scheduling in software, since the architecture does a lot of scheduling in hardware, or it was deemed too difficult to define functional units and latencies for every subtarget in the x86 family. As for SPARC, it seems to us that LLVM support for it is generally lacking (e.g. the 16 extra non-overlapping DFP registers are not defined for SPARCv9). Could you please confirm these guesses? Are there any plans to eliminate these deficiencies?

Another question I have is regarding register pressure estimates. I was wondering how to go about tracking the number of registers available on a target. The current schedulers (as of LLVM 2.8) use getRegPressureLimit(), but from what I can see the limits are rather rough - e.g. x86 gives a limit of 4 GP32 registers (presumably E[ABCD]X), even though, to the best of my knowledge, there are more generally available for allocation (e.g. SI, DI). It is also only taking the “top” register classes into account (e.g. counting a usage of AL and AH as two 32-bit registers). Are we interpreting these limits incorrectly? Can you suggest a better way to estimate them?

Thank you!

Regards,
Max
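
To illustrate why uniform latencies matter for the question above, here is a minimal C++ sketch. It is not LLVM code: the Node type, the criticalPath function, and the latency numbers used with it are hypothetical stand-ins for a scheduling DAG. It computes the longest dependence chain in cycles, which is the quantity a latency-aware list scheduler typically prioritizes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scheduling DAG node; not an LLVM type.
struct Node {
    unsigned latency;               // cycles until the result is available
    std::vector<std::size_t> succs; // indices of the nodes that depend on this one
};

// Length in cycles of the longest dependence chain in the DAG.
// Assumes nodes are stored in topological order (every successor has a
// larger index than its predecessor), so one reverse pass is enough.
unsigned criticalPath(const std::vector<Node> &dag) {
    std::vector<unsigned> height(dag.size(), 0);
    unsigned longest = 0;
    for (std::size_t i = dag.size(); i-- > 0;) {
        unsigned below = 0;
        for (std::size_t s : dag[i].succs) {
            assert(s > i && s < dag.size() && "nodes must be in topological order");
            below = std::max(below, height[s]);
        }
        height[i] = dag[i].latency + below;
        longest = std::max(longest, height[i]);
    }
    return longest;
}
```

Take two independent chains, load -> add and add -> add -> add. With every latency set to 1, the three-add chain looks critical (3 cycles against 2). If the load is given an illustrative latency of 3, the load chain becomes the critical one (4 cycles against 3), so the scheduler's priorities change.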

> Hello llvm-dev,
>
> I'm doing some experimentation on instruction scheduling and would like to use LLVM as a testbed, by integrating our existing (compiler-agnostic) scheduler into it. I have tinkered enough with the LLVM code to know how to create and run a new scheduler, access the DAG and target info, etc. However, I've come upon some questions that I have been unable to answer so far, and I'm hoping you could help me with them.
>
> We're mainly targeting x86, and to a lesser extent SPARC. The target definitions for these two platforms seem to indicate that neither has any functional units or issue slots/rates defined. Likewise, when running some scheduling tests, all nodes in the DAGs have a latency of 1. Our assumption is that for x86 it was deemed unimportant to do proper scheduling in software, since the architecture does a lot of scheduling in hardware, or it was deemed too difficult to define functional units and latencies for every subtarget in the x86 family. As for SPARC, it seems to us that LLVM support for it is generally lacking (e.g. the 16 extra non-overlapping DFP registers are not defined for SPARCv9). Could you please confirm these guesses? Are there any plans to eliminate these deficiencies?

There aren't any functional unit definitions at the moment that are used in scheduling. Patches would be welcome.

> Another question I have is regarding register pressure estimates. I was wondering how to go about tracking the number of registers available on a target. The current schedulers (as of LLVM 2.8) use getRegPressureLimit(), but from what I can see the limits are rather rough - e.g. x86 gives a limit of 4 GP32 registers (presumably E[ABCD]X), even though, to the best of my knowledge, there are more generally available for allocation (e.g. SI, DI). It is also only taking the "top" register classes into account (e.g. counting a usage of AL and AH as two 32-bit registers). Are we interpreting these limits incorrectly? Can you suggest a better way to estimate them?

This has changed in llvm 2.9 a bit to be more accurate. However, we don't attempt to use a single register for more than one value. I suggest using ToT and reading the code in ScheduleDAGRRList.cpp.

-eric
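
The difference between charging one register per value and letting two 8-bit values share a register can be shown with a toy C++ model. This is only a sketch under stated assumptions: the Width enum and both functions are hypothetical, it does not reproduce what ScheduleDAGRRList.cpp or getRegPressureLimit() actually compute, and the packed estimate optimistically assumes any two 8-bit values can pair up as the low and high halves of one register (as AL and AH do in EAX).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical width tag for a live value; not an LLVM register class.
enum class Width { Bits8, Bits32 };

// Conservative estimate: every live value occupies a whole 32-bit register,
// so AL and AH together count as two registers.
std::size_t pressureOnePerValue(const std::vector<Width> &live) {
    return live.size();
}

// Optimistic estimate: two 8-bit values may share one 32-bit register
// (low and high byte), so 8-bit values are charged in pairs, rounded up.
std::size_t pressurePacked(const std::vector<Width> &live) {
    std::size_t narrow = 0;
    for (Width w : live)
        if (w == Width::Bits8)
            ++narrow;
    std::size_t wide = live.size() - narrow;
    return wide + (narrow + 1) / 2;
}
```

For two live 8-bit values and one live 32-bit value, the first function reports 3 registers and the second reports 2. Against a limit as small as the 4 GP32 registers mentioned above, that gap can decide whether a scheduler believes it is close to spilling.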

Hi Eric,

Thanks for your reply.

> There aren't any functional unit definitions at the moment that are used in scheduling. Patches would be welcome.

That's a pity. I'm not sure whether we can manage to add proper
functional unit and latency support by ourselves but I'll post on the
list if anything comes out of it.

> This has changed in llvm 2.9 a bit to be more accurate. However, we don't attempt to use a single register for more than one value. I suggest using ToT and reading the code in ScheduleDAGRRList.cpp.

I've skimmed the changes and that does look more precise. Thanks.

Regards,
Max