Much experience has taught me not to trust register allocation papers. They never actually talk about performance. If I were a reviewer, I might accept a paper based on the novelty of its algorithm (far too many papers are rejected simply because they can't show a 20% speedup), but I wouldn't give points for reducing the number of spills and reloads.
Those counts simply don't mean anything in the real world.
Sorry to barge in on this thread, but that last sentence there caught my eye.
This certainly is the case with desktop CPUs, where the hardware designers
have gone to a lot of bother adding hardware to perform dynamic
rescheduling and register renaming, which effectively replace these stack
accesses with registers or accesses to fast cache.
But, with upcoming architectures - particularly ones with a very large
number of cores (e.g. something along the lines of Larrabee or Ambric,
and a plethora of others) - such hardware is too costly. As a result,
needless stack activity consumes available memory bandwidth, which
absolutely hammers instruction-level parallelism.
I certainly agree that a fast default register allocator is the best
strategy for LLVM, considering its main use. But it would be very nice to
have an optional allocator that does minimise spilling, at the cost of increased compilation time.
I thought it best to raise awareness of where common assumptions (which
certainly arose for good reason) can break down in real-world
situations, so that progress can be made.