Hi all,
Why is partial unrolling off by default? Did enabling partial unrolling result in any significant runtime or compile time regressions in any benchmarks?
It's not off by default on all targets; it's controlled by the getUnrollingPreferences() target hook. The threshold that makes sense tends to be target-specific (especially for CPUs that have special optimizations for small loops).
-Eli
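
To make the hook mechanism above concrete, here is a minimal, self-contained sketch of how a per-target override could turn partial unrolling on. It is not LLVM's actual code: the struct below is a simplified stand-in for the real unrolling preferences, the hook signature is reduced to a single parameter, and the names GenericTargetInfo, SmallLoopBufferTargetInfo, and choosePartialUnrollCount, as well as the threshold value 32, are assumptions made up for illustration.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for LLVM's unrolling preferences.
// The real structure carries many more fields.
struct UnrollingPreferences {
  bool Partial = false;          // is partial unrolling allowed?
  unsigned PartialThreshold = 0; // size budget for the unrolled loop body
};

// Generic target (hypothetical): keeps the conservative defaults,
// so partial unrolling stays off.
struct GenericTargetInfo {
  virtual ~GenericTargetInfo() = default;
  virtual void getUnrollingPreferences(UnrollingPreferences &) const {}
};

// Hypothetical target that benefits from keeping loops small, e.g. a CPU
// with special handling for short loops. The value 32 is illustrative only.
struct SmallLoopBufferTargetInfo : GenericTargetInfo {
  void getUnrollingPreferences(UnrollingPreferences &UP) const override {
    UP.Partial = true;
    UP.PartialThreshold = 32;
  }
};

// Toy cost model (an assumption, not the real unroller): pick the largest
// count whose unrolled body still fits in the target's budget.
unsigned choosePartialUnrollCount(const GenericTargetInfo &TTI,
                                  unsigned LoopSize) {
  assert(LoopSize > 0 && "loop body must contain at least one instruction");
  UnrollingPreferences UP;
  TTI.getUnrollingPreferences(UP); // the target decides, not the pass
  if (!UP.Partial)
    return 1; // partial unrolling disabled on this target
  unsigned Count = UP.PartialThreshold / LoopSize;
  return Count > 1 ? Count : 1;
}
```

In this sketch the same 8-instruction loop is left alone on the generic target and unrolled by 4 on the target that opts in, which is the sense in which the default is per-target rather than global.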