IC value with reduction on LoopVectorizationCostModel::selectInterleaveCount

Hi all,

I am trying to understand on what basis the loop vectorizer's choice of
interleave count (IC) for loops with reductions was added:


// Interleave if we vectorized this loop and there is a reduction that could
// benefit from interleaving.
if (VF > 1 && !Legal->getReductionVars()->empty()) {
  LLVM_DEBUG(dbgs() << "LV: Interleaving because of reductions.\n");
  return IC;
}

The IC in this context will be within [1, MaxInterleaveCount], and
MaxInterleaveCount is set based on target defaults. The issue is with
unbounded loops (where the trip count can't be inferred) for which
vectorization is beneficial even for small element counts: the loop
vectorizer will use the architecture-defined IC. And if the arch-defined IC
is higher than 2, the vectorized code path won't be taken for element counts
less than IC*VF.
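
To make the IC*VF threshold concrete, here is a minimal sketch of how a trip
count splits between the vector body and the scalar remainder. It is a
simplified model, not code from LLVM: the names Split and splitTripCount are
hypothetical, and it assumes no tail folding and no vectorized epilogue, so
the vector body consumes VF*IC elements per iteration and is skipped when
fewer than that remain.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: how many elements each code path handles.
struct Split {
  std::size_t vectorElems; // elements processed by the interleaved vector body
  std::size_t scalarElems; // elements left for the scalar remainder loop
};

// Simplified model (assumption: no tail folding, no epilogue vectorization).
// Each vector-body iteration consumes vf * ic elements, so any trip count
// below vf * ic never enters the vector body at all.
Split splitTripCount(std::size_t n, std::size_t vf, std::size_t ic) {
  assert(vf > 0 && ic > 0 && "VF and IC must be positive");
  const std::size_t step = vf * ic;
  const std::size_t vectorElems = (n / step) * step;
  return {vectorElems, n - vectorElems};
}
```

Under this model, with VF = 4 and 12 elements, IC = 1 sends all 12 elements
through the vector body, IC = 2 sends 8, and IC = 4 sends none, since 12 is
less than IC*VF = 16.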

For instance, consider the following code snippet: