Implementing modulo variable expansion for MachinePipeliner

Thanks for the information.
I understand that pre-pipeline unrolling is also needed to fill slots (i.e., to reduce the proportion of loop-control instructions) and to split accumulations, neither of which can be achieved by post-pipeline unrolling.
I think both could be achieved by interleaving in the LoopVectorize pass. However, some improvements may be needed to ensure that it selects the best interleave factors.
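
To make "splitting accumulations" concrete, here is a minimal hand-written sketch of what interleaving by 2 does to a reduction. The function names and the factor of 2 are just my illustration, not what LoopVectorize actually emits. I use unsigned integers so that the reordering is exact; for floating point the same transformation would require reassociation to be permitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Single accumulator: each add depends on the previous one, so the
// recurrence through `acc` limits how tightly iterations can overlap.
std::uint64_t sum_serial(const std::vector<std::uint64_t> &v) {
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    acc += v[i];
  return acc;
}

// Hypothetical hand-interleaved version (factor 2): two independent
// accumulators split the dependence chain, and the loop-control
// overhead is paid once per two elements instead of once per element.
std::uint64_t sum_interleaved2(const std::vector<std::uint64_t> &v) {
  std::uint64_t acc0 = 0, acc1 = 0;
  const std::size_t n = v.size();
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    acc0 += v[i];
    acc1 += v[i + 1];
  }
  if (i < n) // remainder element when n is odd
    acc0 += v[i];
  return acc0 + acc1; // combine the partial sums after the loop
}
```

In Clang, a specific factor can be requested per loop with `#pragma clang loop interleave_count(2)`, which may be useful for experimenting with factor selection.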