[RFC] Delay Phi Operand Folding

Hi llvm-dev,

I am thinking of making InstCombine capable of turning the folding of Phi operands on and off. I am not entirely sure this is the right approach. My motivation is similar to https://reviews.llvm.org/D50723. I noticed that InstCombine tries to fold Phi operands and can therefore sink instructions. When this happens too early in the pass pipeline, it can prevent other optimization passes, such as GVN-Hoist, from being beneficial.

I was looking at a particular case where InstCombine would sink a chain of instructions from both sides of a diamond, except for the two geps at the beginning of the chain. Each side block had the exact same sequence of instructions, with the geps in reverse order. The Phis corresponding to the geps were then turned into selects by SimplifyCFG, resulting in a sub-optimal sequence in which the selects happen too early, presumably creating stalls in the execution pipeline:

IR just before InstCombine

                       /                    \
  Use(…(Use(load(gep 0))))      Use(…(Use(load(gep 1))))
  Use(…(Use(load(gep 1))))      Use(…(Use(load(gep 0))))
                       \                    /

IR after all passes

Use(…(Use(load(select(gep 0, gep 1)))))

Use(…(Use(load(select(gep 1, gep 0)))))

Applying my patch from https://reviews.llvm.org/D52568 allows GVN-Hoist to hoist the whole chain:

select(Use(…(Use(load(gep 0)))), Use(…(Use(load(gep 1)))))

select(Use(…(Use(load(gep 1)))), Use(…(Use(load(gep 0)))))
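To make the three shapes above a bit more concrete, here is a rough source-level sketch in C++. It is not taken from the benchmark I was looking at: chain() is a made-up stand-in for the Use(…(Use(x))) chain, and the function names diamond, select_early and select_late are hypothetical. All three functions compute the same values; they differ only in where the select sits relative to the chain, which is the thing the pass ordering ends up deciding.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for the instruction chain Use(...(Use(x))).
static int chain(int x) { return (x * 3 + 1) ^ 5; }

// Shape before InstCombine: a diamond whose two arms run the same chain
// on p[0] and p[1], in opposite order.
std::pair<int, int> diamond(bool c, const int *p) {
  int a, b;
  if (c) {
    a = chain(p[0]);
    b = chain(p[1]);
  } else {
    a = chain(p[1]);
    b = chain(p[0]);
  }
  return {a, b};
}

// Shape after sinking: the addresses are selected first, and the load
// plus the whole chain depend on the result of the select.
std::pair<int, int> select_early(bool c, const int *p) {
  const int *q0 = c ? &p[0] : &p[1];
  const int *q1 = c ? &p[1] : &p[0];
  return {chain(*q0), chain(*q1)};
}

// Shape after hoisting: both chains are computed unconditionally, and
// the select is applied to the final values only.
std::pair<int, int> select_late(bool c, const int *p) {
  int x0 = chain(p[0]);
  int x1 = chain(p[1]);
  return {c ? x0 : x1, c ? x1 : x0};
}
```

In select_early the chain cannot start until the select has resolved, whereas in select_late the two chains are independent of the condition. That is only an illustration of the dependency structure, not a claim about what any particular compiler emits for this source.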

Below are performance numbers reported by LNT for llvm-test-suite, spec2000, and spec2006 on Cortex-A57 (AArch64) at -O3, using a recent LLVM trunk revision with my patch applied.

Performance improvements in execution time: