Question about method CodeExtractor::severSplitPHINodes

Hi folks,

Hope this is not a silly question. But it bothers me when I am thinking about it.

My question is:

  1. In the implementation of serverSplitPHINodes(), why it only checks the first PHI node for possible
    multiple inputs from outside the region to extract. There could be more than one PHI nodes in the header
    block, and the code only checks the first one. I don’t quite get it. What if the first PHI node contains only
    incoming values from inside the region, while subsequent PHI nodes have inputs from outside? Is that possible?

  2. Oddly, later in the code, it creates a corresponding new PHI node in the newBB for each PHI in the old header block. And
    then it adds the old PHI node as an incoming value to the newPHI node. This is Okay for the first PHI node of original header
    as it’s been proved to have multiple inputs from outside the region earlier. But it does not make sense for subsequent PHI nodes since they
    may contain only incoming values from inside. Then what’s the point of adding it as an incoming value to the new PHI when it
    doesn’t even have anything from outside. This looks redundant to me.

I could only guess it has something to do with the way PHI nodes get inserted in LLVM. Any help is much appreciated!!!

/// severSplitPHINodes - If a PHI node has multiple inputs from outside of the
00185 /// region, we need to split the entry block of the region so that the PHI node
00186 /// is easier to deal with.
00187 void CodeExtractor::severSplitPHINodes([BasicBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html) *&Header) {
00188   unsigned NumPredsFromRegion = 0;
00189   unsigned NumPredsOutsideRegion = 0;
00190 
00191   if (Header != &Header->[getParent](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#aca229503e4f5c83a187a6a921c625fa8)()->[getEntryBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1Function.html#a30f2c362631e3728d2f47a8203071ade)()) {
00192     [PHINode](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html) *PN = [dyn_cast](http://llvm.org/docs/doxygen/html/namespacellvm.html#ad6dbabbbb9495cf1501d64a5c71729ed)<[PHINode](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html)>(Header->[begin](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#a0ed5f3ab3c2e4196ee0cffffa080c062)());
00193     if (!PN) return;  // No PHI nodes.
00194 
00195     // If the header node contains any PHI nodes, check to see if there is more
00196     // than one entry from outside the region.  If so, we need to sever the
00197     // header block into two.
00198     for (unsigned i = 0, e = PN->[getNumIncomingValues](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#aa45f6c0433576e3858a6209a43750ad4)(); i != e; ++i)
00199       if (Blocks.[count](http://llvm.org/docs/doxygen/html/classllvm_1_1SetVector.html#a0fd2953d62c1b1cabb87e420be5177c4)(PN->[getIncomingBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#a4c25b6c00c4867281779c81ab64d2081)(i)))
00200         ++NumPredsFromRegion;
00201       else
00202         ++NumPredsOutsideRegion;
00203 
00204     // If there is one (or fewer) predecessor from outside the region, we don't
00205     // need to do anything special.
00206     if (NumPredsOutsideRegion <= 1) return;
00207   }
00208 
00209   // Otherwise, we need to split the header block into two pieces: one
00210   // containing PHI nodes merging values from outside of the region, and a
00211   // second that contains all of the code for the block and merges back any
00212   // incoming values from inside of the region.
00213   [BasicBlock::iterator](http://llvm.org/docs/doxygen/html/classllvm_1_1ilist__iterator.html) AfterPHIs = Header->[getFirstNonPHI](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#a0e73f4e09745bb69fdd7b15232c45428)();
00214   [BasicBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html) *NewBB = Header->[splitBasicBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#a19445f836d9e1ecb32cba27ec4338fff)(AfterPHIs,
00215                                               Header->[getName](http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html#ad452febc1ac0b394876e640ec03ffa38)()+".ce");
00216 
00217   // We only want to code extract the second block now, and it becomes the new
00218   // header of the region.
00219   [BasicBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html) *OldPred = Header;
00220   Blocks.[remove](http://llvm.org/docs/doxygen/html/classllvm_1_1SetVector.html#a6357ff5654c7cbe5fed8516dc8328eb4)(OldPred);
00221   Blocks.[insert](http://llvm.org/docs/doxygen/html/classllvm_1_1SetVector.html#a72d928b7fc2c5f2d56c6ac0265fd9c6e)(NewBB);
00222   Header = NewBB;
00223 
00224   // Okay, update dominator sets. The blocks that dominate the new one are the
00225   // blocks that dominate TIBB plus the new block itself.
00226   if (DT)
00227     DT->[splitBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1DominatorTree.html#a3d5044cf4966abb56510f1883d275745)(NewBB);
00228 
00229   // Okay, now we need to adjust the PHI nodes and any branches from within the
00230   // region to go to the new header block instead of the old header block.
00231   if (NumPredsFromRegion) {
00232     [PHINode](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html) *PN = cast<PHINode>(OldPred->[begin](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#a0ed5f3ab3c2e4196ee0cffffa080c062)());
00233     // Loop over all of the predecessors of OldPred that are in the region,
00234     // changing them to branch to NewBB instead.
00235     for (unsigned i = 0, e = PN->[getNumIncomingValues](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#aa45f6c0433576e3858a6209a43750ad4)(); i != e; ++i)
00236       if (Blocks.[count](http://llvm.org/docs/doxygen/html/classllvm_1_1SetVector.html#a0fd2953d62c1b1cabb87e420be5177c4)(PN->[getIncomingBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#a4c25b6c00c4867281779c81ab64d2081)(i))) {
00237         [TerminatorInst](http://llvm.org/docs/doxygen/html/classllvm_1_1TerminatorInst.html) *TI = PN->[getIncomingBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#a4c25b6c00c4867281779c81ab64d2081)(i)->[getTerminator](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#a5cb76a65b6524dba1493dd2b9dc3abbe)();
00238         TI->[replaceUsesOfWith](http://llvm.org/docs/doxygen/html/classllvm_1_1User.html#a1f0b9358936e3e00c42a460abbfb2868)(OldPred, NewBB);
00239       }
00240 
00241     // Okay, everything within the region is now branching to the right block, we
00242     // just have to update the PHI nodes now, inserting PHI nodes into NewBB.
00243     for (AfterPHIs = OldPred->[begin](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#a0ed5f3ab3c2e4196ee0cffffa080c062)(); isa<PHINode>(AfterPHIs); ++AfterPHIs) {
00244       [PHINode](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html) *PN = cast<PHINode>(AfterPHIs);
00245       // Create a new PHI node in the new region, which has an incoming value
00246       // from OldPred of PN.
00247       [PHINode](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html) *NewPN = [PHINode::Create](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#afb5e83bf5123ff7c51058eb0ebebcdc6)(PN->[getType](http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html#a0cf3748dba54f931bb1241ae4adc76bc)(), 1 + NumPredsFromRegion,
00248                                        PN->[getName](http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html#ad452febc1ac0b394876e640ec03ffa38)()+".ce", NewBB->[begin](http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html#a0ed5f3ab3c2e4196ee0cffffa080c062)());
00249       NewPN->[addIncoming](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#a089cccb6f231efee72abc76d0f9c695f)(PN, OldPred);
00250 
00251       // Loop over all of the incoming value in PN, moving them to NewPN if they
00252       // are from the extracted region.
00253       for (unsigned i = 0; i != PN->[getNumIncomingValues](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#aa45f6c0433576e3858a6209a43750ad4)(); ++i) {
00254         if (Blocks.[count](http://llvm.org/docs/doxygen/html/classllvm_1_1SetVector.html#a0fd2953d62c1b1cabb87e420be5177c4)(PN->[getIncomingBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#a4c25b6c00c4867281779c81ab64d2081)(i))) {
00255           NewPN->addIncoming(PN->[getIncomingValue](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#aba6a4cc4ed6d6fef3664b8d65ef04820)(i), PN->[getIncomingBlock](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#a4c25b6c00c4867281779c81ab64d2081)(i));
00256           PN->[removeIncomingValue](http://llvm.org/docs/doxygen/html/classllvm_1_1PHINode.html#a6f01dbe965b38186b1a78378689d4105)(i);
00257           --i;
00258         }
00259       }
00260     }
00261   }
00262 }

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu]
On Behalf Of Wei Dang
Subject: [LLVMdev] Question about method CodeExtractor::severSplitPHINodes

In the implementation of serverSplitPHINodes(), why it only checks the
first PHI node for possible multiple inputs from outside the region to
extract. There could be more than one PHI nodes in the header block,
and the code only checks the first one. I don't quite get it. What if
the first PHI node contains only incoming values from inside the region,
while subsequent PHI nodes have inputs from outside? Is that possible?

No, it's not possible. All PHI nodes must include inputs for _all_ predecessors of the block, so only one PHI node needs to checked for the origins.

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.

Thanks for reply Chuck.
Please excuse me if I’m not supposed to reply to all.

Are you saying all PHI nodes are required to include all its predecessor blocks no matter they have input or not?
What about successor blocks? Are they optional if they don’t provide inputs?

BTW, where should I look at to verify this, the mem2reg.cpp & PromoteMemToReg.cpp?
Thanks a lot.

From: Wei Dang [mailto:jacdang@gmail.com]
Subject: Re: [LLVMdev] Question about method CodeExtractor::severSplitPHINodes

Please excuse me if I'm not supposed to reply to all.

You should do reply-all, to make sure the list sees all of the thread.

Are you saying all PHI nodes are required to include all its predecessor blocks
no matter they have input or not?

That's the way SSA works - each input to a block must be provided by all paths leading to the block. That must be from each immediate predecessor or from a common predecessor of two or more immediate predecessors that do not mutate the value.

What about successor blocks? Are they optional if they don't provide inputs?

The question doesn't make sense; we're only concerned with the predecessors here. If you have a loop, some successor (possibly the block of interest) is also a predecessor and must conform to SSA usage.

where should I look at to verify this, the mem2reg.cpp & PromoteMemToReg.cpp?

It's an inherent characteristic of SSA:

(or any of numerous textbooks and papers).

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.

I appreciate the explanation Chuck. I guess I know where I got it wrong,
I thought phi node only considers incoming edges where there is a value flowing in.
I guess the truth is that any phi node should consider all incoming edges whether or not
there is an incoming value (or a ‘real’ defined incoming value)
For those edges without a real defined value, phi nodes just put undef as their value for that edge.
Am I right? This makes a lot sense now.