RFC: Stop using redundant PHI node entries for multi-edge predecessors

Today, the IR requires that if you have multiple edges from A to B (typically with a switch) any phi nodes in B must have an equal number of entries for A, but that all of them must have the same value.

This seems rather annoying…

  1. It creates multiple uses of values in A for no apparently good reason
  2. It makes updating PHI nodes using sets of predecessors incredibly hard
  3. There is no correspondence between the edges and the PHI node entries other than number, making most of the code that does handle this very ad-hoc “count off N repetitions of the same thing” style code, which is brittle and hard to test effectively.
  4. I suspect bugs are quite common given that PHINode::removeIncomingValue accepts a basic block but only removes the first incoming value, leaving the others in place.

I can’t see any serious use case for this symmetry either. These aren’t uses (and can’t be), and there doesn’t seem to be any lost functionality if we only have a single entry.

It also seems likely to open up more efficient in-memory representations if it is possible at some point to build maps of these things.

Thoughts? It appears entirely possible to change the verifier to check that we don’t have duplicates, and to read bitcode with duplicates and merge them so that there is no backwards compatibility issue.

-Chandler

Today, the IR requires that if you have multiple edges from A to B
(typically with a switch) any phi nodes in B must have an equal number of
entries for A, but that all of them must have the same value.

This seems rather annoying....
1) It creates multiple uses of values in A for no apparently good reason
2) It makes updating PHI nodes using sets of predecessors incredibly hard
3) There is no correspondence between the edges and the PHI node entries
other than number, making most of the code that does handle this very
ad-hoc "count off N repetitions of the same thing" style code, which is
brittle and hard to test effectively.

Having written this code a few times, i'd agree.

4) I suspect bugs are quite common given that PHINode::removeIncomingValue
accepts a basic block but only removes the first incoming value, leaving
the others in place.

I can't see any serious use case for this symmetry either. These aren't
uses (and can't be), and there doesn't seem to be any lost functionality if
we only have a single entry.http://lists.llvm.org/cgi-bin/
mailman/listinfo/llvm-dev

It also seems likely to open up more efficient in-memory representations
if it is possible at some point to build maps of these things.

Thoughts?

So would this lead to a case where PHI->getNumOperands() !=
std::distance(pred_begin(phiblock), pred_end(phiblock))

If so, that would seem problematic to handle as well.
:frowning:

Hi,

How problematic?

The places I’ve looked at that cared had to build a data structure anyways (the verifier builds two vectors and sorts them). Having them build sets wouldn’t seem problematic.

But I’ve only looked at a couple of places, which is somewhat the point of this email – I’d like to know if we’re doing this for good reasons, and ideally understand those reasons.

Also, IIUC, today deleting an edge from A to B only requires
removeIncomingValue(A) on B’s PHI nodes, but after this change you’ll
have to check if you’re deleting the last edge or not.

Are there many places where this is a feature rather than a bug though?

I didn’t do a survey of every caller of removeIncomingValue, but the 4 or 5 I looked at were all inside of a loop already. Which actually means they would get more efficient with this change, if in some cases needing to keep a set around.

This doesn’t seem too bad to me, but again, maybe I’m looking at the wrong bits of code. Is there specific code you have in mind that wants to remove one edge (but not one predecessor block)?

Can (2), (3) and (4) be fixed by changing the API instead of the
deeper IR change you’re proposing?

Maybe, but I don’t see easy ways. Ideas?

Hi,

Also, IIUC, today deleting an edge from A to B only requires
removeIncomingValue(A) on B's PHI nodes, but after this change you'll
have to check if you're deleting the last edge or not.

Are there many places where this is a feature rather than a bug though?

I didn't do a survey of every caller of removeIncomingValue, but the 4 or 5
I looked at were all inside of a loop already. Which actually means they
would get more efficient with this change, if in some cases needing to keep
a set around.

This doesn't seem too bad to me, but again, maybe I'm looking at the wrong
bits of code. Is there specific code you have in mind that wants to remove
one edge (but not one predecessor block)?

I think most calls to BasicBlock::removePredecessor would need
auditing. For instance, in llvm::ConstantFoldTerminator when we
transform:

BB:
  br true, A, B

to

BB:
  br A

then we only need to call B->removePredecessor(BB) which, in turn,
only needs to PhiInB->removeIncomingValue(BB). With your
representation we'll have to worry if A == B.

Another example is JumpThreading:

// If the terminator is branching on an undef, we can pick any of the
// successors to branch to. Let GetBestDestForJumpOnUndef decide.
if (isa<UndefValue>(Condition)) {
  unsigned BestSucc = GetBestDestForJumpOnUndef(BB);

  // Fold the branch/switch.
  TerminatorInst *BBTerm = BB->getTerminator();
  for (unsigned i = 0, e = BBTerm->getNumSuccessors(); i != e; ++i) {
    if (i == BestSucc) continue;
    BBTerm->getSuccessor(i)->removePredecessor(BB, true);
  }
  // ..
}

where we'll have to worry about if BestSucc'th successor is the same
as any other successor or not with your representation change.

I didn't look beyond these two.

Can (2), (3) and (4) be fixed by changing the API instead of the
deeper IR change you're proposing?

Maybe, but I don't see easy ways. Ideas?

For (2), will it help if there was a setIncomingValueForBlock(BB, V)
that DTRT with setting (all) the incoming value(s) for BB to V?

For (4) perhaps we could add a removeAllIncomingValuesForBlock helper?

I'm not sure what exactly you meant by (3).

-- Sanjoy

Hi,

Also, IIUC, today deleting an edge from A to B only requires
removeIncomingValue(A) on B’s PHI nodes, but after this change you’ll
have to check if you’re deleting the last edge or not.

Are there many places where this is a feature rather than a bug though?

I didn’t do a survey of every caller of removeIncomingValue, but the 4 or 5
I looked at were all inside of a loop already. Which actually means they
would get more efficient with this change, if in some cases needing to keep
a set around.

This doesn’t seem too bad to me, but again, maybe I’m looking at the wrong
bits of code. Is there specific code you have in mind that wants to remove
one edge (but not one predecessor block)?

I think most calls to BasicBlock::removePredecessor would need
auditing. For instance, in llvm::ConstantFoldTerminator when we
transform:

BB:
br true, A, B

to

BB:
br A

then we only need to call B->removePredecessor(BB) which, in turn,
only needs to PhiInB->removeIncomingValue(BB). With your
representation we’ll have to worry if A == B.

I guess I expect that after B->removePredecessor(BB), BB is not a predecessor any more.

I guess you’re saying that a ‘predecessor’ is really an ‘edge’. Ok, if confusing to me.

Still, wouldn’t we just need to audit the implementation of removePredecessor? It should hide all of this. My real point was that directly manipulating the PHI node seems to be rare.

Can (2), (3) and (4) be fixed by changing the API instead of the
deeper IR change you’re proposing?

Maybe, but I don’t see easy ways. Ideas?

For (2), will it help if there was a setIncomingValueForBlock(BB, V)
that DTRT with setting (all) the incoming value(s) for BB to V?

For (4) perhaps we could add a removeAllIncomingValuesForBlock helper?

These, IMO, turn the PHI incoming values into an inefficient set. ;] We’ll still be reasoning about whether we’re decreasing the count of edges to zero or not, but we’ll be storing an incoming value and a block for each edge instead of just counting the edges that are already there.

Somewhat worse, this duplication is magnified by the number of PHI nodes.

I’m not sure what exactly you meant by (3).

If I want to change the incoming value for one edge, but not for others, I find it confusing that I just find any old incoming value for that edge and remove it. It just seems odd that we keep N copies, but they don’t have any semantic beyond the count of them being N. We never check which edge we flow in through to pick which of the N copies to use because we insist that they’re all the same.

Anyways, this isn’t causing any immediate pain, I just wanted to understand why we’re using what seems like a very confusing representation to me.

Hi,

I have got experience of low-level manipulations of Phi-nodes whilst I was implementing an aggressive jump threading optimization for switches.
Chandler's proposal will simplify code.

Cases I had:

1. A->B->C, A has multi-edges to B. A is redirected to C.

I had to write the code something like this:

int NumEdgesToB = changeteTerminatorTarget(A->getTerminator(), B, C);
for (phi: all Phi-nodes in C) {
   V = get incoming value to C from B
   for (int i = 0; i < NumEdgesToB; ++i) {
       phi->addIncoming(V, A);
   }
}

I also had to update phi-nodes in B like this:

   for (int i = 0; i < NumEdgesToB; ++i) {
       PhiInB-> removeIncomingValue (B, DontDeletePHIIfEmpty);
   }

2. I needed to analyze all values incoming to B. As there can be multi-edges I had to write the code something like this:

SmallPtrSet<BasicBlock *, 8> SeenBlocks;
for (unsigned I = 0, E = CurPhi->getNumIncomingValues(); I != E; ++I) {
    if (!SeenBlocks.insert(CurPhi->getIncomingBlock(I)).second) {
      continue;
    }
   ...
}

3. I copied A into A'. I needed correct phi-nodes in successors of A. I had to write code like this:

      int VCount = 0;
      for (unsigned Id = 0, IdE = Phi->getNumIncomingValues(); Id != IdE; ++Id) {
        if (Phi->getIncomingBlock(Id) != OriginalBlock)
          continue;
        ++VCount;
      }

      for (; VCount > 0; --VCount)
        Phi->addIncoming(V, BlockCopy);

4. In order not to deal with duplicated incoming values I had to simplify a phi-nodes Def-Use graph by making some copy of it and removing redundant incoming values.

So in all places where we modify/access a phi-node we have to take into account redundant incoming values and preserve this feature.

> So would this lead to a case where PHI->getNumOperands() !=
> std::distance(pred_begin(phiblock), pred_end(phiblock))

I cannot imagine when this kind of code could be needed. When we work with phi-nodes we almost work with DFG they are part of. The code you provided is an attempt of mixing DFG and CFG. I don't this is good.
As Chandler wrote connections among phi-node values and edges are not expressed in any way.

Also, IIUC, today deleting an edge from A to B only requires
removeIncomingValue(A) on B's PHI nodes, but after this change you'll have
to check if you're deleting the last edge or not.

All cases I've seen required to add/remove all multi-edges. It's interesting to see real-life cases when not all edges are changed.

Thanks,
Evgeny Astigeevich

Hi,

I have got experience of low-level manipulations of Phi-nodes whilst I was
implementing an aggressive jump threading optimization for switches.

And i've done it a lot in GVN and NewGVN

Chandler's proposal will simplify code.

With no offense to him, i strongly doubt this :slight_smile:

Cases I had:

1. A->B->C, A has multi-edges to B. A is redirected to C.

I had to write the code something like this:

int NumEdgesToB = changeteTerminatorTarget(A->getTerminator(), B, C);
for (phi: all Phi-nodes in C) {
   V = get incoming value to C from B
   for (int i = 0; i < NumEdgesToB; ++i) {
       phi->addIncoming(V, A);
   }
}

Yup.

I also had to update phi-nodes in B like this:

   for (int i = 0; i < NumEdgesToB; ++i) {
       PhiInB-> removeIncomingValue (B, DontDeletePHIIfEmpty);
   }

Yup

This seems perfectly normal to me, and you would need it regardless of what
happens to switch statements :slight_smile:

br i1 undef, bb2, bb2
etc

2. I needed to analyze all values incoming to B. As there can be
multi-edges I had to write the code something like this:

SmallPtrSet<BasicBlock *, 8> SeenBlocks;
for (unsigned I = 0, E = CurPhi->getNumIncomingValues(); I != E; ++I) {
    if (!SeenBlocks.insert(CurPhi->getIncomingBlock(I)).second) {
      continue;
    }
   ...
}

Errr, this is not safe in general

You actually *must* know whether there are single or multiple edges into a
block to know whether it is safe to propagate a value.

4. In order not to deal with duplicated incoming values I had to simplify
a phi-nodes Def-Use graph by making some copy of it and removing redundant
incoming values.

I'd love to understand why.
You actually need to deal with them anyway.

The only thing chandler's representation saves is space at the cost of time.
The analysis code should generally have to do precisely the same thing it
did before.

IE given:

switch (a)
{
  case 32: br foo
  case 64: br foo

}

Right now you would end up with:

phi({a, 1}, {a, 1})

You would know that, looking at this phi, it is unsafe to propagate into
the value.

If it is was phi({a, 1}), you cannot tell this, you'd also have to "mix the
dfg and cfg" as you say, to tell that there are multiple edges.

So in all places where we modify/access a phi-node we have to take into
account redundant incoming values and preserve this feature.

Yes, and you will still have to do much the same analysis, just you'll end
up doing it forwards from the switch, instead of also being able to do it
backwards from the phi.

> > So would this lead to a case where PHI->getNumOperands() !=
> > std::distance(pred_begin(phiblock), pred_end(phiblock))

I cannot imagine when this kind of code could be needed.

As i said, you often need to know how many edges exist into a block to know
whether it is safe to propagate along a path.

When we work with phi-nodes we almost work with DFG they are part of. The
code you provided is an attempt of mixing DFG and CFG.

Except that PHI nodes exist precisely for this purpose?

I don't this is good.

I'm going to strongly disagree.

As Chandler wrote connections among phi-node values and edges are not

expressed in any way.

???

Sure it is.

There is *always* a correspondence between phi nodes and incoming edges.
That is why phi nodes exist.
To be able to say "when i came from this edge, i have this value".

If you mean "the operands of the switch are not uses of the phi", that's
also true in every other compiler too.

This is even solvable if you like, by a bit of indirection, so that there
is a 1:1 mapping from case successor to a unique phi operand.