[RFC] Simplify `RegionBranchOpInterface`: Separate "Successor Inputs" from "Region Successor"

tldr: Simplify the RegionBranchOpInterface by removing support for certain data flow patterns: the property of “being a successor input” no longer depends on the source of a region control flow edge. The proposed change removes expressivity from the op interface.

Skip background sections if you are familiar with the RegionBranchOpInterface.

Background: Notation and Terms

%r = scf.for %iv = ... iter_args(%arg0 = %0) {
  // ...
  scf.yield %1
}

%r and %arg0 are successor inputs. %0 and %1 are successor operands. Successor inputs are SSA values that are simply forwarded from successor operands. Note that both %iv and %arg0 are block arguments of the region, but %iv is not a successor input because its value is determined by the scf.for operation “itself”. (It’s not just forwarded.)

Background: RegionBranchOpInterface

The RegionBranchOpInterface models region control flow and data flow. It has an interface method to query control flow edges together with successor inputs.

void ForOp::getSuccessorRegions(RegionBranchPoint point,
                                SmallVectorImpl<RegionSuccessor> &regions) {
  // Both the operation itself and the region may be branching into the body or
  // back into the operation itself. It is possible for loop not to enter the
  // body.
  regions.push_back(RegionSuccessor(&getRegion(), getRegionIterArgs()));
  //                                              ^^^^^^^^^^^^^^^^^^^
  //                                               successor inputs
  regions.push_back(RegionSuccessor(getOperation(), getResults()));
  //                                ^^^^^^^^^^^^^^  ^^^^^^^^^^^^
  //                                 "parent"       successor inputs
}

A “region branch point” is the source of a control flow edge. A region successor is the destination of a control flow edge. There are special “parent” source/destination, which indicate branching into / out of the op.

                                        CF edge
region branch point (sucessor operands)  ---->  region successor (successor inputs)

Overgeneralized Design

The RegionBranchOpInterface may be too general/complex and allow for cases that are not needed in the real world. This is presumably one of the reasons that has lead to incorrect usage of the interface in at least two places in core MLIR: dataflow analysis framework and SliceWalk.cpp. There are likely more such places.

Consider this hypothetical operation with two regions:

// if-like operation with 2 results but a single yielded value per region.
%r0, %r1 = "my_dialect.region_op"(%condition) (
{  // regionA
^bb0:
  "my_dialect.yield"(%x) : (i32) -> ()  // yield A
}, {  // regionB
^bb1:
  "my_dialect.yield"(%y) : (i32) -> ()  // yield B
}) : (i1) -> (i32, i32)

And the following region branching behavior:

void MyDialect::RegionOp::getSuccessorRegions(RegionBranchPoint point,
                                              SmallVectorImpl<RegionSuccessor> &regions) {
  if (point.isParent()) {
    // When entering the operation for the first time, we dispatch to regionA
    // or regionB based on the value of %condition.
    regions.push_back(RegionSuccessor(&getRegionA()));
    regions.push_back(RegionSuccessor(&getRegionB()));
  } else if (point.getTerminatorPredecessorOrNull() == getYieldA()) {
    // When leaving from region A, we leave the entire region op and the yielded
    // value is forwarded as %r0.
    regions.push_back(RegionSuccessor(&getOperation(), getResult(0)));
  } else if (point.getTerminatorPredecessorOrNull() == getYieldB()) {
    // When leaving from region B, we leave the entire region op and the yielded
    // value is forwarded as %r1.
    regions.push_back(RegionSuccessor(&getOperation(), getResult(1)));
  }
}

Note that either %r0 or %r1 are successor inputs, depending on the region from where we are coming. When coming from region A, %r0 is the forwarded value. When coming from region B, %r1 is the forwarded value.

Is this useful? Do we need such a general RegionBranchOpInterface design?

Proposal: Separate Successor Inputs from Region Branch Point

To simplify the design, I propose to separate successor inputs from region branch points. I.e., a block argument / op result is a successor input regardless of where the branch originated from. Data flow such as the one shown above in "my_dialect.region_op" would no longer be expressible with the interface.

The proposed change is a breaking change: if somebody is relying on this functionality, they would no longer be able to use the RegionBranchOpInterface and potentially analyses/transformations that build on top of it such as the MLIR dataflow analysis framework.

If you have such a use case: please reply to this RFC!

Implementation Sketch

  1. Remove ValueRange inputs; from class RegionSuccessor. We could even switch to using RegionSuccessor = Region *;, where nullptr indicates a branch to the “parent” (leaving the op).
  2. Add a new interface method RegionBranchOpInterface::getSuccessorInputs(RegionSuccessor) that returns a SmallVector<Value> with all successor inputs. Note that it does not matter what the region branch point was.
  3. RegionBranchOpInterface::getSuccessorRegions implementations no longer specify successor inputs. They just specify the target regions.
2 Likes

Yours may be an artificial example, but if it came from a real-world case, there’s a way to bypass this and still have the new property you propose: to use the condition as to which value will be chosen, which also helps users of this op to know which to use.

For your "my_dialect.region_op"(%condition) operation, users have no way of knowing which value was forwarded, and there’s no value for the other one (not even poison). I’d consider such an operation design “broken”.

An alternative would be:

// if-like operation with 2 results but a single yielded value per region.
%idx, %r0, %r1, ... = "my_dialect.region_op"(%condition, ...) (
{  // regionA
^bb0:
  "my_dialect.yield"(%x) : (i32) -> ()  // yield 0 into %idx, A into %r0, poison into %r1
}, {  // regionB
^bb1:
  "my_dialect.yield"(%y) : (i32) -> ()  // yield 1 into %idx, poison into %r0, B into %r1
}, {  // regionC
  ...
}) : (i1) -> (i32, i32)

You can then use a select-like to access the right value, even in a vectorized flow. For instance, if those values are tensors, the %idx tensor could act as a predicates.

No, this one is completely made up. I couldn’t think of any real-world use case, but I’m going to do a pass over the entire MLIR code base to ensure that this pattern does not show up anywhere.

That’s a good observation. Basically, users can change their op design and introduce a second op that “selects” the results in a certain order. This mean that while the RegionBranchOpInterface will loose some expressivity by the proposed design change, users can still achieve the same semantics by introducing a second operation.

You can query that information from the "my_dialect.yield" terminator: RegionBranchTerminatorOpInterface::getSuccessorOperands. This function will return %x for the A terminator and %y for the B terminator.

That’s not necessarily true. The value for the “other one” could be defined by the operation semantics. (Note: Not all op results must be forwarded values.) E.g., the documentation of "my_dialect.region_op" could say something along the lines of:

If the condition is "true", the A region is chosen. In that case, the yielded value
is forwarded to the first op result and the second op result is 42.

If the condition is "false", the B region is chosen. In that case, the yielded value
is forwarded to the second op result and the first op result is 43.

Just because an op result is not a forwarded value (i.e., “successor input”), it does not necessarily mean that it’s value is unspecified. The semantic meaning / value of non-successor-inputs must be specified by the operation itself. Just as for any non-region op: the documentation must define what the semantic meaning of every op result is.

1 Like

Right, the op may specify the other forwarded input is poison or some default value.

From the user point of view, they can still re-evaluate the condition, if the semantics is clear:

  // Producer
  %a, %b = myselect(%cond)
  {
    ^bb0:
      yield %A
  }, {
    ^bb1:
      yield %B
  }

  // User
  if (%cond)
  {
    ^bb0:
      %val = load(%a)
      ...
  }. {
    ^bb1:
      %val = load(%b)
      ...
  }

and the op can still specify what happens if the load(%) operations above are swapped: default / undefined behaviour, trap or something else.

Now, for the replacement proposed, the code doesn’t change much:

  // Producer (sets %first = %cond)
  %first, %a, %b = myselect(%cond)
  {
    ^bb0:
      yield %A
  }, {
    ^bb1:
      yield %B
  }

  // User
  if (%first)
  {
    ^bb0:
      %val = load(%a)
      ...
  }. {
    ^bb1:
      %val = load(%b)
      ...
  }
1 Like

Thanks for the RFC!

I.e., a block argument / op result is a successor input regardless of where the branch originated from.

Do I understand correctly that is equivalent to as if today one were to always specify that ALL block arguments of a region are the successor inputs?

The one op that comes to my mind where this would be incorrect is scf.for, since it has a leading block argument for the induction variable. I.e. it has 1 + |inputs| many block arguments where the first is internally produced by the op and the others are yielded. I believe this was probably the motivating example for the design.

I think ideally we’d have some higher level abstraction (function or a class) that’d be hard/impossible to misuse that’d prevent the errors you described. These kind of user errors are more indicative of a suboptimal API IMO

No, there can still be block arguments that are non-successor-inputs. However, the property of “being a successor input” does not longer depend on the region branch point.

Right, this will still work as usual. Nothing changes here.

However, the following (non-sensical) design would no longer be possible:

  • When entering the scf.for region for the first time, the IV will be first bb arg.
  • When entering the scf.for region for the second, etc. time (coming from scf.yield), the IV will be last bb arg.
1 Like

I see thank you! So its merely about making the concept of successor input the exact same, regardless of the predecessor. I noticed now the second API you added.

Thank you! This makes sense to me.

1 Like

I prototyped this RFC in these two commits:

I found a few places where users of the RegionBranchOpInterface incorrectly constructed a RegionSuccessor with a range of results/block arguments. The previous implementation was incorrect because you cannot the successor inputs unless you query the RegionBranchOpInterface; some call sites simply assumed that all results / block arguments were successor inputs. (This incorrect implementation is likely harmless in practice.)

Somewhat related, this TODO is now dropped, which hinted at a design problem with the current RegionSuccessor API:

  /// Initialize a successor that branches to another region of the parent
  /// operation.
  /// TODO: the default value for the regionInputs is somehow broken.
  /// A region successor should have its input correctly set.
  RegionSuccessor(Region *region, Block::BlockArgListType regionInputs = {})
      : successor(region), inputs(regionInputs) {
    assert(region && "Region must not be null");
  }

I agree that region successors should be a property of the point the control flow is branching to, not of a combination of the from and to points. I checked the commit history on the file and this feature does not seem intentional.