Dynamic control flow (break-like operation)

mscuttari · December 30, 2020, 12:38pm

I’m trying to implement a “break”-like operation for SCF’s while loops.
First of all: is this a feature that is already in plan to be developed? I only found the following statement in the source code, but it’s not clear if it is just a reminder or an effective planned feature:

// Replace terminators with branches. Assuming bodies are SESE, which holds
// given only the patterns from this file, we only need to look at the last
// block. This should be reconsidered if we allow break/continue in SCF.

Assuming that’s not in current plans I’ve tried to implement it by myself, but I’m facing some design issues upon which some more experienced people may give some advices.
Let’s say we have a scf::WhileOp with a myDialect::BreakOp arbitrarily nested inside one of the blocks of the while body region. When I lower the SCF dialect to the Standard one, I lose the information about the existence of the loop itself, and when it comes to lower the break operation I’ve no clue of where to jump to.

module  {
  func @main() -> () {
    br ^bb1
  ^bb1:  // 2 preds: ^bb0, ^bb2
    %true = constant true
    cond_br %true, ^bb2, ^bb3
  ^bb2:  // pred: ^bb1
    myDialect.break <-- Should be replaced with a branch to ^bb3 but I can't find a way to determine the destination branch inside my rewrite pattern
    br ^bb1
  ^bb3:  // pred: ^bb1
    return
  }
}

To avoid this issue I was thinking of a priori creating a continuation block when I first encounter a while operation, but this is seems a bit odd, especially if no break statements are then encountered while lowering the while body. Furthermore, I think the problem would persist, because when creating the Break operation I would have to determine which block is the continuation of the while operation.

module  {
  func @main() -> () {
    scf.while : () -> () {
      %true = constant true
      scf.condition(%true)
    } do {
      scf.yield
    }
 br ^bb1
^bb1:
    return
  }
}

mscuttari · January 4, 2021, 9:44am

I made different trials and in the end the winning one was to redefine If, For and While operations inside my dialect. Basically I had to incorporate all the SCF dialect (with some modifications) within mine.
The reason for this is that the SCF operations expect a Yield operation at the end of their blocks, which is not always the case when introducng break statements. For example, a Break operation can also be placed as last operation (a bit degenerated code, but still it is legit in my language).:

while {
  ...
  scf.condition %cond
} do {
  break;
}

As for where to jump, I solved it by adding a 3rd region in the While op (the same can be done for the For one), which will contain 1 block with just a 1 branch instruction to the continuation block (to be more precise, at first it contains just a Yield operation that is converted in a Branch when splitting the parent block and inlining the operation regions). The inner break statements will then jump to that block. This last information is first determined when creating the Break operation by going up in the operation hierarchy until I found an operation with my BreakableLoop trait, then getting its “3rd region continuation block”, and then storing its pointer within the Break operation itself as successor.

ftynse · January 5, 2021, 10:08am

It says “if we allow” not “when we allow” I’m not aware of anybody working on this. I’ve seen some requests along these lines, but so far, everything could be instead transformed into a while containing an if.

In general, I would claim that having break as a non-terminator violates SSACFG region convention. (Well, unless one defines break as some indication for the actual terminator where to branch, but that’s just weird semantically). Assuming it should be a terminator, you cannot just put it into a random dialect without letting SCF ops know about its semantics. So, as you discovered, you’d need to either modify SCF ops to be aware of new terminators or duplicate them.

I am not sure I could follow your design with traits and a third region, it looks complicated without an example. Be aware that MLIR semantics does not allow one to branch directly to a block in a different region or branch to the entry block of the region (it seems the break you propose requires both), although there are ways to make it look as if it was allowed.

IMO, if we do want to support break/continue, we first need to relax the single-block body requirement in all loops. Then, each block can be potentially terminated with an scf.yield, which becomes an equivalent of continue. At this point, we can also consider an scf.break as an additional block terminator or repurpose scf.condition. This may have an effect on some transformations that expect single-block loops or that scf.for has a computable number of iterations, and requires careful consideration.

clattner · January 5, 2021, 4:47pm

Right. MLIR in general supports modeling AST-like constructs and break like things, but it isn’t clear to me that scf.while does. SSA (bbargs etc) needs to know CFG edges, and a break operation won’t provide that.

I’d recommend defining a new while loop op for your dialect that is intended to be used in an AST-like way.

mscuttari · January 8, 2021, 3:22pm

I try to explain a bit better what I did, so that you can spot any bad practice.
My WhileOp defines 3 regions, differently from the 2 defined by SCF’s WhileOp. The new one is supposed to contain a single block (I will call it “exit block”), and inside this block there is just a branch that will point to the block the SCF’s while would normally continue when the condition is false (the “continuation block”).
While processing my frontend while body, if I encounter a BreakOp I start walking up in the operations tree until I find an operation marked with my BreakableLoop interface (I used Traits at first, but I recently modified it to be an interface), which gives information about where that “exit block”. My WhileOp is indeed marked with this BreakableLoop interface.
The 3 regions are then inlined in the same exact way it is done when converting the SCF’s ops to the standard dialect, and in the end there is no branch to other region’s blocks.
I agree on the problem of single static assignment, also pointed out by @clattner, but in my case this is trivial because I only work with loads / stores, and thus I don’t allow any result coming out from the loop. By the way, @clattner, did you mean this with “AST-like way”?

Forgot the example. In the end, the IR results in something like this:

module  {
  func @main() -> () {
    br ^bb1
  ^bb1: // "before" block
    ...
    cond_br %condition, ^bb2, ^bb3
  ^bb2:  // "after" block
    // while body here.
    // if any break is encountered, it will be lowered to branch to ^bb3
    br ^bb1
  ^bb3:  // "exit" block
    br ^bb4
  ^bb4:
    return
  }
}

ftynse · January 11, 2021, 2:59pm

I still would like to see an example of the operation rather than the IR it lowers to. The latter is trivial, the former is not.

One issue I see with this is that is assumes there is some continuation block that can be branched to. Normally, when using SCF, one doesn’t see blocks at all.

Another potential issue (and I cannot confirm if it is the case or not without seeing an example) is branching to a block defined in another region. This is invalid. I realize that multiple rewritings of the spec made it unclear, but a terminator is allowed to transfer the control flow either to another block in the same region or to the parent operation of the region. The parent op may transfer the control to some other block in the same region as it is located itself, but it must be a terminator for this.

mscuttari · January 14, 2021, 8:43am

While operation:

void WhileOp::build(mlir::OpBuilder& builder, mlir::OperationState& state)
{
	auto insertionPoint = builder.saveInsertionPoint();

	// Condition block
	builder.createBlock(state.addRegion());

	// Body block
	builder.createBlock(state.addRegion());

	// Exit block (for break operation)
	builder.createBlock(state.addRegion());
	builder.create<YieldOp>(state.location);

	builder.restoreInsertionPoint(insertionPoint);
}

mlir::Region& WhileOp::condition()
{
	return getOperation()->getRegion(0);
}

mlir::Region& WhileOp::body()
{
	return getOperation()->getRegion(1);
}

mlir::Region& WhileOp::exit()
{
	return getOperation()->getRegion(2);
}

Break operation creation (inside WhileOp’s body region):

WhileOp whileOp = ... retrieve parent op by navigating hierarchy tree ...
builder.create<BreakOp>(location, &op.exit().front());

Then, when lowering, condition, body and exit’s blocks are all inlined in the WhileOp’s parent region (the same as in SCF dialect lowering)

ftynse · January 14, 2021, 3:32pm

I am sorry if I am not making it clear enough, but by an example of operation I mean how the IR of this operation looks like, as if you were writing documentation with examples for it. What you posted is code for constructing it. It is easy for me to visualize the code necessary to construct such operation or to convert it to LLVM given the operation itself.

Here’s what I want using SCF “while”, taken from the dialect documentation:

%res = scf.while (%arg1 = %init1) : (f32) -> f32 {
  /* "Before" region.
   * In a "while" loop, this region computes the condition. */
  %condition = call @evaluate_condition(%arg1) : (f32) -> i1

  /* Forward the argument (as result or "after" region argument). */
  scf.condition(%condition) %arg1 : f32

} do {
^bb0(%arg2: f32):
  /* "After region.
   * In a "while" loop, this region is the loop body. */
  %next = call @payload(%arg2) : (f32) -> f32

  /* Forward the new value to the "before" region.
   * The operand types must match the types of the `scf.while` operands. */
  scf.yield %next : f32
}

mscuttari · January 14, 2021, 4:10pm

Oh sorry, here it is. I thought the IR itself could not be enough having the possibility to define a custom print method.

mydialect.while {
  %condition = ...
  mydialect.condition %condition
} do {
  %condition = ...

  mydialect.if %condition {
    mydialect.break
  }

  mydialect.yield
} exit {
  mydialect.yield
}

Feel free to ask for more if not clear enough, being new in LLVM & MLIR I may sometimes not understand correctly

ftynse · January 14, 2021, 4:38pm

Custom syntax is actually quite constrained so most things will be obvious. Other things can be explained using comments.

As I suspected, your approach does not play well with the general semantics of MLIR.

mydialect.while {
  %condition = ...
  mydialect.condition %condition
} do {
  %condition = ...

  mydialect.if %condition {
    // The operation below cannot break out of the "while", it can
    // only transfer the control flow to its immediate ancestor, "if".
    mydialect.break
  }
  // The "if" operation above not being a terminator of its block,
  // it is not allowed to transfer control flow at all. Any general pass
  // will legitimately assume the operation below executes regardless
  // of "break" being executed before.
  call @foo() : () -> ()  // this is _always_ called

  mydialect.yield
} exit {
  // Independently of everything above, I don't see the point of having
  // this region. It seems to only serve as an anchor for not creating
  // a block when lowering this to CFG. One can just split the block
  // containing the "while" and branch to the second new block which
  // contains the rest of the operations.
  //
  // An extra block can be useful in Python-style "for ... else" where
  // it is executed iff the loop did not break out.
  mydialect.yield
}

See MLIR Language Reference - MLIR

Terminator operations without successors can only pass control back to the containing operation.

I’ll update the wording to make sure it says the “first containing” operation, which got lost in rewrites.

mscuttari · January 14, 2021, 4:56pm

Break actually has a successor, and it is the block of the exit region. I forgot to print it, I’m sorry:

mydialect.while {
  %condition = ...
  mydialect.condition %condition
} do {
  %condition = ...

  mydialect.if %condition {
    mydialect.break ^bb0 // bb0 is the (only) block of the exit region
  }

  mydialect.yield
} exit {
  mydialect.yield
}

The existence of the exit block is to have an anchor for the break operation, as you mentioned. When I encounter a break operation to be lowered, I determine its jump destination in this way:

mlir::Operation* loop = builder.getInsertionBlock()->getParentOp();

while (!loop->hasTrait<BreakableLoop::Trait>())
	loop = loop->getParentOp();

auto location = loc(statement.getLocation());
auto op = dyn_cast<BreakableLoop>(loop);
builder.create<BreakOp>(location, &op.exit().front());

BreakableLoop is an interface of WhileOp

ftynse · January 14, 2021, 5:32pm

One is not allowed to branch to the entry block of the region (MLIR Language Reference - MLIR)

The entry block cannot be listed as a successor of any other block

Furthermore, successors of a block must be within the same region.

Regions provide hierarchical encapsulation of programs: it is impossible to reference, i.e. branch to, a block which is not in the same region as the source of the reference, i.e. a terminator operation.

mscuttari · January 14, 2021, 6:17pm

Ok, I get it.

I was rethinking about this.
Let’s say that the loop body contains an IfOp, and inside the IfOp’s body there’s the break condition (as in my last example). Let’s also suppose that the exit block is created on purpose as last block in the loop body region, so that it’s not the entry block of a region. I can’t understand how the break operation can have some knowledge about where the parent loop finishes without telling it to have that special block as successor, which although is illegal being it outside the IfOp body region.

EDIT: I did see later your message change. So it seems like an impossible objective to reach, isn’t it?

ftynse · January 14, 2021, 6:43pm

I was thinking along these lines

scf.breakable_while {
  // This op is a terminator, so it is allowed to branch within its region.
  scf.break_if {
    %0 = // compute the condition
    // This terminator returns control to "break_if" together with
    // the associated value.
    scf.condition %0
  // "break_if" being a terminator itself, it can transfer control further.
  // Depending on the associated condition value, it either transfers it
  // to the indicated block, or returns the control to its immediate parent
  // "brekable_while". The latter can then decide whether to execute
  // the next iteration (transfer control flow to the entry block) or
  // stop (transfer control flow to the naturally following operation).
  } else br ^bb1

^bb1:
  // normal block
  scf.break_if {
    %1 = // compute another condition
    scf.condition %1
  } else br ^bb2

^bb2:
  // just keep looping
  scf.yield
}

mscuttari · January 14, 2021, 7:43pm

Sorry if I bother you, but although I understand the principle of your code, I find it very restricting in term of usage of a break-like operation. That would be good to go only if the break is inside an If and furthermore as its only body operation. But apart from this, which maybe can be solved, it would not also allow arbitrary nesting in a C-like style.
Somethink just like this, when converted to MLIR, wouldn’t be satisfiable (or at least I think so, maybe I’m wrong):

// C-like code:
while (condition) {
    if (x > 0)
        if (y > 1) {
           x = y;
           break;
        }
}

I know that it is forbidden to give to an operation a successor that is outside its region, but right after the ML IR generation, my dialect would be converted to the standard one and, because of regions inlining, the branch would then refer to a block that will be in the same region, thus making the IR legal again. Is this a real issue? Because what I need to achieve is an arbitrary nested break like this last example, and I can’t afford to restrict the language capabilities to special subcases (such as a “break if” scenario).

ftynse · January 15, 2021, 11:11am

We design IRs so that they are easy for the compiler to reason about, being easy for a human to write is a secondary concern. FWIW, this is why we have IRs in the first place: it’s hard for an optimizer to reason about C.

On a less meta point, in MLIR, we can have control flow either structured with regions or as a CFG. The naive way of implementing “break” is doing “goto label-after-the-loop”. This is totally fine if the IR is already a CFG, but it makes life significantly harder if it is mixed with regions. Now one can no longer consider a region in isolation but rather needs to keep track of all goto targets. It is totally possible using common CFG approaches, but at this point we will be just adding unnecessary complexity of dealing with regions to those approaches. I see very little value in mixing the two (although I can be convinced otherwise if there are transformations we can do on mixed form that we wouldn’t be able to do otherwise).
Furthermore, MLIR needs to reason about opaque, undefined operations with regions and control flow so we need at least some general rules on how the control is transferred that we encoded in regions.

You can express that perfectly well in numerous ways. Specific choices depend on what you need to do with it. Chris was mentioning an AST-like approach where you can literally have

ast.while (%condition1) {
  ast.if (%condition2) {
    ast.if (%condition3) {
      ast.break
      ast.implicit_terminator
    }
    ast.implicit_terminator
  }
  ast.implicit_terminator
}

as long as you don’t need MLIR to understand the control flow. In this case, the semantics of the “ast” dialect are to declare that an AST node exists, operations are executed in normal sequential order (there’s actually only one control flow path through the declaration) and as a result you get a declaration of the AST. This AST is transformable and can also be “code generated” through dialect conversions to operations that have understandable-to-MLIR control flow. Which operations specifically depends again on what you want to do with them.

For all practical purposes, one can rewrite the following (I discarded your example because loop exit conditions can be trivially folded into “while”)

while (cond1) {
  f1();
  if (cond2) {
    f2();
    if (cond3) {
      f3();
      break;
    }
    f4();
  }
  f5();
}

into

bool mustBreak = false;
while (cond1 && !mustBreak) {
  f1();
  if (cond2) {
    f2();
    if (cond3) {
      f3();
      mustBreak = true;
    }
    if (!mustBreak) f4();
  }
  if (!mustBreak) f5();
}

by predicating every statement potentially executed after at least one break on the absence of break. This can be, for example, a rewrite on the hypothetical AST dialect. It can be then directly represented with SCF through memory:

%true = constant true
%false = constant false
%c0 = constant 0 : index
%mustBreak = alloca : memref<1xi1>
store %false, %mustBreak[%c0] : memref<1xi1>
scf.while {
  %cond1 = ...
  %break = load %mustBreak[c0] : memref<1xi1>
  %stop = and %cond1, %false : i1
  scf.condition %stop
} do {
  call @f1()
  %cond2 = ...
  %outer = scf.if (%cond2) {
    call @f2()
    %cond3 = ...
    scf.if (%cond3) {
      call @f3()
      store %true, %mustBreak[%c0]
    }

    %break = load %mustBreak[%c0] : memref<1xi1>
    scf.if (%break) {
    } else {
      call @f4()
    }
  }
  %break = load %mustBreak[%c0] : memref<1xi1>
  scf.if (%break) {
  } else {
    call @f5()
  }
}

which can be further rewritten to use loop-carried values with some variant of mem2reg:

%true = constant true
%false = constant false
scf.while (%mustBreak = %false : i1) {
  %cond1 = ...
  %stop = and %cond1, %false : i1
  scf.condition %stop, %mustBreak : i1
} do {
^bb0(%arg0: i1):
  call @f1()
  %cond2 = ...
  %outer = scf.if (%cond2) -> i1 {
    call @f2():
    %cond3 = ...
    %inner = scf.if (%cond3) -> i1 {
      call @f3()
      scf.yield %true : i1
    } else {
      scf.yield %false: i1
    }
    scf.if (%inner) {
    } else {
      call @f4()
    }
    scf.yield %inner : i1
  } else {
    scf.yield %false : i1
  }
  scf.if (%outer) {
  } else {
    call @f5()
  }
  scf.yield %outer
}

This doesn’t break our SCF analyses and transformations, that rely on there being single block and static control flow, but gives you the desired expressivity on another level. You can also go to CFG early by defining appropriate operations, or extend my previous sketch by making break_if also “breakable” in a way that it communicate the break request up to the first loop.

mscuttari · January 15, 2021, 12:30pm

Again thanks for your answers, very appreciated.

ftynse:

Chris was mentioning an AST-like approach where you can literally have
ast.while (%condition1) {
  ast.if (%condition2) {
    ast.if (%condition3) {
      ast.break
      ast.impllicit_terminator
    }
    ast.impllicit_terminator
  }
  ast.impllicit_terminator
}
as long as you don’t need MLIR to understand the control flow. In this case, the semantics of the “ast” dialect are to declare that an AST node exists, operations are executed in normal sequential order (there’s actually only one control flow path through the declaration) and as a result you get a declaration of the AST. This AST is transformable and can also be “code generated” through dialect conversions to operations that have undestandable-to-MLIR control flow. Which operations specifically depends again on what you want to do with them.

Just a small and stupid question: for “AST-like” do you mean “1 region only”? Because otherwise I don’t see how that would differ from the WhileOp I defined.

As for the rest I’ve understood the point and maybe it’s better to first elaborate the AST I get from the source, in order to remove break statements, and then rely on the already defined SCF dialect rather than redefining it by myself.

ftynse · January 15, 2021, 3:50pm

I mean the semantics of the operations in this dialect is to define an AST without relying on and opaquely to MLIR control flow representation. (One can imagine actual MLIR control flow around the AST-defining ops to define a larger AST programmatically, in a sort of multi-staged programming mode).

Your approach was to extend or build on SCF, which is exactly about MLIR-visible control flow.

In general, we advise users to enter MLIR as early as possible in their flow. For the choice of intermediate abstractions, let me reiterate: it depends on what you want to achieve. If you just need to go from whatever higher-level abstraction down to LLVM IR, weigh the trade-offs between implementing the conversions necessary to target SCF so as to leverage conversion to CFG and implementing the direct conversion to CFG and/or defining another intermediate abstraction. SCF gives you some transformations and the ability to target certain other dialects (GPU, OpenMP if you have parallel operations), so this may tip the scale towards using if relevant.

mscuttari · January 15, 2021, 5:00pm

So the ast.break operation should have no successor attached to it? Is this the difference with respect to my solution?

Maybe I was unclear, but the lowering of myDialect.WhileOp to the standard dialect happens with my own defined rewrite pattern and does not lower to SCF dialect. To do that I took inspiration from how the SCF dialect and ScfToStd conversion patterns work, but I have my independent conversion pattern that knows nothing about what SCF is. To be even more clear, I also have my own myDialect.yield, myDialect.condition, etc. In this sense I can’t understand what more should I do to define an AST-like operation. Is there any special trait to tell MLIR not to look inside its region’s CFGs?

ftynse · January 18, 2021, 12:04pm

None of the operations should have explicit successors, they should not be using MLIR’s control flow modeling. Similarly, I don’t expect them to use things like RegionBranchOpInterface or other related things.

I think the mention of “ast-like” ultimately confused you more than it helped… There is no special support for anything “ast-like”, nor is it necessary at this point. The important part is the semantics of the operation, that is, what does it mean for a program to reach such an operation or, if we were to write an interpreter for this dialect, what would it do when encountering such an operation. In my approach, the semantics is to just take note that a “while” construct exists and move on, there’s no attempt to execute it, yet. In SCF, the semantics is to actually iterate. It is unclear what is yours and since it seems to be derived from SCF, a natural assumption that it is also to actually iterate.

Topic		Replies	Views
SCF dialect vs CFG MLIR	1	1300	May 25, 2021
[RFC] Add scf.while to scf dialect MLIR	25	1740	November 6, 2020
[RFC] Region-based control-flow with early exits in MLIR MLIR	28	1467	October 24, 2024
'scf.while' op expects region #0 to have 0 or 1 blocks MLIR clang , mlir	1	120	May 28, 2024
Patch: Implement break and continue Clang Frontend	3	83	July 16, 2007

Dynamic control flow (break-like operation)

Related topics