How to understand the block concept in MLIR?

In addition, control flow may also not reach the end of a block or region, for example if a function call does not return.

Blocks

I don’t quite understand this sentence. Can you give me an example of a function call that does not return?

However, I believe the text isn’t about not returning but about not calling return explicitly.

In imperative programs, a return is always assumed at the end of a function, so:

void func() {
}

Should lower as:

func.func @func() {
  return
}

But the concept of a Block doesn’t have to adhere to such strict rules. You may have a region or a block that does not terminate, and that is ok if the semantics of the block isn’t procedural.

Remember, MLIR can be used for multiple things, not just imperative programming.

Thank you very much for your reply, but I don’t quite understand what you mean.
In my understanding, there may be some special situations where the function is called without returning, such as an infinite loop, but these are all illegal situations.

You said “You may have a region or a block that does not terminate”, but the Blocks section says:

"The last operation in a block must be a terminator operation. A region with a single block may opt out of this requirement by attaching the NoTerminator on the enclosing op. The top-level ModuleOp is an example of such an operation which defines this trait and whose block body does not have a terminator."

I think a block that does not terminate, is an illegal situation.

Not quite. As the documentation describes, if you attach the NoTerminator trait to the enclosing op, the block does not need a terminator. Requiring a terminator is the default because it’s rare to have blocks without one, but such blocks are not illegal (as long as the enclosing op has the trait).

Infinite loops are definitely not illegal. There are a vast number of scenarios where a program would never terminate in its core functionality, for example servers and services that only really quit with a special signal, user command or other asynchronous call.
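To make the documentation's "function call that does not return" case concrete, here is a minimal C sketch. All names (bail_out, worker, run_worker) are hypothetical, and longjmp is just one convenient way to leave a callee without returning; exit() or an infinite loop would have the same effect on the caller's block.

```c
#include <setjmp.h>

/* Hypothetical names, for illustration only. */
jmp_buf recovery_point;
int reached_after_call = 0;

/* Never returns to its caller: control jumps back to the setjmp site. */
static void bail_out(void) {
    longjmp(recovery_point, 1);
}

static void worker(void) {
    bail_out();
    reached_after_call = 1; /* control never reaches the end of this block */
}

/* Returns 1 if control came back via longjmp, 0 if worker returned normally. */
int run_worker(void) {
    reached_after_call = 0;
    if (setjmp(recovery_point) == 0) {
        worker();
        return 0;
    }
    return 1;
}
```

Here worker is perfectly well-formed code, yet the statement after the call to bail_out never executes, which is all the quoted sentence is saying.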

But again, this is not what the trait is for. Lowering C code into IR would most likely introduce a redundant return instruction at the end of the basic block, even if it is never reached, because this is what the semantics of C-like languages usually are: if you get to the end of the lexical block, the function returns.

In the IR, which is the root of your question, the point is to allow lowering non-C-like languages to operations whose semantics do not need returns. In such a language, a return operation at the end of the block would be illegal.

One example is to use MLIR to represent graph-based semantics, not procedural SSA-based semantics. In this case, there is no control flow, and return is a concept that does not exist. Forcing blocks to have one would be wrong in this case.

Another example: imagine a “State based” language (I once created one of those), where the code operates on shared data and there is no argument passing. Control flow is controlled externally, and each “state” reads and writes “global” values. Which state comes next is decided not by the state code, but by an external scheduler looking at that shared data.

In the case of static code generation, you could “inline” all of those state codes, one after the other, into a mega function. If the state code had returns, you’d have to validate that each return goes in the right place, essentially replacing them with gotos and whatnot. So you create a design where control is defined by changing the shared state and at the end of the code, the next state kicks in. No returns necessary.
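A rough C sketch of that design, with every name invented for illustration (this is not the actual language mentioned above): the state bodies only touch shared data and never name a successor, while a separate scheduler picks the next state by looking at that data.

```c
#include <stddef.h>

/* Hypothetical shared data that every state reads and writes. */
typedef struct {
    int initialized;
    int counter;
    int done;
} Shared;

typedef void (*StateFn)(Shared *);

/* State bodies: no arguments beyond the shared data, no result,
   and no mention of which state runs next. */
static void state_init(Shared *s)   { s->initialized = 1; s->counter = 0; }
static void state_count(Shared *s)  { s->counter += 1; }
static void state_finish(Shared *s) { s->done = 1; }

/* External scheduler: chooses the next state only from the shared data. */
static StateFn schedule(const Shared *s) {
    if (!s->initialized) return state_init;
    if (s->done)         return NULL;
    if (s->counter < 3)  return state_count;
    return state_finish;
}

/* Runs states until the scheduler has nothing left; returns the final counter. */
int run_states(void) {
    Shared s = {0, 0, 0};
    StateFn next;
    while ((next = schedule(&s)) != NULL) {
        next(&s);
    }
    return s.counter;
}
```

In this sketch run_states() ends with the counter at 3. In C each state body still has an implicit return, but conceptually nothing in a state says where control goes next, which is why a terminator in the corresponding IR block would carry no meaning.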

I’m sure there are other examples where control flow is not well represented in IR, or not at all necessary or meaningful, and having the ability to not force that is a nice feature of MLIR basic blocks.

As you may have guessed, they’re quite rare in the real world, but they are not illegal.

Here’s an even clearer example: The ModuleOp does not have a terminator because it makes no sense at all to “return” from a module.

Blocks are not just code, they’re structure too, and that’s why it’s necessary to have the NoTerminator trait.

You’re confusing static and dynamic properties here: “has a terminator” is a static property of the IR, whereas when we use the word “terminates”, as in “does this block terminate?”, we are referring to a dynamic property (like “is there an infinite loop?”), for which the link Renato gave on program termination is very relevant.

In my understanding, graph regions do not define CFG semantics, so our discussion of control flow should be limited to SSACFG regions. The ModuleOp, by contrast, only has a graph region.

I’ll think about the “State based” language example you described in detail. Thank you very much for your reply.