Liveness broken with regions?

Hi all,

I have been looking at MLIR’s Liveness analysis and it seems it is broken w.r.t. operations with attached regions, because it does not consider the implicit control flow between the operation and its regions and between regions . As an example, in the test case (llvm-project/test-liveness.mlir at d480f968ad8b56d3ee4a6b6df5532d485b0ad01e · llvm/llvm-project · GitHub), I’d have assumed that %1 would be in the live out set for the block attached to scf.for, but its not (the test check that that blocks live-out set is empty).

I can think of the following cases that are probably being missed:

  • Any passthrough values that are live across an operation should be in the live in/out set of all nested blocks attached to that operation.
  • If an attached region references an external value (region capture), that value may be live in other attached regions if those regions might execute before it. As an example, a value captured by an scf.while “after” region should be live in/out of the “before” region since the “before” region can execute before the “after” region.

It seems one way to fix this is to make this control flow explicit and have liveness operate not on MLIR Blocks but “virtual” blocks formed by slicing a Block and then running the liveness analysis on these blocks. Essentially, the virtual block just before the operation will own the operand uses (so its use set will include the operands of the operation) and the virtual block just after will own the results (so its def set will include the results). Additional control flow edges will then be added for control flow from the “before” block to all entry blocks of region successors of the op (as defined by RegionBranchOpInterface) and between region exit block and the entry of their successors etc. For operations with attached regions that do not implement the RegionBranchOpInterface we have to conservatively assume that the all possible region->region, op->region and region->op edges are possible (either explicitly, or coalesce all regions into a single “virtual” block and then amend the in/out set of all nested block post analysis to speed up convergence).

There may be other options which still do analysis on MLIR blocks and then propagate liveness in nested blocks post analysis convergence, but seems such a post analysis patch up will have to be conservative.

MLIR seems to have a data flow analysis framework, but I am not sure Liveness is a good client for it. MLIR data flow analysis seems to propagate per-IR-value lattice values, and only support forward data flow for now. Liveness tracks per-block values and is backward.

Thanks,
Rahul

Ping. Any feedback here? I can start prototyping the virtual block scheme above if folks agree with the general direction.

This is interesting: I wonder if the current liveness even makes sense in terms of nesting: it may just be that liveness can’t really be always computed across regions like that?

My understanding is that regions are intended to be a ‘hard’ abstraction boundary that does not interact directly with the code outside the region, or even with code in other regions, with the primary exception being that values from outside the region can be accessed directly inside the region. As a result it doesn’t seem entirely unintuitive for values outside of the region to not really “flow through” the blocks in the region.

Although I could imagine some scenarios when seeing all the values flowing through a region would be useful. Perhaps this means that the current implementation of liveness is computing something interesting, but which is also different? In the absence of detailed information from RegionBranchOpInterface, perhaps the liveins and liveouts of the op containing a region can simply be implicitly considered to be liveins and liveouts of every block in the region?

1 Like

Thanks. I guess it depends on the use case. For operations with IsolatedFromAbove trait I agree that the attached regions form a hard abstraction boundary where the external SSA values cannot be referenced and current liveness analysis which essentially operates within a single control flow graph (i.e., a region) is ok. I was thinking of cases where say that is not the case and we want to build live ranges of SSA values to say assess the fat point (as an example, say we want liveness of memrefs to assess “buffer pressure” to help in scheduling code). It seems with current liveness, the user of liveness analysis has to take that explicitly into account (by potentially drilling down into nested regions).

What would be the liveness of values “passed through” async.execute regions?

  %1 = addi %arg4, %arg5 : i32
  %token = async.execute {
    ... // %1 maybe used in this region, or maybe not
    async.yield
  }
  return %1 : i32

In scf.for case it’s clear that when you reach return op, the nested region “executed”, it’s not true for async regions. Should it impact liveness analysis?

That’s right. Liveness as computed here seems applicable only to sequential execution (so within a CFG/Region but not across). So it seems in general if any sort of cross-region liveness is desired (when meaningful), its the clients responsibility to combine the liveness info across regions into something meaningful for what’s being done.

It feels like the right abstraction here for your problem is indeed a “virtual” block, which falls out naturally from a live interval data structure API like LLVM’s: LLVM: llvm::LiveIntervals Class Reference (this is the data structure used by the register allocator)

Specifically, a LiveRange consists of a set of Segment’s, and segments are subsets of a block. This API also lets you calculate interference between live ranges and other relations betweenthem, which is relevant to your use case, but seems difficult with the current API.

We would need a trait RegionsOnlyExecuteDuringParentOp (or better name) to rule out Eugene’s example for the analysis.

I think liveness as modeled here can only express where an SSA value is live assuming sequential execution semantics. @ezhulenev example is one case where this does not capture actual execution semantics. Another example I could envision is an operation that defines a local lambda with implicit capture and a corresponding call operation. We could not even model this with the current RegionBranchOpInterface and would need special treatment.

Given this flexibility of regions, I am skeptical that it is useful to extend liveness to be nested or at least to make it always nested. A specific analysis that has a constrained model of what regions can do has a much better chance to connect the liveness information across regions when required.

As an example, the buffer lifetime analysis would do this. It also has to take aliasing into account, anyway, to connect the flow of buffers through blocks. At that point, it is not much extra work to implement the nesting of liveness on top of the non-nested analysis.