[RFC] OperationInstancesInterface (or any better name?)

This came up in https://reviews.llvm.org/D90716 (@ftynse), for automatic reference counting it is required to know how many instances of an operation will be at runtime (how many times operation will be executed).

Example #1: operation has exactly one instance at runtime (== known statically)

func @fn0() {
  "operation"(): () -> ()
  return
}

Example #2: operation has dynamic number if instances (== not known statically)

func @fn1(%cond : i1) {
  conv_br %cond ^bb1 ^bb2
^bb1:
  "operation"(): () -> ()
  br ^ret
^bb2:
  ...
  br ^ret
^ret:
  return
}

I propose to add OperationInstancesInterface interface to ControlFlowInterfaces with this API:

/// Returns the statically known number of instances for the operation
/// `op` or returns None if it is not know statically.
///
/// Operation `op` must be in one of the regions owned by this operation.
Optional<unsigned> getStaticallyKnownNumberOfInstances(Operation* op) {
  assert(op.getParentOp() == this);
  ....
}

Every implementation will have to check that block that contains op is dominated by the entry block.

Any suggestions to use different terminology for this interface?

I don’t really understand what this interface implementation would look like for a variety of operation right now, can you explain a bit more?

I’m a bit puzzled by the scope at which it applies (is this only in the context of the current region? The current isolated region?) and what does the returned value semantics is exactly?

I found the code in https://reviews.llvm.org/D90716 ; as I understand it, it seems like you’re expecting this interface to return the number of time the region it would hold would be executed?
It seems like it can’t handle multiple regions, and it isn’t clear to me how it’d handle a CFG as well.

The property you are looking for sounds like a combination of a property on a RegionBranchOpInterface op combined with a general CFG analysis that is independent of any particular op.

That is, I think this property is best phrased as “number of times a region is executed” and “how many times a block within a region is executed”.

I think for the common case, you can just use RegionBranchOpInterface, build the region successor graph, and then return numbers based on that. For ops with a single region that just gets executed once, the analysis degenerates to a very simple check, and we can leave a TODO for a more elaborate analysis (e.g. perhaps it is useful to know that for scf.if, the ‘then’ and ‘else’ regions execute exactly 0 or 1 times, though we don’t know which; in general, for an acyclic region successor graph, it should be possible to calculate a fixed set of possible execution counts).

“how many times a block within a region is executed” is exactly what I’m interested in (all ops in the block are executed).

Yes, not in generic way, without knowing the semantics of the op it self. For imaginary op:

scf.for_loop_with_epilogue %i = ... {
 ... loop body ...
} and_then {
 "operation"()
}

Only the scf.for_loop_with_epilogue can tell that “operation” (or block that contains that op) will be executed exacly once.

If the CFG is sequential (single statically known path to the block from the entry block) without conditional branches or loops, then it’s easy, if it has conditional branches then the number of times block is executed is generatlly unknown (though can be inferred if conditions/loop iterations are known statically).

Because semantics of the region defined by the op implementation, I’m not sure that it is possible to answer this question without some help from the op. With scf.for as an example, it is possible to compute number of executions if the loop bounds are constants and loop body does not have dynamic control flow, but in general this number is unknown.

There are two aspects here:

  1. How many times a block is executed in a particular CFG Region: this is independent from the enclosing op I believe and fairly generic in terms of CFG analysis (it gets complicated with custom terminators though).
  2. How many times the region is invoked by the enclosing op.

Yes, good point.

Should the #1 and #2 be separate API calls? Or #1 is an implementation detail of the interface and the only public API is NumberOfExecutionsInterface::getNumberOfExecutions(Operation* op).

If #1 is an implementation detail, then each individual op can pass some additional information about custom terminators semantics in the attached regions to the generic CFG analysis pass.

In terms of interface, I suspect that the query should likely be for an operation and a region: NumberOfExecutionsInterface::getNumberOfExecutions(Operation *, int region) or something like this.

I’m a bit concerned though about the async refcount at the moment: that restricts the applicability significantly if we’re limited for correctness to programs where the instance count is entirely static.

Dynamic number of executions is also handled in async ref counting, it’s just a bit more complex:

  1. The first (inner most) block with dynamic number of executions does its own add_ref with count=$knownStaticExecutions
  2. Dynamic op successor at the same level as async value producer has explicit drop_ref operation.

There are few examples in async-ref-counting.mlir test.

I don’t think we even need to have a new interface. RegionBranchOpInterface can just have an optional method void getNumRegionInvocations(ArrayRef<Attribute> operands, SmallVectorImpl<int64_t> &oneCountPerRegion). The reason for a batched interface is that in the multi-region case, there is likely some analysis shared across the regions. A region with unknown count returns a count of -1.

The ArrayRef gives attributes corresponding to constant operands (just like getSuccessorRegions does, which allows constant cases to be handled with more precision, example of scf.if which gives only one possible successor in the constant case)

https://reviews.llvm.org/D90922 implements this approach + adds NumerOfExecutions analysis that does some trivial CFG analysis, basically it rejects all CFGs with conditional branches or loops.

Just for more context, I needed a similar feature for a custom mem2reg pass.

cc @wsmoses