I think one place where this idea would have significant applicability is when some aspect of the analysis (either the results or some intermediate data structure) is best represented as MLIR ops.
We have something like this in the shape computation world, where the reified shape calculations need to be associated with the values that they describe the shape of (which we can then either just infer shapes with, or maybe keep the shape calculations around and actually compute them).
There are 3 separate approaches that I’ve been able to come up with there:
The “identity-like” approach
%tensor = ...
%shape = ...
%tensor2 = tie_shape_identity(%tensor, %shape) // "identity-like" op that annotates the value.
// %tensor2 should be used instead of %tensor directly.
This approach has the advantage that it is very simple and allows very ad-hoc / unstructured annotations. The disadvantage is that it gets in the way of use-def chains.
IREE uses this approach when initially materializing shape calculations.
This approach is fraught with peril if one relies on having every %tensor annotated, as all sorts of random canonicalizations/patterns can break that invariant. Effectively, any transformation that introduces new Values into the program must be aware of this invariant and repair it, which is extremely hard. In IREE, it got so unmanageable that we only create this form immediately before the one pass that needs it, and erase the annotations immediately afterwards.
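To sketch the hazard concretely (op names here are placeholders, not real ops): a generic pattern that rewrites a producer has no idea it needs to thread the result back through the tie op.

%tensor = "foo"(%x)
%tensor2 = tie_shape_identity(%tensor, %shape)
// A pattern that folds "foo" into "fused_foo" creates a new value
// %new = "fused_foo"(%x) with no tie_shape_identity annotation,
// silently breaking the "every tensor is annotated" invariant
// unless the pattern knows to re-create the tie op.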
The “extra use” approach
%tensor = ...
%shape = ...
tie_shape_extra_use(%tensor, %shape)
// %tensor is used directly
This one is similar to the “identity-like” approach but interferes with use-def chains slightly differently: it introduces extra uses.
Similarly to the previous one, this approach is fraught with peril if one relies on having every %tensor annotated, as all sorts of random canonicalizations/patterns can break that invariant.
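A sketch of how this can go wrong (again with placeholder op names): any pattern that materializes a new value has no corresponding annotation, and the extra use itself can perturb unrelated patterns.

%tensor = "foo"(%x)
tie_shape_extra_use(%tensor, %shape)
// A pattern that replaces part of the computation with a new value
// %new = "bar"(%y) leaves %new unannotated, breaking the invariant.
// The extra use can also defeat hasOneUse-style checks in unrelated
// patterns, perturbing optimizations that would otherwise fire.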
The “region” approach
%tensor_outer = tie_shape_region computation={
%tensor = ...
yield %tensor
} shape_calculation = {
%shape = ...
yield %shape
}
// %tensor_outer is used directly
OR
%shape = ...
%tensor_outer = tie_shape_region_capture %shape {
%tensor = ...
yield %tensor
}
// %tensor_outer is used directly
This approach has the advantage that it can exploit whatever structure the needed annotations have. For example, you might have a fused subgraph where only the shapes on the boundary of the subgraph matter; optimizations can then proceed inside the fused region without any interference, thanks to the structure. You can also merge regions fairly easily in a domain-specific way.
This is similar to the tradeoffs that happen in IREE’s dispatch regions. (we start with identity-like and then convert to this form, essentially).
This approach has the disadvantage that it is bulky when there is no useful structure. Also, if your IR has other structured region-based constructs that don’t strictly nest with the annotations, you can’t use this approach at all.
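To make the merging point concrete (same hypothetical region op as above): when one tied region feeds another, they can be combined so that only the outer boundary value keeps its shape annotation.

%a = tie_shape_region_capture %shape_a {
  %t = ...
  yield %t
}
%b = tie_shape_region_capture %shape_b {
  %u = "use"(%a)
  yield %u
}
// can merge, domain-specifically, into one region where only the
// boundary value %b has its shape tied:
%b = tie_shape_region_capture %shape_b {
  %t = ...
  %u = "use"(%t)
  yield %u
}
// Optimizations inside the merged region no longer see %t's annotation.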
This also ties in with the still-unresolved general desire to sometimes have “attributes on Values” (instead of on Operations), which has a similar design space.
The biggest problem with analyses (no matter how you formulate or store their information) is invalidation of the analysis results. At some point you have to make a tradeoff between inhibiting a transformation so that the analysis information is not invalidated, leaving the analysis data in an invalid state that you somehow need to know to invalidate or ignore, or having transformations that understand and update it.
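As a concrete instance of that tradeoff (placeholder ops again): suppose an analysis stores shape information out-of-band, keyed by Value, and then CSE runs.

%a = "foo"(%x)  // analysis map: %a -> [2, ?]
%b = "foo"(%x)  // analysis map: %b -> [2, ?]
// CSE erases %b and replaces its uses with %a; the map entry for %b
// now dangles, and nothing forced CSE to inform the analysis. Either
// CSE must be taught to update the map, the whole map must be
// invalidated after the pass, or CSE must be inhibited while the
// information is live -- exactly the three options above.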