Many transformations on higher-level ops have multiple parameters and, according to recent research, can benefit from precise per-op or per-op-class configuration of these parameters. Traditionally, however, these transformations are controlled by pass heuristics with few control knobs.
We propose a mechanism for more precise control over IR transformation by reifying the sequence of IR transformation instructions as more IR.
Transform commands as IR
Each transformation corresponds to an operation in a separate piece of IR, referred to as Transform IR. Attributes of Transform IR operations specify the parameters of the transformation. These operations can define and use values that serve as handles to operations in the IR being transformed (or Payload IR). Such handle values allow for precisely targeting a transformation on an operation or a set of operations.
At a high level, Transform IR operations are expected to implement an interface along the following lines:
class TransformOpInterface {
  virtual LogicalResult apply(TransformState &state) = 0;
};

/// Contains the mapping between handle values used in Transform IR and lists of
/// corresponding Payload IR ops.
struct TransformState {
  /// Must be called to indicate that the handle values defined by the Transform IR op
  /// correspond to the list of Payload IR ops.
  void setPayload(OpResult handle, ArrayRef<Operation *> payloadIROps);

  /// Returns the list of Payload IR ops associated with the handle value
  /// used in Transform IR.
  ArrayRef<Operation *> getPayload(Value operand) const;
};
The simplest way to use such Transform IR operations is to create a “schedule interpreter” that calls “apply” following some predefined rules, e.g., sequentially. This is similar to the approach adopted by Halide, TVM, etc.
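To make the sequential "schedule interpreter" idea concrete, here is a minimal, self-contained sketch. It does not use MLIR: `PayloadOp`, `Handle`, `MatchByName` and `Rename` are hypothetical stand-ins invented for illustration, and `bool` stands in for `LogicalResult`. Only the overall shape (a state mapping handles to payload ops, and an interpreter calling `apply` in order) follows the proposal.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for Operation * and for a Transform IR handle value.
struct PayloadOp { std::string name; };
using Handle = int;

// Mapping between Transform IR handles and lists of Payload IR ops.
class TransformState {
public:
  void setPayload(Handle handle, std::vector<PayloadOp *> ops) {
    mapping[handle] = std::move(ops);
  }
  const std::vector<PayloadOp *> &getPayload(Handle handle) const {
    auto it = mapping.find(handle);
    assert(it != mapping.end() && "handle used before being defined");
    return it->second;
  }

private:
  std::map<Handle, std::vector<PayloadOp *>> mapping;
};

// Stand-in for the transform op interface; returns true on success.
struct TransformOp {
  virtual ~TransformOp() = default;
  virtual bool apply(TransformState &state) = 0;
};

// Hypothetical op: binds `result` to all payload ops with the given name.
struct MatchByName : TransformOp {
  MatchByName(std::vector<PayloadOp> &payload, std::string name, Handle result)
      : payload(payload), name(std::move(name)), result(result) {}
  bool apply(TransformState &state) override {
    std::vector<PayloadOp *> matched;
    for (PayloadOp &op : payload)
      if (op.name == name)
        matched.push_back(&op);
    bool found = !matched.empty();
    state.setPayload(result, std::move(matched));
    return found;
  }
  std::vector<PayloadOp> &payload;
  std::string name;
  Handle result;
};

// Hypothetical op: renames the ops behind `operand` and binds them to `result`.
struct Rename : TransformOp {
  Rename(Handle operand, Handle result, std::string newName)
      : operand(operand), result(result), newName(std::move(newName)) {}
  bool apply(TransformState &state) override {
    std::vector<PayloadOp *> ops = state.getPayload(operand); // copy the list
    for (PayloadOp *op : ops)
      op->name = newName;
    state.setPayload(result, std::move(ops));
    return true;
  }
  Handle operand, result;
  std::string newName;
};

// Sequential interpreter: applies each op in order, stopping at the first failure.
bool interpret(const std::vector<std::unique_ptr<TransformOp>> &schedule,
               TransformState &state) {
  for (const auto &op : schedule)
    if (!op->apply(state))
      return false;
  return true;
}
```

A schedule here is simply a vector of ops. Chaining works because the second op reads the handle that the first op defined, which is the property the handle values are meant to provide.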
In addition to the interface, a trait can provide the apply implementation for the common case of a transformation accepting one handle and producing another handle.
template <typename ConcreteOp>
class SingleOpTransformOpTrait : public OpTrait<...> {
  LogicalResult apply(TransformState &state) {
    ArrayRef<Operation *> payload = state.getPayload(getOperation()->getOperand(0));
    // Capture `this` so that the lambda can dispatch to the concrete op.
    SmallVector<Operation *> results = to_vector(map_range(
        payload,
        [this](Operation *payloadOp) {
          return static_cast<ConcreteOp *>(this)->applyToOne(payloadOp);
        }));
    // Associate the produced ops with the handle defined by this op.
    state.setPayload(getOperation()->getResult(0), results);
    return success(/*all-results-are-non-null*/);
  }
};
Similar traits can be provided for other common cases on a per-need basis.
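The one-handle-in, one-handle-out pattern can be illustrated outside of MLIR with a plain CRTP base class. This is only a sketch under assumptions: `PayloadOp`, `SingleOpTransform` and `TileLoops` are hypothetical names, the payload lists are passed explicitly instead of going through a state object, and `bool` stands in for `LogicalResult`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a Payload IR operation.
struct PayloadOp { std::string name; };

// CRTP base that implements apply() in terms of Derived::applyToOne().
template <typename Derived>
struct SingleOpTransform {
  // Maps applyToOne over the operand payload and collects the results.
  // Returns false if any application produced a null result.
  bool apply(const std::vector<PayloadOp *> &operandPayload,
             std::vector<PayloadOp *> &resultPayload) {
    bool allNonNull = true;
    resultPayload.clear();
    for (PayloadOp *op : operandPayload) {
      assert(op && "payload ops must not be null");
      PayloadOp *produced = static_cast<Derived *>(this)->applyToOne(op);
      allNonNull = allNonNull && produced != nullptr;
      resultPayload.push_back(produced);
    }
    return allNonNull;
  }
};

// Hypothetical transform that only knows how to handle loops.
struct TileLoops : SingleOpTransform<TileLoops> {
  PayloadOp *applyToOne(PayloadOp *op) {
    if (op->name != "scf.for")
      return nullptr; // signals failure for this payload op
    op->name = "scf.for.tiled";
    return op;
  }
};
```

The derived class only writes the per-op logic; iteration and failure aggregation live in the base, which is the division of labor the trait is meant to provide.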
Extension mechanism
TransformState can be extended to store more information than the mapping, or to get notified when the mapping changes, by attaching instances of the state extension class.
struct TransformState {
  class Extension {
    /// Derived classes can have non-trivially-destructible members.
    virtual ~Extension();

    /// Derived classes can override this to get notified when the payload changes.
    virtual void notifySetPayload(Value handle, ArrayRef<Operation *> payloadIROps);
  };

  /// Adds a new extension of the given type to the state.
  /// Practically this constructs the extension from arguments.
  template <typename ExtTy, typename... Args>
  ExtTy &addExtension(Args &&... args);

  /// Returns the extension of the given type if present in the state or null.
  template <typename ExtTy>
  ExtTy *getExtensionOrNull();
};
Extensions are identified by TypeID and only one extension of a given type is allowed in the state. Transform IR operations can interact with extensions at will and are allowed to fail if the required extensions are missing.
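One possible way to store at most one extension per type is a map keyed by a type identifier. The sketch below is an assumption-laden illustration, not the proposed implementation: it uses `std::type_index` from the C++ standard library as a stand-in for MLIR's TypeID, and `TransformStateSketch` and `StepCounter` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

class TransformStateSketch {
public:
  // Base class for extensions; the virtual destructor allows derived classes
  // to own non-trivially-destructible members.
  class Extension {
  public:
    virtual ~Extension() = default;
  };

  // Constructs an extension of type ExtTy from the arguments and stores it.
  template <typename ExtTy, typename... Args>
  ExtTy &addExtension(Args &&...args) {
    std::type_index key(typeid(ExtTy));
    assert(extensions.count(key) == 0 &&
           "only one extension of a given type is allowed");
    auto ext = std::make_unique<ExtTy>(std::forward<Args>(args)...);
    ExtTy &ref = *ext;
    extensions[key] = std::move(ext);
    return ref;
  }

  // Returns the extension of type ExtTy, or null if it was never added.
  template <typename ExtTy>
  ExtTy *getExtensionOrNull() {
    auto it = extensions.find(std::type_index(typeid(ExtTy)));
    if (it == extensions.end())
      return nullptr;
    // Safe: the entry stored under this key was created as an ExtTy.
    return static_cast<ExtTy *>(it->second.get());
  }

private:
  std::unordered_map<std::type_index, std::unique_ptr<Extension>> extensions;
};

// Hypothetical extension that carries extra data alongside the mapping.
struct StepCounter : TransformStateSketch::Extension {
  explicit StepCounter(int start) : steps(start) {}
  int steps;
};
```

A transform that needs `StepCounter` would call `getExtensionOrNull<StepCounter>()` and report failure on a null result, matching the rule that ops may fail when a required extension is missing.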
Relation to the infrastructure
PDL is expected to be used to match the Payload IR ops to be transformed. Furthermore, !pdl.operation can be used, at least initially, as a handle type. More specifically, we expect to introduce a transform op that takes as attribute the symbol reference to a PDL pattern and defines a handle:
pdl.pattern @pdl_pattern {
  %0 = operation "scf.for"
  // …
  rewrite %0 with @xform
}

%0 = xform.pdl_match @pdl_pattern
%1 = xform.tile %0 {tile_sizes = [32, 32]}
The recently proposed dialect extension mechanism can be used to reduce coupling between the dialect that defines Transform IR ops and the implementations of those transforms by leveraging external models for the transform op interface.
This mechanism is complementary to passes: a pass may run a “schedule interpreter”, and individual transformations can create and run (nested) pass managers. A simple sequential “interpreter” test pass will be defined for testing purposes.
Alternatives considered
Finer-grained passes with multiple options
One could consider writing finer-grained passes that perform a specific transformation on a highly specific shape of the IR. It seems extremely difficult to achieve the same level of both expressivity and verification with pass options as with PDL + custom verifiers on ops. Furthermore, it would be challenging to chain transformations, i.e., to have a transformation apply to the ops produced by the previous one or to have its properties depend on the success of the previous transformations, as there is no mechanism to communicate between passes.
Attributes
One could consider attaching discardable attributes to operations as a means to communicate between transformations. As their name indicates, such attributes can be discarded at any moment and are not reliable for anything but hints. Some enabling transformations such as canonicalization or CSE do drop attributes.
Driving everything from PDL
Another possibility is to drive the transformation using custom rewriters from PDL, where the custom rewriter would play the role of the apply interface method. This has similar issues with chaining and verification as with passes.
Layering
The current plan is to create a new dialect, xform or transform, that would contain the interface and the basic operations for bootstrapping the Transform IR such as pdl_match, container ops, utilities for executing the transforms, and potentially Transform IR ops for some general transformations such as canonicalization and folding. This is similar to how the bufferization dialect is organized.
Dialect-specific transformations can be implemented by their respective dialects, e.g., with op names that carry a .transform component after the dialect prefix, such as scf.transform.tile, and with the external / promised interface mechanism to decouple the transformation implementation from the op definition. Cross-dialect yet non-generic transforms can follow the same approach as cross-dialect canonicalization patterns (while this may be challenging, solving the layering issue for cross-dialect canonicalization patterns is not in scope of this RFC, and the controllable transform mechanism should align with any change proposed for cross-dialect canonicalization).
Other proposals for layering and naming are welcome!