‘Target’ refers to a specific configuration or environment that implementations aim to support or comply with. In the context of TOSA, it encompasses three main aspects of the specification: Profiles, Extensions and Levels. In the near future, the specification version can also be considered.
Motivation
Currently, generated TOSA IR is checked for conformance to a target via the validation pass, e.g. --tosa-validate="profile=pro_int extension=bf16 level=8k".
However, transformations have no knowledge of the user's intended target. This can lead to:
- Previously conformant IR being transformed into non-conformant IR.
- Non-conformant IR not being transformed into conformant IR.
The first problem is the primary motivator for this proposal. Some motivating examples include:
- Canonicalizing pad+conv2d into a single conv2d operation can depend on the intended “level”. See the conversation in [mlir][tosa] Fold PadOp to tensor operations by GeorgeARM · Pull Request #132700 · llvm/llvm-project · GitHub.
- Decomposing a large concatenate operation into multiple concatenates when it would otherwise exceed “level” constraints.
- The “TosaOptionalDecompositions” pass may want to query the available profiles/extensions before decomposing operators.
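To make the concatenate example concrete, here is a minimal, self-contained C++ sketch of how a level-aware decomposition might plan the split. It does not use any MLIR API: `groupSizes` and `numStages` are hypothetical helpers, and the operand limit they take is whatever maximum tensor list size the chosen level imposes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split `numInputs` concat operands into consecutive
// groups of at most `maxListSize` operands each. Each group would become
// one smaller concatenate (a group of size 1 would simply be forwarded).
std::vector<std::size_t> groupSizes(std::size_t numInputs,
                                    std::size_t maxListSize) {
  assert(maxListSize >= 2 && "a concatenate needs room for two operands");
  std::vector<std::size_t> groups;
  for (std::size_t remaining = numInputs; remaining > 0;) {
    std::size_t take = std::min(remaining, maxListSize);
    groups.push_back(take);
    remaining -= take;
  }
  return groups;
}

// Hypothetical helper: number of concatenate stages needed so that no
// single concatenate takes more than `maxListSize` operands. The results
// of one stage become the operands of the next.
std::size_t numStages(std::size_t numInputs, std::size_t maxListSize) {
  assert(maxListSize >= 2 && "a concatenate needs room for two operands");
  std::size_t stages = 0;
  while (numInputs > 1) {
    numInputs = (numInputs + maxListSize - 1) / maxListSize;  // ceil division
    ++stages;
  }
  return stages;
}
```

Assuming a limit of 64 operands (an illustrative value, not one taken from this post), `groupSizes(150, 64)` yields groups of 64, 64 and 22 inputs, and `numStages(150, 64)` is 2: three first-stage concatenates followed by one that joins their results. Without knowing the target level, a transformation cannot tell whether this split is needed at all.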
Prior art
spirv MLIR dialect
The SPIR-V dialect has a spirv.target_env module attribute that can specify the capabilities, extensions, resource limits, and version required for the target device, e.g.:
module attributes {
  spirv.target_env = #spirv.target_env<
    #spirv.vce<v1.3, [Int8, ...], [...]>, #spirv.resource_limits<>>
} {
  ...
}
which can be queried by transformations using “lookupTargetEnvOrDefault”, e.g.:
auto targetEnvSupportsKernelCapability = [](gpu::GPUModuleOp moduleOp) {
  Operation *gpuModule = moduleOp.getOperation();
  auto targetAttr = spirv::lookupTargetEnvOrDefault(gpuModule);
  spirv::TargetEnv targetEnv(targetAttr);
  return targetEnv.allows(spirv::Capability::Kernel);
};
Target information is attached to a GPU module using the --spirv-attach-target pass, e.g.:
$ mlir-opt --spirv-attach-target="module=spirv.* ver=v1.0 caps=Kernel" test.mlir
gpu.module @spirv_module_1 [#spirv.target_env<#spirv.vce<v1.0, [Kernel], []>, #spirv.resource_limits<>>] {...}
DLTI dialect
DLTI is a dialect in upstream MLIR that allows for the creation of device target descriptions. However, it seems to be a work in progress (happy to be corrected): Next steps on target descriptor · Issue #934 · libxsmm/tpp-mlir · GitHub .
Target information in TOSA
Proposal 1
Pass target parameters to each pass that depends on target information (the current method for the --tosa-validate pass).
Example
Based on the motivating example above, in which canonicalizing pad+conv2d into conv2d must take the intended “level” into account:
The transformation can be pulled out of “canonicalizations” and moved into a separate, optional transformation pass. The pass can expose a “level” parameter, which it takes into account when applying the transformation.
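As a rough illustration of what such a pass would decide, the following self-contained C++ sketch models the check without any MLIR dependency. `LevelLimits`, `kLevel8K`, `kLevelNone` and `canFoldPadIntoConv` are hypothetical names, and both the limit of 8192 and the rule that the combined padding must not exceed the maximum kernel size are assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the limits selected by a pass option such as
// "level=8k". Only the limit needed for this example is modelled.
struct LevelLimits {
  std::int64_t maxKernel;
};

// Assumed values, for illustration only.
constexpr LevelLimits kLevel8K{8192};
constexpr LevelLimits kLevelNone{std::numeric_limits<std::int64_t>::max()};

// Padding amounts in the order: top, bottom, left, right.
using Pad4 = std::array<std::int64_t, 4>;

// Returns true if folding the pad operation into the convolution keeps
// every combined padding value within the level limit.
bool canFoldPadIntoConv(const Pad4 &padOp, const Pad4 &convPad,
                        const LevelLimits &limits) {
  for (std::size_t i = 0; i < padOp.size(); ++i) {
    assert(padOp[i] >= 0 && convPad[i] >= 0 && "padding must be non-negative");
    // Written as a subtraction so the comparison cannot overflow.
    if (padOp[i] > limits.maxKernel - convPad[i])
      return false;
  }
  return true;
}
```

With these assumed limits, a fold that would produce a top padding of 8200 is rejected under `kLevel8K` but accepted under `kLevelNone`, which is exactly the decision the pass cannot make without a level parameter.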
Pros
- Fine-grained control over which passes are exposed to target information
  - Avoids exposing the information globally, which may lead to over-dependence on the functionality / incorrect usage.
- Separation of concerns
  - Supplying target information to transformation passes keeps the IR clean of target-specific metadata.
  - Each pass is therefore self-contained, without relying on some global state.
Cons
- Target-dependent optimizations might need to become a separate pass.
- Duplicated parameters for each target-dependent transformation pass.
  - Can create a burden on the user, who must repeat the target information each time.
  - Risk of inconsistencies.
  - Maintenance overhead when target information is changed/extended.
Proposal 2
Attach target information at module scope. Target information is attached once, globally, and is therefore accessible to all optimization passes. This takes inspiration from the spirv.target_env attribute.
A pass, --tosa-attach-target, can be provided to attach the target environment to the module via command-line arguments (similar to what exists for --tosa-validate today).
Example
Based on the motivating example above, in which canonicalizing pad+conv2d into conv2d must take the intended “level” into account:
The pad+conv2d → conv2d transformation can remain in “canonicalizations”. It would be updated to retrieve the current “level” and decide whether or not to fold based on its value, e.g.:
module attributes {tosa.target_env = #tosa.target_env<level = none, profiles = [pro_int, pro_fp], extensions = [int4, int16]>} {
  ...
}
// Get target information from module
tosa::TargetEnv targetEnv = tosa::lookupTargetEnvOrDefault(op);
// Query capabilities
tosa::Level level = targetEnv.getLevel();
auto maxKernelSize = level.getMaxKernelSize();
// Decide whether to fold based on maxKernelSize value
...
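One way the lookup itself could behave, mirroring the SPIR-V helper of the same name, is to walk up from the operation to the nearest ancestor carrying a target environment and fall back to a default otherwise. The sketch below is a standalone C++ mock, not MLIR code: `Op`, `TargetEnv`, `defaultTargetEnv` and the choice of default values are all assumptions made for illustration.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>

// Mock target environment holding the three aspects named in this post.
struct TargetEnv {
  std::string level;  // e.g. "none" or "8k"
  std::set<std::string> profiles;
  std::set<std::string> extensions;

  bool allowsProfile(const std::string &p) const { return profiles.count(p) > 0; }
  bool allowsExtension(const std::string &e) const { return extensions.count(e) > 0; }
};

// Mock IR operation: a parent link plus an optional attached target env.
struct Op {
  const Op *parent = nullptr;
  std::optional<TargetEnv> targetEnv;
};

// Placeholder default for this sketch: no level limits, nothing enabled.
// The real default would be a design decision for the proposal.
TargetEnv defaultTargetEnv() { return TargetEnv{"none", {}, {}}; }

// Walk from `op` towards the root and return the first target env found,
// or the default if no ancestor carries one.
TargetEnv lookupTargetEnvOrDefault(const Op *op) {
  for (const Op *cur = op; cur != nullptr; cur = cur->parent)
    if (cur->targetEnv)
      return *cur->targetEnv;
  return defaultTargetEnv();
}
```

Because the nearest enclosing attribute wins in this mock, an operation nested in a module picks up the module's environment, while IR with no attached target falls back to the default and so behaves as it does today.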
Pros
- All transformations immediately have the capability to query target information.
- Single point of reference for target information.
Cons
- Target information is exposed globally, which may lead to over-dependence on the functionality / incorrect usage.
- Target information cannot change within a module.
  - If required in the future, we could consider adding target attributes at function scope.
Any thoughts / comments on these proposals, or another proposal that wasn’t considered, would be much appreciated, thanks!