Rationale
Many, though not necessarily all, MLIR target platforms support some form of the printf() call in order to print values that occur within programs at runtime. The ability to call printf() is generally helpful for debugging.
For many platforms (such as x86 and other similar CPU targets), calling printf() is a matter of calling the appropriate function in the C standard library, which will be linked against the code generated via MLIR. However, not all platforms use this mechanism. For example, AMD GPUs running HIP code use calls to a library of functions such as __ockl_printf_begin() and __ockl_printf_append_args() to print values that reside on GPUs, and Clang lowers calls to printf() to these functions before passing GPU kernels to LLVM.
Since the process for calling printf can vary between targets, but the operation itself is target-independent, I propose we add an op to MLIR that represents printf, which can be lowered to whichever sequence of operations or function calls is needed to print things out.
The operation
def Somewhere_PrintfOp : Somewhere_Op<"printf", [MemoryEffects<[MemWrite]>]>,
    Arguments<(ins StrAttr:$format,
               Variadic<AnyTypeOf<[AnyInteger, Index, AnyFloat]>>:$args)> {
  let summary = "Calls printf() to print runtime values in generated code";
  let description = [{
    `somewhere.printf` takes a literal format string `format` and an arbitrary
    number of scalar arguments that should be printed.
    The format string is a C-style printf string, subject to any restrictions
    imposed by one's target platform.
  }];
  let assemblyFormat = [{
    attr-dict ($args^ `:` type($args))?
  }];
}
This operation has the restrictions that
- It only takes scalars, as it’s unclear how to handle the printing of something like a vector or memref
- The format string is a constant, as MLIR doesn’t have a string type at runtime
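As a rough illustration of what the definition above would accept, here is a sketch of how the op might look in textual IR under the given assembly format. The `somewhere` prefix is the placeholder dialect name from the definition, and the function name `@print_example` and the SSA value names are invented for this example. Since `format` is not mentioned in the assembly format, it would be printed as part of the attribute dictionary.

```mlir
// Hypothetical usage; `somewhere` is the placeholder dialect name from the
// op definition and @print_example is an invented function name.
// Older MLIR releases spell `func.func` as `func`.
func.func @print_example(%i: i32, %x: f64) {
  // Integer, index, and float scalars satisfy the operand type constraint.
  somewhere.printf {format = "i = %d, x = %f\n"} %i, %x : i32, f64
  // With no arguments, the optional operand group is omitted entirely.
  somewhere.printf {format = "done\n"}
  return
}
```

Passing a vector or memref value as an operand would fail the `AnyTypeOf<[AnyInteger, Index, AnyFloat]>` constraint, which is how the scalar-only restriction is enforced. Whether a given format specifier is valid still depends on the target platform the op is lowered for.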
Existing work
D110448, “[MLIR][GPU] Define gpu.printf op and its lowerings”, defines the printf op under the GPU dialect and adds lowerings to the printf() function (which is listed as the “OpenCL” lowering, though it’d apply to most CPUs as well) and to the sequences of function calls needed for the AMD HIP API.
Open questions
- What dialect should this op reside in? (I’m personally thinking std might be the right spot, but adding things to std is dispreferred)