For fine-grained performance tracing and for industrial code traceability, it would be useful to connect the components of the generated code (e.g. dispatches in the IREE VM, or top-level loops in a piece of sequential code) to the operations of the input specification (e.g. linalg.matmul).
The connection does not need to be 1-to-1. For example, loop fusion would typically associate multiple input operations with one loop in the final code. Similarly, only the top-level affine.for generated for a linalg.matmul would be associated with the initial operation.
Is there some support in MLIR for this?
If not, are there other people here interested in traceability?
One immediate application would be fine-grain performance tracing.
I have found IREE's dispatch_profiler, but (I may be wrong) it seems to take only single operations as input, such as linalg.matmul. My objective is to profile full applications.
For approximate tracing like you seem to be looking for, we use “debug locations”: every operation has one, and they flow through the pipeline. Ultimately, when using LLVM, they can end up in DWARF.
For very fine-grained and precise tracing, I recently proposed this: [RFC] Introduce the concept of IR listeners in MLIR
Yes, MLIR tracks source locations step by step, which lets you map back from compiled code to the input program (or anywhere in between). As operations are created, they can either choose to reuse a source location from a single predecessor, create a new “fused” location from multiple source locations, or create a new source location.
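To make the three choices concrete, here is a rough toy model in Python. It is not the MLIR API: `FileLoc` and `FusedLoc` only mimic the idea behind MLIR's `FileLineColLoc` and `FusedLoc`, and the helper names (`leaves`, `fuse`) and the `model.mlir` positions are made up for illustration. It shows how a fused loop can end up pointing at several input operations while a plain lowering keeps a single one.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class FileLoc:
    """A plain file:line:col location (toy stand-in for FileLineColLoc)."""
    file: str
    line: int
    col: int


@dataclass(frozen=True)
class FusedLoc:
    """A location built from several others (toy stand-in for FusedLoc)."""
    parts: Tuple["Loc", ...]


Loc = Union[FileLoc, FusedLoc]


def leaves(loc):
    """Return the underlying file locations, in order, of any location."""
    if isinstance(loc, FileLoc):
        return [loc]
    out = []
    for part in loc.parts:
        out.extend(leaves(part))
    return out


def fuse(*locs):
    """Combine locations, flattening nested fusions and dropping duplicates."""
    flat = []
    for loc in locs:
        for leaf in leaves(loc):
            if leaf not in flat:
                flat.append(leaf)
    return flat[0] if len(flat) == 1 else FusedLoc(tuple(flat))


# Hypothetical input operations and where they sit in the input file.
matmul_loc = FileLoc("model.mlir", 10, 3)  # linalg.matmul
add_loc = FileLoc("model.mlir", 11, 3)     # an elementwise add

# Choice 1: lowering matmul to a top-level loop reuses the matmul location.
matmul_loop_loc = matmul_loc

# Choice 2: fusing the two loops creates a fused location.
fused_loop_loc = fuse(matmul_loop_loc, add_loc)

# Choice 3: an op with no predecessor gets a brand new location.
helper_loc = FileLoc("pass-generated", 0, 0)

# Mapping back: which input lines does each piece of output code come from?
origin_lines = [leaf.line for leaf in leaves(fused_loop_loc)]
```

Here `origin_lines` holds both input lines for the fused loop, which is the many-to-one association asked about above, whereas `leaves(matmul_loop_loc)` has a single entry.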
Since you asked about IREE, we do most of our in-depth performance analysis of the IREE compiler and IREE runtime using Tracy (IREE docs here).
One way we connect generated code back to input IR is shown in Show executable sources via source locs in Tracy if available. by ScottTodd · Pull Request #9994 · openxla/iree · GitHub. For that, we snapshot the IR to disk as .mlir files at some point during compilation; then, at runtime, each function in the compiled program tells the profiler that its “source location” resides in one of those files on disk.
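As a rough illustration of the idea only (this is not how the PR implements it; the compiler already knows these locations and does not need to re-parse anything), the Python sketch below builds a "function name → (file, line)" table from a snapshotted .mlir file. The function name `index_functions`, the file name `snapshot.mlir`, and the sample IR are all hypothetical.

```python
import re

# Matches lines such as "func.func @name(" or "func.func private @name(".
FUNC_PATTERN = re.compile(r"^\s*func\.func\s+(?:\w+\s+)?@([\w.$-]+)")


def index_functions(mlir_text, filename):
    """Map each function symbol to (filename, 1-based line) of its definition."""
    index = {}
    for lineno, line in enumerate(mlir_text.splitlines(), start=1):
        match = FUNC_PATTERN.match(line)
        if match:
            # Keep the first definition if a name somehow appears twice.
            index.setdefault(match.group(1), (filename, lineno))
    return index


# Hypothetical snapshot contents, standing in for a file written to disk.
snapshot = """\
module {
  func.func @dispatch_0() {
    return
  }
  func.func private @dispatch_1() {
    return
  }
}
"""

source_locs = index_functions(snapshot, "snapshot.mlir")
```

A profiler zone for `dispatch_1` could then be labeled with `source_locs["dispatch_1"]`, i.e. a file and line that a viewer can open next to the timing data.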
If you want to try Tracy without downloading it, I have a little web demo hosted here: Tracy Profiler