Background
One of the many benefits of using MLIR is the ability to analyze code at different levels of abstraction. One analysis that my team (@kaitingwang and @frank_gao) and I find particularly useful is lifetime analysis, in which the lifetime of each pointer (the span of execution during which the pointer is in use) is known and can be used for optimizations such as buffer reuse.
Buffers are created by optimization passes in MLIR, such as the Bufferization passes, and that is the point at which their lifetimes are best known. Once these buffers are lowered to lower-level dialects, it becomes increasingly difficult to analyze their lifetimes (especially if they are mixed into different dialects).
Within LLVM IR, lifetime is marked using the llvm.lifetime.start and llvm.lifetime.end intrinsics, and there exist analysis passes within the LLVM project that use these lifetimes for optimizations.
https://llvm.org/docs/LangRef.html#int-lifestart
Proposed Addition
We propose that MLIR should also be able to declare the lifetime of buffers explicitly within the IR itself. Since working directly with pointers is improper in MLIR, especially in higher-level dialects, we propose that two new ops be contributed to the MemRef dialect:
memref::LifetimeStartOp, which would lower to the LLVM::LifetimeStartOp
memref::LifetimeEndOp, which would lower to the LLVM::LifetimeEndOp
Within the IR, the code would look something like this:
func.func @lifetime(%d : index) {
  %0 = memref.alloca() : memref<32xf32>
  %1 = memref.alloca(%d) : memref<?xf32>
  // ... (no previous uses)
  memref.lifetime_start %0 : memref<32xf32>
  // First use of %0
  memref.lifetime_start %1 : memref<?xf32>
  // ... (some uses of the memrefs)
  memref.lifetime_end %0 : memref<32xf32>
  memref.lifetime_end %1 : memref<?xf32>
  // ... (no further uses)
  return
}
This would then be lowered to the LLVM dialect like so:
func.func @lifetime(%d : index) {
  // ...
  %A = unrealized_conversion_cast %0 : memref<32xf32> -> llvm.struct<...>
  llvm.intr.lifetime.start %A.aligned, 128
  // ...
  %B = unrealized_conversion_cast %1 : memref<?xf32> -> llvm.struct<...>
  llvm.intr.lifetime.start %B.aligned, -1
  // ...
  return
}
The benefit of this approach is that the lifetime of the buffers can be declared in a higher-level dialect (when the buffers are created) and the information can be passed down to LLVM for more complex lifetime analysis, without intruding on other dialects during lowering. It would also open the opportunity for high-level lifetime analysis and optimizations to be implemented in MLIR itself.
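As a rough illustration of the kind of high-level optimization this could enable, here is a minimal Python sketch of buffer reuse driven by lifetime intervals. It is not part of the proposal: the function name assign_slots, the input format (each buffer mapped to the positions of its lifetime start and end markers in a linear op order), and the assumption that all buffers are the same size are hypothetical simplifications.

```python
from typing import Dict, List, Tuple

# Hypothetical input: buffer name -> (start, end), the positions of its
# lifetime start/end markers in a linear ordering of the ops.
Lifetimes = Dict[str, Tuple[int, int]]


def assign_slots(lifetimes: Lifetimes) -> Dict[str, int]:
    """Greedily map equally sized buffers to storage slots.

    Two buffers may share a slot only if their lifetimes do not overlap.
    """
    slot_free_at: List[int] = []  # slot index -> position where its occupant's lifetime ends
    assignment: Dict[str, int] = {}
    # Visit buffers in order of the start of their lifetime.
    for name, (start, end) in sorted(lifetimes.items(), key=lambda kv: kv[1][0]):
        for slot, free_at in enumerate(slot_free_at):
            if free_at <= start:  # previous occupant is already dead
                slot_free_at[slot] = end
                assignment[name] = slot
                break
        else:
            # No existing slot is free: open a new one.
            slot_free_at.append(end)
            assignment[name] = len(slot_free_at) - 1
    return assignment


if __name__ == "__main__":
    print(assign_slots({"a": (0, 4), "b": (2, 6), "c": (5, 9)}))
```

In this example, the lifetime of c begins after the lifetime of a has ended, so the two can share storage, while b overlaps both and needs its own slot.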
Alternative Solution
Alternatively, the lifetime information could be encoded into an op like the memref::AllocaScopeOp:
https://mlir.llvm.org/docs/Dialects/MemRef/#memrefalloca_scope-memrefallocascopeop
However, there are a few issues with using this op for this purpose at the moment:
Firstly, this op is currently lowered into LLVM::StackSaveOp and LLVM::StackRestoreOp. Functionality could be added to the MemRefToLLVM conversion pass to emit the lifetime information; however, this would require additional options so that a toolchain can decide whether the op emits stack save/restore or lifetime information.
Secondly, lifetime is not specific to stack allocations; the lifetime of any MemRef is of interest for lifetime analysis and transformations. Since the AllocaScopeOp is specifically for stack allocations, the lifetimes of heap-allocated MemRefs (memref::AllocOps) would be left out. This would be hard to fix, since the semantics of this op are specific to stack allocations.
Finally, although using a scoped region to define the lifetime of a single memref is quite elegant, using one for each and every allocation would not work. The lifetimes of two buffers can overlap, which requires their scopes to be nested, and nesting forces the inner lifetime to end before the outer one. This is a problem when the lifetime of buffer A begins before that of buffer C but also ends before that of buffer C, as shown in the example below:
memref.alloca_scope {
  %A = memref.alloca(): memref<32xf32>
  // Initialize %A
  memref.alloca_scope {
    %B = memref.alloca(): memref<32xf32>
    // Initialize %B
    memref.alloca_scope {
      %C = memref.alloca(): memref<32xf32>
      affine.for %i = 0 to 32 {
        %0 = affine.load %A[%i] : memref<32xf32>
        %1 = affine.load %B[%i] : memref<32xf32>
        %2 = arith.addf %0, %1 : f32
        affine.store %2, %C[%i] : memref<32xf32>
      }
      // Buffer A's lifetime ends here; however, C's may not necessarily end here.
    }
  }
  // Cannot continue to use C here since it was declared within the scope of A
}
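The nesting constraint can also be stated mechanically: a set of lifetimes can be expressed with scoped regions only if every pair of lifetimes is either disjoint or one fully contains the other. The Python sketch below checks that property. The function name fits_nested_scopes and the interval encoding (buffer name mapped to start and end positions in a linear op order) are hypothetical and used only for illustration.

```python
from typing import Dict, Tuple


def fits_nested_scopes(lifetimes: Dict[str, Tuple[int, int]]) -> bool:
    """Return True if the lifetimes can be modelled by nested or disjoint scopes.

    Each lifetime is a half-open interval (start, end). Scoped regions can only
    represent pairs that are disjoint or where one contains the other.
    """
    spans = list(lifetimes.values())
    for i, (s1, e1) in enumerate(spans):
        for s2, e2 in spans[i + 1:]:
            disjoint = e1 <= s2 or e2 <= s1
            nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
            if not (disjoint or nested):
                # Partial overlap: neither scope can enclose the other.
                return False
    return True


if __name__ == "__main__":
    # A begins before C and also ends before C: not expressible with scopes.
    print(fits_nested_scopes({"A": (0, 5), "B": (1, 5), "C": (2, 8)}))
    # Strictly nested lifetimes are expressible.
    print(fits_nested_scopes({"A": (0, 9), "B": (1, 8), "C": (2, 7)}))
```

Explicit start and end markers do not have this restriction, because each marker can be placed independently of the others.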
For these reasons, we believe that adding the lifetime annotations as explicit ops within the MemRef dialect is more appropriate.