We recently found a stack overflow issues in MLIR Sparsifier due to the use of
scf.for ([mlir][sparse] fix stack overflow due to alloca instruction inside loops by PeimingLiu · Pull Request #69765 · llvm/llvm-project · GitHub).
Our first attempt to fix the issue is using
memref.alloca_scope. However, it does NOT solve the issue as
scf.for has the
AutomaticAllocationScope trait. As the result,
memref::alloca from the
scf.for, leading to incorrect code.
is canonicalized into
Note that this is a correct transformation provided that
scf.for indeed adjusts stack pointer between iterations (though it doesn’t).
Question: Should we remove the
AutomaticAllocationScope trait from
scf.for or insert
stack.restore when lowering the operation?
Seems to me that the correct canonicalization for
(or any op like
That is the semantics of
memref.alloca_scope should already be implied here.
Of course there is a lowering problem, where every op that has the
AutomaticAllocationScope should lower the exact same way as
You are actually right! There seems to be another
AllocaScopeInliner (MLIR: AllocaScopeInliner Struct Reference). Though, it seems that the
Inliner is not applied after
Hoister in a single canonicalization pass when I tested it.
I do think that the loop should have the trait, it was added exactly to prevent the kind of situation you observed. The lowering is likely incorrect in this case. I vaguely remember a discussion about
llvm.intr.stacksave/restore being unsupported for some targets, but we can always try and see what happens.
Alternatively, we could also just lower to LLVM by placing the alloca in the function entry block, and maybe use the lifetime intrinsics instead?
See use of alloca and lifetime intrinsics in the LLVM
“Performance Tips for Frontend Authors” guide.