[RFC?] Store to load forwarding

Hi everyone,

I’d like to discuss the need and feasibility of generic dataflow optimizations in MLIR.

It appears that everyone around here is doing some sort of store to load forwarding optimization:

  • Core LLVM has a mem2reg pass
  • The Affine dialect has an AffineScalarReplacement pass
  • flang has a MemRefDataFlowOpt pass

It appears that, for the most part, these optimizations do the same thing: they find store-like operations and try to forward the stored values to the nearest loads from the same location.

It also seems that store/load semantics can be represented with an interface. All these operations have some sort of subscript operator, i.e. we can get the list of indices the data is loaded from or stored to; they have some sort of base memory reference from which the final address is calculated; and stores have some scalar-like value to write to memory.
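To make the idea concrete, here is a minimal sketch in plain C++ of what such an interface might expose and how a generic pass could use it. This is a toy model, not the MLIR API: `ValueId`, `MemoryAccess`, `sameLocation` and `forwardStores` are hypothetical names, values are modelled as integer ids, and only a single straight-line block is handled. Since no alias analysis is assumed, the search stops at the nearest preceding store whether or not it matches.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical toy model (not the MLIR API): an SSA value is an integer id.
using ValueId = int;

enum class AccessKind { Load, Store };

// The three things a generic pass would query from a load-like or store-like op.
struct MemoryAccess {
  AccessKind kind;
  ValueId base;                  // base memory reference
  std::vector<ValueId> indices;  // subscripts used to compute the address
  ValueId value;                 // stored value (Store) or result (Load)
};

// Two accesses provably touch the same element only when the base and all
// indices are the very same SSA values. Deliberately conservative.
bool sameLocation(const MemoryAccess &a, const MemoryAccess &b) {
  return a.base == b.base && a.indices == b.indices;
}

// For each op in a straight-line block, returns the value that may replace
// a load's result, or nullopt when nothing can be forwarded.
std::vector<std::optional<ValueId>>
forwardStores(const std::vector<MemoryAccess> &block) {
  std::vector<std::optional<ValueId>> replacement(block.size());
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].kind != AccessKind::Load)
      continue;
    // Walk backwards to the nearest preceding store.
    for (std::size_t j = i; j-- > 0;) {
      if (block[j].kind != AccessKind::Store)
        continue;
      if (sameLocation(block[j], block[i]))
        replacement[i] = block[j].value;
      // Without alias analysis any store may clobber the location,
      // so stop at the first store whether or not it matched.
      break;
    }
  }
  return replacement;
}
```

In a real implementation the three fields would presumably become interface methods implemented by each dialect's load and store ops, and the "stop at any store" rule would be relaxed wherever aliasing information is available.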

So, I’m wondering if there’s any reason for these passes to be separate? Would it make sense to add a new interface to MLIR, make dialects implement it, and add a generic mem2reg pass? Would it be beneficial to anyone?

+1 for me, having something much more general here would be most welcome!

That would be nice to design and implement!
One difficulty may be the lack of alias analysis, or even of properly defined memory semantics around aliasing and things like “volatile” or “atomic”.

That and also operations like getelementptr or AccessChain. But I think both can be solved to some extent with proper interfaces, and MemRefDataFlowOpt is a good enough baseline for the first implementation.

This isn’t just a mem2reg. It can also eliminate multiple redundant loads. The actual techniques and mechanics to do the forwarding would also be quite different (with affine load/store ops) from the other two you have in the list – standard SSA mem2reg and flang’s. Using an interface could at least merge the latter two and that would be great. Separately, MLIR is missing a mem2reg pass on its memref/CFG form.
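As an illustration of the redundant-load case, here is a minimal sketch in plain C++. It is a toy model rather than the affine pass or any MLIR API: `ValueId`, `MemoryAccess` and `eliminateRedundantLoads` are hypothetical names, and only a single straight-line block is considered. A load reuses the result of an earlier load from the same base and indices, provided no store sits between them.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical toy model (not the MLIR API): an SSA value is an integer id.
using ValueId = int;

enum class AccessKind { Load, Store };

struct MemoryAccess {
  AccessKind kind;
  ValueId base;                  // base memory reference
  std::vector<ValueId> indices;  // subscripts used to compute the address
  ValueId value;                 // stored value (Store) or result (Load)
};

// For each op in a straight-line block, returns the result of an earlier
// load that may replace this load, or nullopt when there is none.
std::vector<std::optional<ValueId>>
eliminateRedundantLoads(const std::vector<MemoryAccess> &block) {
  std::vector<std::optional<ValueId>> replacement(block.size());
  for (std::size_t i = 0; i < block.size(); ++i) {
    if (block[i].kind != AccessKind::Load)
      continue;
    for (std::size_t j = i; j-- > 0;) {
      // Any intervening store may clobber the location: give up.
      if (block[j].kind == AccessKind::Store)
        break;
      if (block[j].base == block[i].base &&
          block[j].indices == block[i].indices) {
        // If the earlier load was itself replaced, reuse its replacement.
        replacement[i] = replacement[j].value_or(block[j].value);
        break;
      }
    }
  }
  return replacement;
}
```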

Polygeist has a mem2reg pass that works on the MemRef and LLVM dialects in the presence of SCF control flow. The code is in Polygeist/Mem2Reg.cpp (main branch of llvm/Polygeist on GitHub). It needs a bunch of cleanup, but it is pretty robust in practice.

@wsmoses @lorenzo_chelini


For our MLIR Python compiler we have a MemorySSA analysis for memrefs, plus some store-to-load forwarding and dead store elimination transforms based on it.

But they are not very extensively tested and not ready for upstream.

It’ll be great to have this upstream.