Following yet another discussion in the chat, I propose we remove the “bare pointer calling convention” for memrefs converted to the LLVM dialect.
It was introduced in the early days of MLIR to simplify connection to JIT and to propagate “noalias/restrict” information. Today, it seems to be interpreted as a full-fledged lowering option, which is far from being the case: it only supports statically-shaped, fixed-rank memrefs with the default layout. Worse, it propagates the dangerous confusion that a memref is a pointer and can be treated as such, which now extends to tensors through bufferization. Furthermore, this is the only non-default “calling convention” and no others have been proposed, so removing it will simplify the lowering.
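To make the restriction concrete, here is a minimal C++ sketch of the two views of a memref at a function boundary. The names `MemRefDescriptor`, `isBarePointerCompatible` and `toBarePointer` are hypothetical and exist only for illustration; the struct is meant to mirror the descriptor fields of the default convention (allocated pointer, aligned pointer, offset, sizes, strides), not to reproduce any actual MLIR header.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the descriptor passed by the default convention
// for a rank-N memref: two pointers, an offset, and per-dimension
// sizes and strides.
template <typename T, std::size_t N>
struct MemRefDescriptor {
  T *allocated;
  T *aligned;
  std::int64_t offset;
  std::array<std::int64_t, N> sizes;
  std::array<std::int64_t, N> strides;
};

// Returns true when a single pointer plus a statically known shape can
// stand in for the descriptor: zero offset, sizes equal to the static
// shape, and row-major contiguous strides (the default layout).
template <typename T, std::size_t N>
bool isBarePointerCompatible(const MemRefDescriptor<T, N> &d,
                             const std::array<std::int64_t, N> &staticShape) {
  if (d.offset != 0)
    return false;
  std::int64_t expectedStride = 1;
  for (std::size_t i = N; i-- > 0;) {
    if (d.sizes[i] != staticShape[i] || d.strides[i] != expectedStride)
      return false;
    expectedStride *= staticShape[i];
  }
  return true;
}

// Collapses a compatible descriptor to the single pointer a callee would
// receive. Offset, sizes and strides are dropped, so the callee has to
// know them statically.
template <typename T, std::size_t N>
T *toBarePointer(const MemRefDescriptor<T, N> &d,
                 const std::array<std::int64_t, N> &staticShape) {
  assert(isBarePointerCompatible(d, staticShape) &&
         "descriptor carries information a bare pointer cannot express");
  return d.aligned;
}
```

Anything produced by a subview (non-zero offset, non-contiguous strides) or anything with a dynamic size fails the check, which is exactly the set of memrefs the bare pointer convention cannot handle.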
If there are downstreams that rely on this convention, I would urge them to reconsider regardless of the outcome of this RFC. If such behavior is really desired, it can be turned into a separate pass that rewrites compatible memrefs into LLVM dialect pointers, potentially including the operations on these values.
Generally +1 but I fear the problem is more deeply rooted.
General memrefs with arbitrary affine map layouts do not map to a well-formed type but rather to an arbitrary bag of pointers and index values. They must be normalized to the canonical layout by propagating all indexing to the loads/stores, which can have surprising effects, e.g. with an alloc of a memref with a weird layout map.
Strided memrefs have a more restricted and more composable expression. They are now pervasive in e.g. bufferization and structured subset-based codegen. I think the time may have come to turn them into a proper separate buffer type (and even use the opportunity to add the sorely missing refcounting) and have that be the abstraction that passes function boundaries. Along the way we can also make proper use of the layout attribute you added and drop the ill-advised linearized affine map for representing offset, sizes, strides.
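As a rough illustration of why the strided form composes well, here is a small C++ sketch. `StridedLayout`, `linearIndex` and `subview` are hypothetical names, not MLIR APIs; the only assumption is the usual strided addressing rule, linear index = offset + sum over k of index[k] * stride[k].

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical strided layout: one offset plus one stride per dimension.
template <std::size_t N>
struct StridedLayout {
  std::int64_t offset;
  std::array<std::int64_t, N> strides;
};

// Linearizes a multi-dimensional index:
// offset + sum_k indices[k] * strides[k].
template <std::size_t N>
std::int64_t linearIndex(const StridedLayout<N> &layout,
                         const std::array<std::int64_t, N> &indices) {
  std::int64_t linear = layout.offset;
  for (std::size_t k = 0; k < N; ++k)
    linear += indices[k] * layout.strides[k];
  return linear;
}

// Taking a subview (start and step per dimension) yields another strided
// layout: the starts fold into the offset, the steps scale the strides.
template <std::size_t N>
StridedLayout<N> subview(const StridedLayout<N> &base,
                         const std::array<std::int64_t, N> &starts,
                         const std::array<std::int64_t, N> &steps) {
  StridedLayout<N> result = base;
  for (std::size_t k = 0; k < N; ++k) {
    result.offset += starts[k] * base.strides[k];
    result.strides[k] = base.strides[k] * steps[k];
  }
  return result;
}
```

The point of the sketch is that a subview of a strided layout is again just an offset and a list of strides, so nothing has to be pushed into the loads/stores, unlike with an arbitrary affine map.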
I know this is a much larger change than what you propose but separate types + forced conversion from normalized memref to buffer (or alternatively BYO memref type lowering) feels like it would put such issues to rest once and for all.
I agree with the direction! MLIR and tensors/memrefs have significantly evolved since the bare pointer calling convention was introduced. The LLVM IR dialect is now much more composable type-wise so it should be easier for someone to implement an alternative memref lowering.
Having said that, I think that just removing the bare pointer calling convention code may shift the burden onto those who need an alternative memref lowering. In the existing conversion to the LLVM dialect, the memref lowering is tightly coupled to the lowering of not only memory-related operations but also operations like FuncOp, ReturnOp and CallOp. I wonder if, as part of this effort, we should also consider moving the “default calling convention” lowering to an independent pass. That would serve as an implementation example for others who need an alternative memref lowering and would lead to a more decoupled and composable LLVM IR dialect conversion, which is something that has been discussed a few times in the past. WDYT?