TL;DR
This RFC proposes the `ptr` dialect to model pointer and low-level memory operations. It generalizes the pointer operations in the LLVM dialect, making them reusable and interoperable with high-level dialects.
This RFC is a counter-proposal to [RFC] `address` dialect. It arose after chatting with @mehdi_amini, where we agreed that there shouldn’t be type duplication (`!llvm.ptr` vs `address`) and that some ops could be extracted from LLVM, modularizing the dialect.
Why?
- There’s a need for a reusable pointer type and higher-level pointer and memory operations, as they express ubiquitous concepts, and the lack thereof limits generic analysis opportunities and IR expressiveness, among others.
- It would allow lowering `memref` to a more target-independent representation.
- It modularizes a subset of the LLVM dialect.
- To allow the modeling of higher-level concepts like GPU constant memory, fat pointers, etc.
- The possibility of introducing a new optimization layer with low-level pointer alias analysis.
- The bare pointer convention could be applied as a pass in high-level dialects.
See this related discussion, which goes over some of the above points.
Proposal
Extract a subset of LLVM pointer and memory operations into the Ptr dialect and generalize them by making them directly translatable to LLVM IR and lowerable to other backend dialects like SPIR-V.
Concretely, move the following LLVM operations to the Ptr dialect:
- `ptrtoint` and `inttoptr`
- `addrspacecast`
- `load` and `store`
- `atomicrmw` and `cmpxchg`

Create an opaque pointer type (`!ptr.ptr`) with a generic address space attribute:
ptr ::= `ptr` (`<` memory-space^ `>`)?
memory-space ::= attribute-value
Additionally, add the following operations:
- `ptradd` to add a pointer and an integer and obtain a pointer, see possible implementation and rationale
- `type_offset` to represent the offset of a type, see possible implementation
- `constant` to model constant pointer addresses
- `nullptr`, as the concrete value of a `nullptr` is not always 0
- `from_ptr` and `to_ptr` in the `memref` dialect, allowing high-level interaction with the `memref` dialect. These ops would serve a similar purpose as `from_memref` and `to_memref` presented in [RFC] `address` dialect.
One restriction and design consideration of the pointer dialect is that under the right circumstances (see `#llvm.address_space` in the LLVM semantics section), it must faithfully model concepts in LLVM IR.
What about the semantics?
Neither operations nor a pointer type determines pointer and memory semantics. Instead, they are determined by the memory model and encoded in the address space.
This RFC proposes introducing an address space attribute interface to encode the memory model of an address space, thus allowing the dialect and operations to be reused.
This interface would allow specifying higher-level concepts like GPU constant memory by rendering the usage of the StoreOp invalid.
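To make the idea concrete, here is a minimal C++ sketch of how an address space could decide op validity. It is only an illustration under assumptions: `MemorySpaceModel`, `ConstantMemorySpace`, `LLVMLikeSpace`, `verifyLoad` and `verifyStore` are hypothetical names, types are modeled as plain strings, and none of this is the API from the PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the proposed address space attribute interface.
// Types are modeled as strings purely for illustration.
struct MemorySpaceModel {
  virtual ~MemorySpaceModel() = default;
  // Default semantics: accept everything.
  virtual bool isValidLoad(const std::string & /*type*/) const { return true; }
  virtual bool isValidStore(const std::string & /*type*/) const { return true; }
};

// A constant-memory-like space: loads are allowed, stores are rejected.
struct ConstantMemorySpace : MemorySpaceModel {
  bool isValidStore(const std::string & /*type*/) const override {
    return false;
  }
};

// An LLVM-like space: only an (assumed) fixed set of types is loadable.
struct LLVMLikeSpace : MemorySpaceModel {
  static bool isLoadable(const std::string &type) {
    return type == "f32" || type == "i32" || type == "ptr";
  }
  bool isValidLoad(const std::string &type) const override {
    return isLoadable(type);
  }
  bool isValidStore(const std::string &type) const override {
    return isLoadable(type);
  }
};

// A generic op verifier does not hard-code semantics; it asks the space.
inline bool verifyLoad(const MemorySpaceModel &space, const std::string &type) {
  return space.isValidLoad(type);
}
inline bool verifyStore(const MemorySpaceModel &space, const std::string &type) {
  return space.isValidStore(type);
}
```

The point of the design is that the load and store ops stay generic: the same verifier accepts or rejects an op depending only on which address space the pointer carries.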
LLVM semantics
LLVM semantics would be specified using the `#llvm.address_space` attribute, which determines whether a particular type can be loaded or stored, which address space casts are valid, etc. For example:
%v = ptr.load %ptr : !ptr.ptr<#llvm.address_space> -> f32 // Is valid.
%v = ptr.load %ptr : !ptr.ptr<#llvm.address_space> -> memref<f32> // Is invalid as the type is not loadable.
This attribute would not only encode semantics but would also be used to close the gap between the generality of the ops in Ptr and the correct modeling of the current LLVM ops, for example by helping with the extraction of `tbaa` metadata from the attribute dictionary in `LoadOp`.
What are the default semantics?
Currently, the default is to accept everything, but this could be changed.
Why not include getelementptr?
GEP requires interaction with structs, which are not yet a concept outside LLVM. Also, see this related discussion on why `ptradd` might be better for optimizations: [RFC] Replacing getelementptr with ptradd.
Why type_offset?
This operation is needed to represent type offsets in the absence of a data layout.
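As a rough illustration of how `ptradd` and `type_offset` could together cover simple GEP-like indexing, here is a C++ sketch. It assumes that `type_offset` resolves to the stride between consecutive elements of a type (played here by `sizeof`); `ptrAdd`, `typeOffset` and `elementAddress` are hypothetical helper names, not ops or functions from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for `ptradd`: advance an untyped pointer by a byte offset.
inline char *ptrAdd(char *base, std::ptrdiff_t byteOffset) {
  return base + byteOffset;
}

// Stand-in for `type_offset`: assumed to be the element stride of T.
// In the dialect this value would stay symbolic until a data layout is known.
template <typename T>
constexpr std::ptrdiff_t typeOffset() {
  return static_cast<std::ptrdiff_t>(sizeof(T));
}

// GEP-like element address expressed as base + index * type_offset(T).
template <typename T>
T *elementAddress(T *base, std::ptrdiff_t index) {
  char *raw = reinterpret_cast<char *>(base);
  return reinterpret_cast<T *>(ptrAdd(raw, index * typeOffset<T>()));
}
```

For instance, with `float data[4]`, `elementAddress(data, 3)` yields the same address as `&data[3]`, with the element size supplied separately from the pointer arithmetic.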
Implementation details
Since this is a big change subject to many comments, instead of creating a series of PRs implementing the full change, a minimal proof of concept with the `LLVM::LoadOp` can be found in this PR. The objective of this PR is to demonstrate that the approach is feasible; if the proposal is accepted, implementation details can be discussed during review.
Major changes in the PR:
- The `LLVMPointer` is removed in favor of `ptr::PtrType`. In order to preserve the IR representation of the type `!llvm.ptr`, the `SharedDialectTypeInterface` interface was added and AsmPrinter.cpp modified so that `!llvm.ptr` remains valid (thus no changes in tests are required):
!ptr.ptr<#llvm.address_space<0>> = !llvm.ptr
!ptr.ptr<#llvm.address_space<1>> = !llvm.ptr<1>
- The MemorySpaceAttrInterface interface was added to model address space semantics. See `LLVM::AddressSpaceAttr` for the description of the LLVM interface. The `load` semantics are specified in `isValidLoad`. Thus we have that:
%v = ptr.load %ptr : !llvm.ptr -> f32 // Is valid.
%v = ptr.load %ptr : !llvm.ptr -> memref<f32> // Is invalid as the type is not loadable.
%v = ptr.load %ptr : !ptr.ptr -> memref<f32> // Is valid as there are no semantics associated.
- If `ptr.load` is used to load from a `!llvm.ptr = !ptr.ptr<#llvm.address_space>`, then `ptr.load` can be directly translated to LLVM IR.
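The `!llvm.ptr` alias mapping listed in the first item above can be sketched as a pair of printing helpers. This is only a hypothetical illustration of the mapping itself; `printLLVMPointerAlias` and `printCanonicalPointer` are made-up names and do not reflect how AsmPrinter.cpp or `SharedDialectTypeInterface` are actually implemented in the PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the short spelling kept for existing tests.
// Address space 0 prints without a parameter, as in the table above.
inline std::string printLLVMPointerAlias(unsigned addressSpace) {
  if (addressSpace == 0)
    return "!llvm.ptr";
  return "!llvm.ptr<" + std::to_string(addressSpace) + ">";
}

// Hypothetical helper: the fully spelled-out type in the Ptr dialect.
inline std::string printCanonicalPointer(unsigned addressSpace) {
  return "!ptr.ptr<#llvm.address_space<" + std::to_string(addressSpace) +
         ">>";
}
```

Both spellings denote the same type, so existing IR written with `!llvm.ptr` would continue to parse and print unchanged.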