Status Quo
Currently, memory effects for functions/calls are specified using two sets of attributes:
# Kind of access
readnone
readonly
writeonly
# Accessed location
argmemonly
inaccessiblememonly
inaccessiblemem_or_argmemonly
This representation is imprecise: For example, if a function can read any memory, but only write argument memory, there is no way to represent this right now. We can neither restrict the access kind, nor the accessed location.
Additionally, the current representation is not extensible. The coroutine thread identity problem would be best solved by a new “thread id” memory location, which is read by @llvm.threadlocal.address. Unfortunately, the current attribute encoding is very hard to extend, because locations are encoded in a combinatorial manner. (Other new location kinds have been discussed in the past as well, such as globalmemonly. They all suffer from the same problem.)
FunctionModRefBehavior
Alias analysis models the high-level memory effects of functions using FunctionModRefBehavior. This class stores separate ModRefInfo for each memory location kind:
ArgMem: NoModRef/Ref/Mod/ModRef
InaccessibleMem: NoModRef/Ref/Mod/ModRef
Other: NoModRef/Ref/Mod/ModRef
This allows us to encode both the existing memory attributes and behaviors that cannot be expressed using them. Some examples:
readonly => ArgMem: Ref, InaccessibleMem: Ref, Other: Ref
readonly argmemonly => ArgMem: Ref, InaccessibleMem: NoModRef, Other: NoModRef
"can read any memory, can only write to arguments" => ArgMem: ModRef, InaccessibleMem: Ref, Other: Ref
This system is also easy to extend towards new kinds of memory locations.
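The per-location model can be sketched in a few lines of standalone C++. This is only an illustration: the `ModRefInfo` enum and `Effects` struct below are local stand-ins written for this example, not LLVM's actual class definitions, and the two query helpers are assumed shapes for checks like "only reads memory".

```cpp
#include <cassert>

// Local stand-in for the access-kind lattice named in the text.
// Bit 0 = may read (Ref), bit 1 = may write (Mod).
enum class ModRefInfo : unsigned { NoModRef = 0, Ref = 1, Mod = 2, ModRef = 3 };

// Illustrative stand-in for FunctionModRefBehavior: one access kind
// per memory location kind.
struct Effects {
  ModRefInfo ArgMem;
  ModRefInfo InaccessibleMem;
  ModRefInfo Other;
};

constexpr bool mayWrite(ModRefInfo K) {
  return (static_cast<unsigned>(K) & 2u) != 0;
}

// True if no location may be written.
constexpr bool onlyReadsMemory(const Effects &E) {
  return !mayWrite(E.ArgMem) && !mayWrite(E.InaccessibleMem) &&
         !mayWrite(E.Other);
}

// True if only argument memory may be accessed at all.
constexpr bool onlyAccessesArgMemory(const Effects &E) {
  return E.InaccessibleMem == ModRefInfo::NoModRef &&
         E.Other == ModRefInfo::NoModRef;
}

// The three examples from the text.
constexpr Effects ReadOnly{ModRefInfo::Ref, ModRefInfo::Ref, ModRefInfo::Ref};
constexpr Effects ReadOnlyArgMemOnly{ModRefInfo::Ref, ModRefInfo::NoModRef,
                                     ModRefInfo::NoModRef};
constexpr Effects ReadAnyWriteArgs{ModRefInfo::ModRef, ModRefInfo::Ref,
                                   ModRefInfo::Ref};

static_assert(onlyReadsMemory(ReadOnly), "readonly never writes");
static_assert(onlyAccessesArgMemory(ReadOnlyArgMemOnly),
              "argmemonly touches only argument memory");
static_assert(!onlyReadsMemory(ReadAnyWriteArgs) &&
                  !onlyAccessesArgMemory(ReadAnyWriteArgs),
              "neither readonly nor argmemonly describes this case");
```

The last assertion is the point of the third example: the behavior is representable per location, even though neither of the existing attribute families applies to it.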
Proposal
The proposal is to replace the existing memory attributes with a single memory effect attribute that stores the FunctionModRefBehavior.
Internally this would be an integer attribute, but it would be publicly exposed either via FunctionModRefBehavior, or existing shim methods (like onlyReadsMemory()).
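One way such an integer could be laid out is two bits per location kind. The sketch below assumes that layout purely for illustration; the bit positions, the `setLoc`/`getLoc` helpers, and the mask constant are hypothetical and not taken from the proposal. The shim functions borrow the names of existing query methods but are free functions over the raw integer here.

```cpp
#include <cassert>
#include <cstdint>

enum class ModRefInfo : std::uint32_t { NoModRef = 0, Ref = 1, Mod = 2, ModRef = 3 };
enum class Location : std::uint32_t { ArgMem = 0, InaccessibleMem = 1, Other = 2 };

// Assumed layout: location N occupies bits [2N, 2N+1].
constexpr std::uint32_t shiftFor(Location L) {
  return static_cast<std::uint32_t>(L) * 2u;
}

// Returns Data with the field for L replaced by MR.
constexpr std::uint32_t setLoc(std::uint32_t Data, Location L, ModRefInfo MR) {
  return (Data & ~(0x3u << shiftFor(L))) |
         (static_cast<std::uint32_t>(MR) << shiftFor(L));
}

constexpr ModRefInfo getLoc(std::uint32_t Data, Location L) {
  return static_cast<ModRefInfo>((Data >> shiftFor(L)) & 0x3u);
}

// Mod bit (bit 1) of each of the three 2-bit fields: binary 101010.
constexpr std::uint32_t AllModBits = 0x2Au;

// Shim-style queries answered directly from the integer.
constexpr bool onlyReadsMemory(std::uint32_t Data) {
  return (Data & AllModBits) == 0;
}
constexpr bool doesNotAccessMemory(std::uint32_t Data) { return Data == 0; }

// "Read any memory, write only argument memory".
constexpr std::uint32_t ReadAnyWriteArgs =
    setLoc(setLoc(setLoc(0u, Location::ArgMem, ModRefInfo::ModRef),
                  Location::InaccessibleMem, ModRefInfo::Ref),
           Location::Other, ModRefInfo::Ref);

static_assert(ReadAnyWriteArgs == 0x17u, "fields 01 01 11, from high to low");
static_assert(getLoc(ReadAnyWriteArgs, Location::ArgMem) == ModRefInfo::ModRef,
              "round-trips through the encoding");
static_assert(!onlyReadsMemory(ReadAnyWriteArgs), "argument memory is written");
```

With a layout like this, query shims reduce to mask tests, which is one reason callers that only query memory effects would not need to change.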
The precise representation in textual IR is up for debate, but my current idea would be to have a memory(...) attribute, which can be used to specify a default access kind, and then override it for specific locations. Some examples, including translation of existing attributes:
# readnone
declare void @foo(ptr %p) memory()
# readonly
declare void @foo(ptr %p) memory(r)
# writeonly
declare void @foo(ptr %p) memory(w)
# argmemonly
declare void @foo(ptr %p) memory(argmem: rw)
# inaccessiblememonly
declare void @foo(ptr %p) memory(inaccessiblemem: rw)
# inaccessiblemem_or_argmemonly
declare void @foo(ptr %p) memory(argmem: rw, inaccessiblemem: rw)
# argmemonly readonly
declare void @foo(ptr %p) memory(argmem: r)
# inaccessiblememonly writeonly
declare void @foo(ptr %p) memory(inaccessiblemem: w)
# Read any, write args
declare void @foo(ptr %p) memory(r, argmem: rw)
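To make the "default, then override" reading concrete, here is a rough standalone parser for the text between the parentheses of memory(...). It is a sketch under assumptions: the function names are invented for this example, only the r/w/rw spellings shown above are accepted, and requiring the default access kind to come first is a choice made here, not something the proposal specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class ModRefInfo { NoModRef, Ref, Mod, ModRef };

// Illustrative per-location effects; everything starts as NoModRef.
struct Effects {
  ModRefInfo ArgMem = ModRefInfo::NoModRef;
  ModRefInfo InaccessibleMem = ModRefInfo::NoModRef;
  ModRefInfo Other = ModRefInfo::NoModRef;
};

static std::string trim(const std::string &S) {
  std::size_t B = S.find_first_not_of(" \t");
  if (B == std::string::npos)
    return "";
  std::size_t E = S.find_last_not_of(" \t");
  return S.substr(B, E - B + 1);
}

static bool parseKind(const std::string &S, ModRefInfo &Out) {
  if (S == "r") { Out = ModRefInfo::Ref; return true; }
  if (S == "w") { Out = ModRefInfo::Mod; return true; }
  if (S == "rw") { Out = ModRefInfo::ModRef; return true; }
  return false;
}

// Parses e.g. "r, argmem: rw". Returns false on malformed input.
bool parseMemory(const std::string &Body, Effects &Out) {
  Effects E; // memory() with an empty body means "no access"
  if (trim(Body).empty()) { Out = E; return true; }
  bool First = true;
  std::size_t Pos = 0;
  while (Pos <= Body.size()) {
    std::size_t Comma = Body.find(',', Pos);
    if (Comma == std::string::npos)
      Comma = Body.size();
    std::string Item = trim(Body.substr(Pos, Comma - Pos));
    Pos = Comma + 1;
    ModRefInfo K = ModRefInfo::NoModRef;
    std::size_t Colon = Item.find(':');
    if (Colon == std::string::npos) {
      // Bare access kind: the default for every location.
      // Assumption of this sketch: it must be the first item.
      if (!First || !parseKind(Item, K))
        return false;
      E.ArgMem = E.InaccessibleMem = E.Other = K;
    } else {
      // "location: kind" overrides the default for one location.
      std::string Name = trim(Item.substr(0, Colon));
      if (!parseKind(trim(Item.substr(Colon + 1)), K))
        return false;
      if (Name == "argmem") E.ArgMem = K;
      else if (Name == "inaccessiblemem") E.InaccessibleMem = K;
      else if (Name == "other") E.Other = K;
      else return false;
    }
    First = false;
  }
  Out = E;
  return true;
}
```

Parsing `r, argmem: rw` this way first sets every location to Ref and then raises ArgMem to ModRef, which matches the "read any, write args" example.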
In terms of API impact, this change should not affect APIs that only query memory effects (like onlyReadsMemory()). However, APIs that set attributes on functions/calls would now have to set the memory effect attribute instead.
Addition of new memory location kinds
This proposal by itself will not introduce any new memory location kinds, but this is expected to happen in follow-up proposals, and we should consider the impact this will have under this proposal.
New memory locations will usually be split off from the “other” category. For example, global memory is currently part of “other”, but could be split into globalmem. However, this is not always the case: The aforementioned threadid location would be entirely new, and is not part of the current memory model at all.
Memory attributes in old bitcode will be auto-upgraded to include the new location kind. In most cases, it will be populated by copying the access kind from “other” (again, threadid might need different handling, such as unconditionally adding a threadid read everywhere).
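The two upgrade strategies mentioned above can be sketched as bit operations on an encoded value. Everything concrete here is an assumption for illustration: the two-bits-per-location layout, the position of the new field, and the helper names are hypothetical and not part of the proposal.

```cpp
#include <cassert>
#include <cstdint>

// Assumed old layout, two bits per location:
//   bits 0-1 ArgMem, bits 2-3 InaccessibleMem, bits 4-5 Other.
constexpr unsigned OtherShift = 4;
constexpr std::uint32_t OldMask = 0x3Fu;
constexpr std::uint32_t RefBit = 0x1u;

// Upgrade for a location split off from "other" (e.g. a future
// globalmem): the new field starts as a copy of Other's access kind.
constexpr std::uint32_t splitFromOther(std::uint32_t Old, unsigned NewShift) {
  return (Old & OldMask) | (((Old >> OtherShift) & 0x3u) << NewShift);
}

// Upgrade for an entirely new location (e.g. threadid): conservatively
// add a read of the new location to every old attribute.
constexpr std::uint32_t addReadEverywhere(std::uint32_t Old, unsigned NewShift) {
  return (Old & OldMask) | (RefBit << NewShift);
}

// Hypothetical position of the new field: bits 6-7.
constexpr unsigned NewLocShift = 6;

// "Read any, write args" (0x17) gains a read of the split-off location.
static_assert(splitFromOther(0x17u, NewLocShift) == 0x57u,
              "new field inherits Ref from Other");
// "readonly argmemonly" (0x01) has Other == NoModRef, so nothing is added.
static_assert(splitFromOther(0x01u, NewLocShift) == 0x01u,
              "new field inherits NoModRef from Other");
// Even a function with no memory effects gets the unconditional read.
static_assert(addReadEverywhere(0x00u, NewLocShift) == 0x40u,
              "unconditional read of the new location");
```

Copying from “other” preserves the old meaning exactly, because the split-off memory was previously covered by “other”. The unconditional read is the conservative choice when the old attributes carry no information about the new location.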
Textual IR is not subject to an explicit auto-upgrade. However, the fact that the memory() attribute can specify a default access kind means that this access kind will also be used for any newly introduced memory location kinds. For example, if a new globalmem location is introduced, then memory(r, argmem: rw) will retain the intended meaning of “read anything, write arguments”, because the default access kind also covers the new location.
To facilitate this, when printing IR, LLVM will produce memory attributes in canonical form, where the access kind of “other” is printed as the default access kind. For example, memory(argmem: rw, inaccessiblemem: r, other: r) is printed as memory(r, argmem: rw). This ensures that any newly added locations will inherit the access kind of “other”.
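A canonical printer along these lines could look roughly like the following standalone sketch. The types and the `printMemory` name are invented for this example. Note one gap it has to paper over: the proposal does not show how to spell "no access" for a single location when the default allows access, so the `none` token below is purely a placeholder assumption.

```cpp
#include <cassert>
#include <string>

enum class ModRefInfo { NoModRef, Ref, Mod, ModRef };

struct Effects {
  ModRefInfo ArgMem;
  ModRefInfo InaccessibleMem;
  ModRefInfo Other;
};

static const char *spell(ModRefInfo K) {
  switch (K) {
  case ModRefInfo::Ref: return "r";
  case ModRefInfo::Mod: return "w";
  case ModRefInfo::ModRef: return "rw";
  case ModRefInfo::NoModRef: return "none"; // placeholder, not in the proposal
  }
  return "";
}

// Prints "other" as the default access kind and lists only the
// locations whose access kind differs from it.
std::string printMemory(const Effects &E) {
  std::string Out = "memory(";
  bool NeedComma = false;
  if (E.Other != ModRefInfo::NoModRef) {
    Out += spell(E.Other); // default access kind
    NeedComma = true;
  }
  auto Emit = [&](const char *Name, ModRefInfo K) {
    if (K == E.Other)
      return; // already covered by the default
    if (NeedComma)
      Out += ", ";
    Out += Name;
    Out += ": ";
    Out += spell(K);
    NeedComma = true;
  };
  Emit("argmem", E.ArgMem);
  Emit("inaccessiblemem", E.InaccessibleMem);
  Out += ")";
  return Out;
}
```

For the example in the text, `printMemory(Effects{ModRefInfo::ModRef, ModRefInfo::Ref, ModRefInfo::Ref})` yields `memory(r, argmem: rw)`. Because “other” never appears as a named location in the output, any reader that later knows more location kinds applies the printed default to them.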