[RFC] Unify memory effect attributes

Status Quo

Currently, memory effects for functions/calls are specified using two sets of attributes:

# Kind of access
readnone
readonly
writeonly

# Accessed location
argmemonly
inaccessiblememonly
inaccessiblemem_or_argmemonly

This representation is imprecise: For example, if a function can read any memory, but only write argument memory, there is no way to represent this right now. We can neither restrict the access kind, nor the accessed location.

Additionally, the current representation is not extensible. The coroutine thread identity problem would be best solved by a new “thread id” memory location, which is read by @llvm.threadlocal.address. Unfortunately, the current attribute encoding is very hard to extend, because locations are encoded in a combinatorial manner. (Other new location kinds have been discussed in the past as well, such as globalmemonly. They all suffer from the same problem.)

FunctionModRefBehavior

Alias analysis models the high-level memory effects of functions using FunctionModRefBehavior. This class stores separate ModRefInfo for each memory location kind:

ArgMem:          NoModRef/Ref/Mod/ModRef
InaccessibleMem: NoModRef/Ref/Mod/ModRef
Other:           NoModRef/Ref/Mod/ModRef

This allows us to both encode the existing memory attributes, as well as behaviors that cannot be expressed using them. Some examples:

readonly
=> ArgMem: Ref, InaccessibleMem: Ref, Other: Ref

readonly argmemonly
=> ArgMem: Ref, InaccessibleMem: NoModRef, Other: NoModRef

"can read any memory, can only write to arguments"
=> ArgMem: ModRef, InaccessibleMem: Ref, Other: Ref

This system is also easy to extend towards new kinds of memory locations.

Proposal

The proposal is to replace the existing memory attributes with a single memory effect attribute that stores the FunctionModRefBehavior.

Internally this would be an integer attribute, but it would be publicly exposed either via FunctionModRefBehavior, or existing shim methods (like onlyReadsMemory()).

The precise representation in textual IR is up for debate, but my my current idea would be do have a memory(...) attribute, which can be used to specify a default access kind, and then overwrite it for specific locations. Some examples, including translation of existing attributes:

# readnone
declare @foo(ptr %p) memory()

# readonly
declare @foo(ptr %p) memory(r)

# writeonly
declare @foo(ptr %p) memory(w)

# argmemonly
declare @foo(ptr %p) memory(argmem: rw)
# inaccessiblememonly
declare @foo(ptr %p) memory(inaccessiblemem: rw)
# inaccessiblemem_or_argmemonly
declare @foo(ptr %p) memory(argmem: rw, inaccessiblemem: rw)

# argmemonly readonly
declare @foo(ptr %p) memory(argmem: r)
# inaccessiblememonly writeonly
declare @foo(ptr %p) memory(inaccessiblemem: w)

# Read any, write args
declare @foo(ptr %p) memory(r, argmem: rw)

In terms of API impact, this change should not affect APIs that only query memory effects (like onlyReadsMemory()). However, APIs that set attributes on functions/calls would now have to set the memory effect attribute instead.

Addition of new memory location kinds

This proposal by itself will not introduce any new memory location kinds, but this is expected to happen in follow-up proposals, and we should consider the impact this will have under this proposal.

New memory locations will usually be split off from the “other” category. For example, global memory is currently part of “other”, but could be split into globalmem. However, this is not always the case: The aforementioned threadid location would be entirely new, and is not part of the current memory model at all.

Memory attributes in old bitcode will be auto-upgraded to include the new location kind. In most cases, it will be populated by copying the access kind from “other” (again, threadid might need different handling, such as unconditionally adding a threadid read everywhere).

Textual IR is not subject to an explicit auto-upgrade. However, the fact that the memory() attribute can specify a default access kind means that this access kind will also be used for any newly introduced memory location kinds. For example, if a new globalmem location is introduced, then memory(r, argmem: rw) will retain the intended meaning of “read anything, write arguments”, because the default access kind also covers the new location.

To facilitate this, when printing IR, LLVM will produce memory attributes in canonical form, where the access kind of other is printed as the default access kind, for example memory(argmem: rw, inaccessiblemem: r, other: r) is printed as memory(r, argmem: rw). This ensures that any newly added locations will inherit the access kind of “other”.

2 Likes

+2 from me and I’m happy to help, e.g., review and write code.

Bike-shedding aside, this is long overdue. There might even have been suggestions for this in the past :wink:

I wonder if it’s a problem for the default when a particular “kind” is unmentioned for it to be “doesn’t read or write”, instead of being explicit. In particular, this seems to not be compatible with the desired noread_thread_id semantics. Under the proposal for adding that attr, it was intended that readnone – which by this proposal becomes memory("") – does NOT exclude reading from threadid. But if threadid becomes a “kind of memory”, and added to this proposal, then we seem to have an issue?

This is an issue in the sense that if new location kinds are added, frontends may need to account for them. For adding something like globalmem, this likely does not require any changes, because something like FunctionModRefBehavior::readOnly() would just include the new location.

However, for threadid in particular, we are altering the memory model in a fundamental way: We’re not just splitting out a location kind from “other”, we’re adding an entirely new one. This means that __attribute__((pure)) now needs to switch from something like FunctionModRefBehavior::none() to FunctionModRefBehavior::threadId(ModRefInfo::Ref).

For bitcode, we can of course auto-upgrade such cases, and either unconditionally add a new location (the threadid case) or add them if other was previously set (the globalmem case). Though if anything gets added before LLVM 16, then we don’t have to worry about that autoupgrade.

For IR tests, this depends on whether a specific test cares about thread IDs or not – if it does care, then it might need adjustment in one or the other direction.

The tl;dr here is that I don’t think this is a problem with this proposal so much as an issue with changing the memory model – it’s kind of unavoidable that this will result in code changes somewhere.

Edit: Maybe worth highlighting that the “unmentioned kind means doesn’t read/write” is only an artifact of textual IR and as such only relevant for testing purposes. For everything else (in-memory representation and bitcode) FMRB always encodes everything explicitly.

The “naming” is what I ignored for now as “bike-shedding”. I doubt we want an empty string (or any string for predefined values). If we decide to go down this road, I would avoid arbitrary extensions and make things explicit so that IR is readable.

Wrt. the thread_id, my initial proposal was:

side_effect(read:TLS,argmem; write:globals,cross-thread-registers; synchronizing(warp))

Some earlier suggestions were pretty similar.

As you can see, I’d also go beyond “memory” right away. We use “memory” for too long to model things that are not “memory” related, we could use this opportunity to fix this.

FWIW, in the Attributor we actually determine if a function accesses various locations kinds:

     ALL_LOCATIONS = 0,
     NO_LOCAL_MEM = 1 << 0,
     NO_CONST_MEM = 1 << 1,                                                      
     NO_GLOBAL_INTERNAL_MEM = 1 << 2,
     NO_GLOBAL_EXTERNAL_MEM = 1 << 3,
     NO_GLOBAL_MEM = NO_GLOBAL_INTERNAL_MEM | NO_GLOBAL_EXTERNAL_MEM,
     NO_ARGUMENT_MEM = 1 << 4,
     NO_INACCESSIBLE_MEM = 1 << 5,
     NO_MALLOCED_MEM = 1 << 6,
     NO_UNKOWN_MEM = 1 << 7,

The problem with auto-upgrade is that the IR is not versioned. We will never know if a memory was generated by a recent or old frontend.

Having something upgradable makes sense I guess. In the future we may want to split the memory further, to allow for more inter-procedural information passing.
We’ve already noticed that inaccessible memory is not a good abstraction for LTO as what wasn’t accessible may become after linking. We may want something that explicitly names the globals that a function can or not access, for example.

That’s a more complicated design, where you could have arbitrary memory labels, from an individual global variable to all globals, or the whole memory. globals is just an alias to the set of globals present in the function’s file and in the files where all transitively functions are located. Once LTO is done that information is widened to all globals in all files.

LTO is fairly niche still, so I don’t know how relevant inter-procedural alias analysis is, though.

Auto-upgrade is a bitcode-only concept, it does not apply to textual IR. While bitcode itself is not versioned, such auto-upgrades are still easily supported: The way this is done in this context is to introduce a new bitcode-level attribute kind whenever the underlying representation changes, and then auto-upgrade old representations. We already commonly do this for other bitcode record kinds.

Possibly I’m misunderstanding your comment, but just to be clear: The location kinds supported in memory() is supposed to be a fixed set (at a given time), namely exactly those locations supported by FunctionModRefBehavior. We can drop the string and spell this memory(argmem: r) or memory(argmem: read) or memory(read: argmem). The only reason I used a string in the first place is that not using one requires adding a lot of new tokens to the lexer… probably a bad motivation :wink:

One thing I want to avoid is both changing the representation, and adding new “location kinds” at the same time. Once the representation change is done, we should be in a very good place for making extensions with relatively low effort. But getting there will probably be tricky enough as-is, without mixing in additional semantic changes.

+1 to this. I’ve wished for readonly/writeonly/readnone to be simplified to noread/nowrite for a long time, your proposal goes quite a bit beyond that but that’s great, too.

The one thing I’d point out is that upgrades become safer by default if what’s written in textual IR is expressed negatively, e.g. nowrite:TLS,argmem; noread:globals,cross-thread-registers, noaccess:inaccessible. If we identify a new kind of location later (as happened with the threadid), the upgrade process becomes smoother.

The tradeoff is that negations are often more awkward to reason about.

2 Likes

I think negative accumulation in IR makes a lot of sense. Dropping it is safe, adding new “kinds” is also safe if we make noread/nowrite the main “categories” that list kinds (, as you’ve shown).

That’s why we will have getters that are positive. People that actually read IR will get used to it :wink:

Fair, but we might still want to call it side-effect and only model memory for now. WDYT?

So, my general position here is that the primary purpose of textual IR is to write tests. In this context, if a new memory location kind is added, we actually don’t want it to be automatically added to existing memory() attributes in most cases. For example, a typical test scenario is memory(argmem: rw), for a function that should not clobber escaped pointers. If a new location kind is added, we do not want this to automatically turn into memory(argmem: rw, globalmem: rw).

For this reason, I expect that using a negative specification would actually require us to go through and update many tests if a new location is added (adding noglobalmem everywhere), while a positive specification will not.

The main case where we do want to new locations to be automatically included is cases that were not limited to specific locations in the first place. This is covered by the shortcuts of the form memory(r), which don’t list locations, and as such will automatically cover new locations if they are added.

I think the only somewhat problematic case is things like memory(argmem: rw, inaccessiblemem: r, other: r), where the actual intent it “reads anything, writes arguments”. In this case, we would want this to also read from newly added locations. I think this can be covered by allowing you to write memory(r, argmem: rw), where r is the default access mode for any locations not explicitly listed.

So basically, I think the memory effect specification in textual IR should mirror the intent of the author – and if it does, then newly added location kinds will likely be handled according to that intent. If only a fixed set of locations is specified, that fixed set will be preserved. If a default access mode is specified, then that default will also apply to new locations.

(We can preserve the intent only for test IR written by humans – for compiler-regenerated IR dumps it would come in some kind of canonical form.)

After thinking about this a bit more, I think what would work well for the canonical printed form is to pick the access mode of other as the default mode. For example, if we have memory(argmem: rw, inaccessiblemem: r, other: r) in fully written out form, then the default printing for this should be memory(r, argmem: rw), by picking other: r as the default access mode.

The rationale for this is that new memory location kinds would usually get split out of what is currently other, and this would automatically give them the right access mode. (threadid is the counter-example to this, in that it is not part of other right now. Though this probably still works well as a heuristic even for that case.)

One could even go a step further and not allow specifying other in textual IR at all, and only allow specifying it through the default mode.

I have updated the proposal text with some updates to the IR syntax, as well as an explicit discussion of what happens if new locations are added. I hope that mostly addresses the “auto-upgrade” concerns.

The overall direction looks like a great improvement!

As long as we are very clear what the implications of specifying other is that should be fine I think. That would probably also mean that in practice other will be set conservatively in most case.

FWIW, I do like this proposal of having a default access mode.

LGTM +1. Let me know if there is any thing I can help.

Then ⚙ D132352 Introduce noread_thread_id to address the thread identification problem in coroutines should be deprecated. And I am wondering if we can land ⚙ D127383 Don't treat readnone call in presplit coroutine as not access memory temporarily and revert it after we complete this? Since it looks like a long-term project and I am not sure if we can make it in Clang/LLVM16. It is really a pity that we didn’t fix the thread identification problem in clang15.

The patch you want to land temporarily doesn’t look like the “simplest” fix for the problem but rather a complex one that will not be necessary as this rolls out. I’d personally prefer a very narrow approach for “temporary fixes”, if any.

I sent ⚙ D135550 [Coroutines] Don't merging readnone calls in presplit coroutines. This is the narrowest approach that I can image. I wish we could land this otherwise we can only wait for unifying the memory effecting attributes…

I’ve put up the first series of patches for this change:

This adds support for the attribute without wiring it up to anything yet (and thus also not affecting any of the existing attributes yet).

Relative to the proposal text, I ended up picking a slightly more explicit syntax for the access kinds, so it’s memory(argmem: read) rather than memory(argmem: r).

1 Like

What’s the plan for things like readnone arguments? Will this also replace per-argument memory attributes?