Hi all,
We somewhat recently created/updated element-wise unordered-atomic versions of the memcpy, memmove, and memset memory intrinsics:
Memcpy: https://reviews.llvm.org/rL305558
Memmove: https://reviews.llvm.org/rL307796
Memset: https://reviews.llvm.org/rL307854
These intrinsics are semantically similar to the regular versions. The main difference is that the underlying operation is performed entirely with unordered atomic loads & stores of a given size.
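To make the "element-wise" part concrete, here is a minimal C++ sketch of what an element-atomic memcpy does conceptually. It is an illustration only, not how LLVM lowers the intrinsic. The helper name `element_atomic_memcpy` is hypothetical, and the sketch assumes the length is given in bytes and is a multiple of the element size. C++ has no ordering that corresponds to LLVM's unordered, so it uses relaxed, which is stronger than unordered.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not an LLVM API. Copies len_bytes bytes from src to
// dst one element at a time. Every element is read with one atomic load and
// written with one atomic store, so a concurrent observer never sees a torn
// element. No ordering is imposed between different elements.
template <typename Elem>
void element_atomic_memcpy(std::atomic<Elem> *dst,
                           const std::atomic<Elem> *src,
                           std::size_t len_bytes) {
  // Assumption for this sketch: the byte length is a whole number of elements.
  assert(len_bytes % sizeof(Elem) == 0 &&
         "length must be a multiple of the element size");
  const std::size_t count = len_bytes / sizeof(Elem);
  for (std::size_t i = 0; i < count; ++i) {
    // Relaxed is the weakest ordering C++ offers; LLVM's unordered is weaker.
    Elem value = src[i].load(std::memory_order_relaxed);
    dst[i].store(value, std::memory_order_relaxed);
  }
}
```

A regular memcpy, by contrast, is free to copy with whatever access widths it likes, which is exactly what these intrinsics rule out.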
Our ultimate purpose is to enable exploitation of the memory intrinsic idioms and optimizations in languages like Java that make extensive use of unordered-atomic loads & stores. To this end, we will eventually have to add these new intrinsics to the instruction class hierarchy, and teach passes how to deal with them; talking about how to do this is the purpose of this RFC.
We have started adding canary tests to some passes, and will continue to do so in preparation for adding the element atomic intrinsics to the instruction class hierarchy. These canary tests are simply placeholders that show that the pass in question currently does nothing with the new element atomic memory intrinsics. They should generally start failing once the unordered-atomic memory intrinsics are added to the instruction class hierarchy, which tells us which passes need to be updated. For example: https://reviews.llvm.org/rL308247
For adding the new intrinsics to the instruction class hierarchy, the plan is to add them one at a time: add the intrinsic to the hierarchy, and flush out all problems before adding the next. We’re thinking that it would be best to start with memset, then memcpy, then memmove. Memset is kind of its own thing, and there are some passes that change memcpy/memmove into a memset, so it would be beneficial to have memset in place before memcpy/memmove. There are also passes that change memmove into memcpy, but I haven’t found any that go the other way (I can’t imagine why you’d want to), so it would be good to have memcpy squared away before doing memmove.
There are a few options that we have thought of for adding the new memory intrinsics to the instruction class hierarchy. We have a preference for the first option, but would appreciate thoughts/opinions/discussion on the best way to insert the element atomic intrinsics into the hierarchy. For background, the relevant portion of the current hierarchy looks like:
MemIntrinsic
  - MemSetInst
  - MemTransferInst
    - MemCpyInst
    - MemMoveInst
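For readers less familiar with how passes query this hierarchy, the relationships above can be sketched as a small self-contained model. This is a simplified illustration, not LLVM's actual code: the `Kind` enum, the `*Model` structs, and `isaModel` are all hypothetical stand-ins for the real classes and their `classof`-based `isa<>` checks.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an intrinsic ID; not LLVM's real enum.
enum class Kind { MemSet, MemCpy, MemMove, Other };

// Leaf classes: each one matches exactly one kind of intrinsic.
struct MemSetModel {
  static constexpr bool classof(Kind K) { return K == Kind::MemSet; }
};
struct MemCpyModel {
  static constexpr bool classof(Kind K) { return K == Kind::MemCpy; }
};
struct MemMoveModel {
  static constexpr bool classof(Kind K) { return K == Kind::MemMove; }
};

// MemTransferInst covers memcpy and memmove, but not memset.
struct MemTransferModel {
  static constexpr bool classof(Kind K) {
    return MemCpyModel::classof(K) || MemMoveModel::classof(K);
  }
};

// MemIntrinsic is the root and covers all three.
struct MemIntrinsicModel {
  static constexpr bool classof(Kind K) {
    return MemSetModel::classof(K) || MemTransferModel::classof(K);
  }
};

// Hypothetical analogue of an isa<> query on the model classes.
template <typename T> constexpr bool isaModel(Kind K) { return T::classof(K); }

static_assert(isaModel<MemTransferModel>(Kind::MemMove),
              "memmove is a transfer intrinsic");
static_assert(!isaModel<MemTransferModel>(Kind::MemSet),
              "memset is not a transfer intrinsic");
```

In this model, a pass that asks "is this a MemIntrinsic?" matches any kind the root's `classof` accepts. That is why widening the hierarchy changes what existing passes see, and why the canary tests are expected to start failing at that point.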
Option 1)