Ok, replying anew now that I understand why reasoning about abstract locations for each object doesn’t work.
The general idea of describing a set of load and stores which belong to a particular invariant group seems reasonable. I’ve got some questions/comments on the specifics, but the overall direction seems entirely workable for the specific problem you’re trying to solve.
Quoting from the google doc: “If we don’t know definition of some function, we assume that it will not call @llvm.invariant.group.barrier().”
This part really really bugs me. We generally try to assume minimal knowledge of external functions (i.e. they can do anything) and this assumption would invert that. Is there a way we can rephrase the proposal which avoids the need for this? I’m not quite clear what this assumption buys us.
Is there a particular reason why a load or store must belong to a single invariant group rather than be a member of several? I don’t have an immediate use case in mind, but it seems potentially useful.
“i8* @llvm.invariant.group.barrier(i8*): Given a pointer, produces another pointer that aliases the first but which is considered different for the purposes of load !invariant.group metadata.”
The definition here minorly bugs me. This might just be a matter of wordsmithing, but it seems strange to me that this can’t be defined in terms of the assumptions allowed about the memory through the two pointers. If I’m looking at this instruction in memory dependence analysis, what am I allowed to assume, not assume? The current definition doesn’t make this obvious. One option: “produces another pointer which aliases the first, but where any invariant assumptions introduced by invariant.group metadata have been striped away.”
The notion of using the assume seems to make sense. I could see an argument for extending the invariant.group metadata with a way to express the assumed value, but we could also make that extension at a later time if needed.
I’m wondering if there’s a problematic interaction with CSE here. Consider this example is pseudo LLVM IR:
v1 = load i64, %p, !invariant.group !Type1
; I called destructor/placement new for the same type, but that optimized entirely away
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
store i64 0, %p2, !invariant.group !Type1
v2 = load i64, %p2, !invariant.group !Type1
ret i64 v1 - v2
(Assume that !Type is used to describe a write once integer field within some class. Not all instances have the same integer value.)
Having CSE turn this into:
v1 = load i64, %p, !invariant.group !Type1
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
store i64 0, %p1, !invariant.group !Type1
v2 = load i64, %p1, !invariant.group !Type1
ret i64 v1 - v2
And then GVN turn this into:
v1 = load i64, %p, !invariant.group !Type1
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
ret i64 v1 - v1 (-> 0)
This doesn’t seem like the result I’d expect. Is there something about my initial IR which is wrong/invalid in some way? Is the invariant.group required to be specific to a single bitpattern across all usages within a function/module/context? That would be reasonable, but I don’t think is explicit said right now. It also makes !invariant.group effectively useless for describing constant fields which are constant per instance rather than per-class.
Philip