I’ve looked around a bit, and with a minor change to MemoryDependenceAnalysis I can enable the following transformation, which is not possible without MMRAs.
define i32 @test_fenced(ptr %in, ptr %out) {
; CHECK-LABEL: define i32 @test_fenced(
; CHECK-SAME: ptr nocapture readonly [[IN:%.*]], ptr nocapture writeonly [[OUT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
; CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[IN]], align 4, !mmra !0
; CHECK-NEXT: store i32 [[TMP1]], ptr [[OUT]], align 4
; CHECK-NEXT: fence acq_rel, !mmra !1
; CHECK-NEXT: ret i32 [[TMP1]]
;
%1 = load i32, ptr %in
store i32 %1, ptr %out
fence acq_rel, !mmra !{!"vulkan", !"nonprivate"}
%y = load i32, ptr %in, !mmra !{!"vulkan", !"private"}
ret i32 %y
}
The idea is that the fence isn’t considered a dependence of the load %y from %in, because %y and the fence have incompatible MMRAs, so we can safely perform that load earlier (and eventually merge it with %1). I think this is safe, but more experimentation is needed.
This also highlights another important point in MMRA’s design: dropping metadata cannot affect correctness, as an empty set is always compatible with any set of tags.
Hence, if the MMRA metadata is dropped from any instruction here, the worst that can happen is that compatibility is restored and the optimization is inhibited; the code stays correct either way.
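To make the compatibility argument concrete, here is a small Python model of the check. It is only a sketch, not the actual LLVM implementation: it assumes each tag is a (prefix, suffix) pair and that two sets are compatible unless they share a prefix without sharing any suffix for it. The function name `compatible` and the variable names are hypothetical.

```python
def compatible(a, b):
    """Return True if two sets of (prefix, suffix) tags are compatible.

    Assumed rule: for every prefix that appears in both sets, the sets
    must have at least one suffix in common for that prefix.
    """
    prefixes = {p for p, _ in a} | {p for p, _ in b}
    for prefix in prefixes:
        suffixes_a = {s for p, s in a if p == prefix}
        suffixes_b = {s for p, s in b if p == prefix}
        # A prefix missing from one side imposes no constraint.
        if not suffixes_a or not suffixes_b:
            continue
        if not (suffixes_a & suffixes_b):
            return False
    return True


# Tags taken from the test case above.
fence_tags = {("vulkan", "nonprivate")}
load_y_tags = {("vulkan", "private")}
dropped = set()  # metadata was dropped from the instruction

print(compatible(fence_tags, load_y_tags))  # fence vs. %y
print(compatible(fence_tags, dropped))      # fence vs. an untagged load
```

Under this model the fence and %y are incompatible, which is what lets the load move above the fence. Once either side loses its tags, the pair becomes compatible again, so the fence orders the load as it would without MMRAs.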