I am looking at bug https://bugs.llvm.org/show_bug.cgi?id=42733 and I don't understand how
MemoryDependenceAnalysis is supposed to work together with GVN.
Could someone clarify few questions?
1. Function comment of MemoryDependenceAnalysis::getNonLocalCallDependency() says that
it returns "...the set of instructions that either define or clobber the value".
I read it as it can return incomplete set of dependencies (excluding NonLocal ones).
How then it can be used by GVN, which needs full set of dependencies to work correctly?
More generally, what is the purpose of such interface at all? Am I missing something here?
2. With empty cache, MemDepAnalysis in fact adds Invalid (NonLocal) Dependency to the
result if PHITransAddr cannot fully translate it, but after processing of other blocks
caused entry for address to appear in cache, it will be ignored being NonLocal and not reported.
This is confusing, as I do not expect for cache to affect result?
3. What is the _exact_ purpose of 'BBSkipFirstBlockPair Pair' field of NonLocalPinterInfo struct?
Looks like it serves to record entry BB of address query, but I see that it can change until
cache entry is not empty. Is it intended behavior?
4. PHITransAddr can translate GEP to another GEP which resides in a block from which original
query block is not reachable. In the example below, it can translate GEP %tmp28 from bb26 to
GEP %tmp14 from bb12, which is return block! Is it bug? If not, what's the purpose of it?
(MustDominate flag is set to false here - this is how GVN calls into MemDepAnalysis)
A testcase from PR42733 looks like this:
bb8: ; preds = %bb6
%tmp9 = load atomic i64 addrspace(1)*, i64 addrspace(1)** %arg unordered, align 8
br i1 %arg2, label %bb11, label %bb10
bb10: ; preds = %bb8
br label %bb15
bb11: ; preds = %bb8
br i1 %arg3, label %bb12, label %bb18
bb12: ; preds = %bb11
%tmp13 = phi i64 addrspace(1)* [ %tmp9, %bb11 ] <<< Note this PHI!
%tmp14 = getelementptr inbounds i64, i64 addrspace(1)* %tmp13, i64 8
store atomic i64 1, i64 addrspace(1)* %tmp14 unordered, align 8
ret i64 0
bb15: ; preds = %bb26, %bb10
%tmp16 = phi i64 addrspace(1)* [ %tmp9, %bb10 ], [ %tmp27, %bb26 ]
%tmp17 = phi i64 [ %tmp7, %bb10 ], [ 0, %bb26 ]
switch i32 %arg4, label %bb19 [
i32 0, label %bb26
i32 1, label %bb23
bb18: ; preds = %bb11
br label %bb19
bb19: ; preds = %bb18, %bb15, %bb6
%tmp20 = phi i64 addrspace(1)* [ %tmp16, %bb15 ], [ null, %bb6 ], [ %tmp9, %bb18 ]
%tmp21 = getelementptr inbounds i64, i64 addrspace(1)* %tmp20, i64 8
%tmp22 = load atomic i64, i64 addrspace(1)* %tmp21 unordered, align 8, !tbaa !0
br label %bb6
bb23: ; preds = %bb15
%tmp24 = getelementptr inbounds i64, i64 addrspace(1)* %tmp16, i64 8
%tmp25 = load atomic i64, i64 addrspace(1)* %tmp24 unordered, align 8
call void @baz(i64 %tmp25) #0
ret i64 0
bb26: ; preds = %bb15
call void @bar()
%tmp27 = load atomic i64 addrspace(1)*, i64 addrspace(1)** %arg unordered, align 8
%tmp28 = getelementptr inbounds i64, i64 addrspace(1)* %tmp27, i64 8
%tmp29 = load atomic i64, i64 addrspace(1)* %tmp28 unordered, align 8
%tmp30 = getelementptr inbounds i64, i64 addrspace(1)* %tmp27, i64 40
store atomic i64 %tmp29, i64 addrspace(1)* %tmp30 unordered, align 4
br label %bb15
GVN performs incorrect load replacement for %tmp25 at bb23.
On first GVN iteration, we start querying MD for %tmp25.
It takes GEP %tmp24 and walks CFG up to bb15 and from there it walks to bb10 trying to translate %tmp9
(branch to bb6 with query for %tmp27 resolve successfully, so I won't mention it further)
Here it stops, because PHI node %tmp13 at bb12 prevents PHITransAddr from finding node %tmp14 (it is not direct user of
%tmp9). So PHITransAddr returns null, and getNonLocalPointerDepFromBB() adds invalid/nonlocal MemDepResultEntry to the
result vector. (Finally we have two deps reported on first iteration: %tmp28 from bb6 and invalid entry from bb10)
Later on, redundant PHI %tmp13 = phi i64 addrspace(1)* [ %tmp9, %bb11 ] is deleted and during processing of block
bb19, GEP %tmp14 gets into cache as NonLocal entry with starting block of bb10.
(Note, %tmp14 _cannot_ reach bb10!)
Then, since on first iteration we have changed something, GVN proceeds to second iteration.
Process repeats, but now translation of PHI %tmp16 translates to %tmp14 and founds valid cache
entry for it (it now is direct user of %tmp9).
But since it is NonLocal entry, it is simply ignored and not reported back to GVN as memory dependency...
Any idea how to properly fix this?
I have seen a review in which all uses of invalidated cache entry are invalidated as well.
While it looks like it will fix my problem, it does not sounds right in this case...