Question about hoisting LoadInst in LICM pass using MemorySSA/AliasAnalysis

Hi All,

I have a question about hoisting LoadInst in LICM pass. I am looking at below IR code.

@a = dso_local local_unnamed_addr global i32** null, align 8

define dso_local void @foo(i32 %max) {

entry:

br label %for.cond

for.cond: ; preds = %for.body, %entry

%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]

%cmp.not = icmp sgt i32 %i.0, %max

br i1 %cmp.not, label %for.cond.cleanup, label %for.body

for.cond.cleanup: ; preds = %for.cond

ret void

for.body: ; preds = %for.cond

%0 = load i32**, i32*** @a, align 8, !tbaa !8

%idxprom = zext i32 %i.0 to i64

%arrayidx = getelementptr inbounds i32*, i32** %0, i64 %idxprom

store i32* null, i32** %arrayidx, align 8, !tbaa !8

%inc = add nuw nsw i32 %i.0, 1

br label %for.cond, !llvm.loop !12

}

I have expected the %0 = load is hoisted to entry block because I think it is loop invariant. However, LICM pass fails to hoist it because the MemorySSA is as below.

define dso_local void @foo(i32 %max) local_unnamed_addr #0 {

entry:

br label %for.cond

for.cond: ; preds = %for.body, %entry

; 2 = MemoryPhi({entry,liveOnEntry},{for.body,1})

%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]

%cmp.not = icmp sgt i32 %i.0, %max

br i1 %cmp.not, label %for.cond.cleanup, label %for.body

for.cond.cleanup: ; preds = %for.cond

ret void

for.body: ; preds = %for.cond

; MemoryUse(2) MayAlias

%0 = load i32**, i32*** @a, align 8, !tbaa !8

%idxprom = zext i32 %i.0 to i64

%arrayidx = getelementptr inbounds i32*, i32** %0, i64 %idxprom

; 1 = MemoryDef(2)

store i32* null, i32** %arrayidx, align 8, !tbaa !8

%inc = add nuw nsw i32 %i.0, 1

br label %for.cond, !llvm.loop !12

}

As we can see there is MemoryUse(2) MayAlias above the %0 = load because AliasAnalysis returns MayAlias between the ‘%0 = load’ and ‘store i32* null’. I think it could be MemoryUse(liveOnEntry). How do you think about it? If I missed something, please let me know.

Thanks,

JinGu Kang

Ping

Hey JinGu!

I don’t know the answer to this from a MemorySSA / LICM perspective (I’m not deeply aware of their internals) - but I do remember having to fix this exact kind of issue with our Burst HPC# compiler with our own custom alias analysis. >From our perspective it was safe to assume that any pointer deriving from @a couldn’t alias with it, and we could make this optimization (this was really important in LLVM 6/7 timeframe if I’m remembering it right because we had a lot of cases where these kind of things would disable vectorization).

Cheers,
-Neil.

Hello Neil-nim,

Thanks for kind reply.

I am fixing https://bugs.llvm.org/show_bug.cgi?id=52143. If we can fix the case with AliasAnalysis, it would be a fundamental solution. If possible, can I ask you how you fixed it like adding a pattern in BasicAA, running other existing AA or your own one and etc please? If Neil-nim pushes the solution to upstream, it will be best. :blush:

Cheers,

JinGu Kang

~WRD000.jpg

Hey JinGu-nim,

Now I’ve looked at that example - isn’t the problem that TBAA isn’t doing its job here? unsigned* should be distinct from unsigned** and TBAA should say that the pointers can’t alias?

We don’t have TBAA in Burst, so that’s why we had to roll our own alias analysis with our specific characteristics!

Cheers,
-Neil.

~WRD000.jpg

Neil-nim,

The load and store have same TBAA MDNode so the TBAA returns MayAlias. I am not sure we can solve this case on the TBAA…

As we can see on the IR example, the load and store have WAR dependency at the first iteration of the loop and there is no dependency at rest of iterations of the loop. I think it could be good to check this information somewhere to hoist the load.

Thanks

JinGu Kang

~WRD000.jpg

Hello Neil-nim,

I have thought about what you mentioned.

Now I’ve looked at that example - isn’t the problem that TBAA isn’t doing its job here? unsigned* should be distinct from unsigned** and TBAA should say that the pointers can’t alias?

I do not know the rule of TBAA well, but it seems the result of gcc’s alias analysis is different with llvm one. There is an example from my colleague @Momchil Velikov.

long **f(long **p, long ***q) {

*p = **q;

return *q;

}

It looks gcc generates different alias_set between long ** and long ***. It causes ssc value numbering pass removes a load from return q*. clang generates same TBAA MDNode for the store with the long ** and the load with long ***. If we add restrict keyword to the pointer type, there are different TBAA MDNodes and InstCombine pass removes the load instruction from return. I am not sure which alias result is correct between gcc and llvm… If someone knows about the rule of TBAA with this case or something like that, please let me know.

Thanks

JinGu Kang

~WRD000.jpg