# Question about a May-alias case

Hello All,

I have a question about a May-alias case. Let's look at one simple C example.

char *buf[4];
char c;
void test(int idx) {
char *a = buf[3 - idx];
char *b = buf[idx];
*a = *b;
c++;
*a = *b;
}

We can imagine the second "*a = *b" could be removed. Let's look at the IR snippet with -O3 for above example.

1 define void @test(i32 %idx) {
2 entry:
3 %sub = sub nsw i32 3, %idx
4 %idxprom = sext i32 %sub to i64
5 %arrayidx = getelementptr inbounds [4 x i8*], [4 x i8*]* @buf, i64 0, i64 %idxprom
6 %0 = load i8*, i8** %arrayidx, align 8
7 %idxprom1 = sext i32 %idx to i64
8 %arrayidx2 = getelementptr inbounds [4 x i8*], [4 x i8*]* @buf, i64 0, i64 %idxprom1
9 %1 = load i8*, i8** %arrayidx2, align 8
10 %2 = load i8, i8* %1, align 1
11 store i8 %2, i8* %0, align 1
12 %3 = load i8, i8* @c, align 1
13 %inc = add nsw i8 %3, 1
14 store i8 %inc, i8* @c, align 1
15 %4 = load i8, i8* %1, align 1
16 store i8 %4, i8* %0, align 1
17 ret void
18 }

You can see the '%2' and '%4' are same but the %4 is not removed because of May-alias. Let's look at the result of AliasSet.

Alias sets for function 'test':
Alias Set Tracker: 2 alias sets for 5 pointer values.
AliasSet[0x4e72580, 5] may alias, Mod/Ref Pointers: (i8** %arrayidx, 8), (i8** %arrayidx2, 8), (i8* %1, 1), (i8* %0, 1), (i8* @c, 1)
AliasSet[0x4e76ef0, 1] must alias, Ref forwarding to 0x4e72580

You can see May-alias with %1 and @c. It seems if pointer operand points double pointer or more level pointer, alias analysis can not go through it. There are some options or ways to avoid May-alias for above case? It is not easy to find them. If I missed something, please let me know.

Thanks,

JinGu Kang

In general, the second store can't be removed; one or more of the pointers in buf could point to c. Not sure what you're trying to show with this example.

(On a mostly unrelated note, I think you can remove the first store to *a in your testcase, but that doesn't really have anything to do with alias analysis.)

-Eli

Hi Eli,

Thanks for good comment! I missed to initalize the buf.

Let's slightly change the example as below.

char subbuf1[2];
char subbuf2[2];
char subbuf3[2];
char subbuf4[2];
char *buf[4] = {subbuf1, subbuf2, subbuf3, subbuf4};
char c;
void test(int idx) {
char *a = buf[3 - idx];
char *b = buf[idx];
*a = *b;
c++;
*a = *b;
}

I think we can say the 'buf' does not point 'c'.

the IR snippet with '-O3' is

@subbuf1 = common global [2 x i8] zeroinitializer, align 1
@subbuf2 = common global [2 x i8] zeroinitializer, align 1
@subbuf3 = common global [2 x i8] zeroinitializer, align 1
@subbuf4 = common global [2 x i8] zeroinitializer, align 1
@buf = local_unnamed_addr global [4 x i8*] [i8* getelementptr inbounds ([2 x i8], [2 x i8]* @subbuf1, i32 0, i32 0), i8* getelementptr inbounds ([2 x i8], [2 x i8]* @subbuf2, i32 0, i32 0), i8* getelementptr inbounds ([2 x i8], [2 x i8]* @subbuf3, i32 0, i32 0), i8* getelementptr inbounds ([2 x i8], [2 x i8]* @subbuf4, i32 0, i32 0)], align 16
@c = common local_unnamed_addr global i8 0, align 1

; Function Attrs: norecurse nounwind uwtable
define void @test(i32 %idx) local_unnamed_addr #0 {
entry:
%sub = sub nsw i32 3, %idx
%idxprom = sext i32 %sub to i64
%arrayidx = getelementptr inbounds [4 x i8*], [4 x i8*]* @buf, i64 0, i64 %idxprom
%0 = load i8*, i8** %arrayidx, align 8, !tbaa !1
%idxprom1 = sext i32 %idx to i64
%arrayidx2 = getelementptr inbounds [4 x i8*], [4 x i8*]* @buf, i64 0, i64 %idxprom1
%1 = load i8*, i8** %arrayidx2, align 8, !tbaa !1
%2 = load i8, i8* %1, align 1, !tbaa !5
store i8 %2, i8* %0, align 1, !tbaa !5
%3 = load i8, i8* @c, align 1, !tbaa !5
%inc = add nsw i8 %3, 1
store i8 %inc, i8* @c, align 1, !tbaa !5
%4 = load i8, i8* %1, align 1, !tbaa !5
store i8 %4, i8* %0, align 1, !tbaa !5
ret void
}

The AliasSet is

Alias sets for function 'test':
Alias Set Tracker: 2 alias sets for 5 pointer values.
AliasSet[0x5c85330, 5] may alias, Mod/Ref Pointers: (i8** %arrayidx, 8), (i8** %arrayidx2, 8), (i8* %1, 1), (i8* %0, 1), (i8* @c, 1)
AliasSet[0x5c84ab0, 1] must alias, Ref forwarding to 0x5c85330

How do you think about it? If I missed something, please let me know.

Thanks,
JinGu Kang

That doesn't help... the compiler still can't prove whether some other translation unit modifies buf.

If you declare buf as `char *const buf[4] = {subbuf1, subbuf2, subbuf3, subbuf4};`, then I guess you could prove that the pointers in buf don't point to c. But that's a rare pattern in practice, and it would be kind of expensive to analyze in BasicAA. Maybe if we add stateful AA to LLVM eventually.

-Eli

um… you are right… Thank you very much for good comment!