Question about Alias Analysis with restrict keyword

Hello All,

I have met a case with restrict keyword and I have a question about it. Let's look at a simple example.

char buf[4];

void test(char *restrict a, char *restrict b, int count) {
for (unsigned i = 0; i < count; i++) {
*a = *b;
a++;
b++;
buf[i] = i;
}
}

I think there are no aliasing among pointers such as 'a', 'b' and 'buf' because of the restrict keyword.

The IR snippet with '-O1' is as following:

@buf = common local_unnamed_addr global [4 x i8] zeroinitializer, align 1

; Function Attrs: norecurse nounwind uwtable
define void @test(i8* noalias nocapture, i8* noalias nocapture readonly, i32) local_unnamed_addr #0 {
%4 = icmp eq i32 %2, 0
br i1 %4, label %7, label %5

; <label>:5: ; preds = %3
%6 = zext i32 %2 to i64
br label %8

; <label>:7: ; preds = %8, %3
ret void

; <label>:8: ; preds = %8, %5
%9 = phi i64 [ 0, %5 ], [ %17, %8 ]
%10 = phi i8* [ %0, %5 ], [ %13, %8 ]
%11 = phi i8* [ %1, %5 ], [ %14, %8 ]
%12 = load i8, i8* %11, align 1, !tbaa !2
store i8 %12, i8* %10, align 1, !tbaa !2
%13 = getelementptr inbounds i8, i8* %10, i64 1
%14 = getelementptr inbounds i8, i8* %11, i64 1
%15 = trunc i64 %9 to i8
%16 = getelementptr inbounds [4 x i8], [4 x i8]* @buf, i64 0, i64 %9
store i8 %15, i8* %16, align 1, !tbaa !2
%17 = add nuw nsw i64 %9, 1
%18 = icmp eq i64 %17, %6
br i1 %18, label %7, label %8
}

As you can see, there are noalias attribute with arguments and we can expect the alias analysis will consider them.

The Alias Set is as below.

Alias sets for function 'test':
Alias Set Tracker: 2 alias sets for 3 pointer values.
AliasSet[0x4a1e3a0, 3] may alias, Mod/Ref Pointers: (i8* %11, 1), (i8* %10, 1), (i8* %16, 1)
AliasSet[0x4a1e440, 1] must alias, Mod forwarding to 0x4a1e3a0

Alias analysis returns 'May- alias' for above code. When I look at the alias analysis code, below code causes may-alias.

On BasicAAResult::aliasGEP()

1297 // Check to see if these two pointers are related by the getelementptr
1298 // instruction. If one pointer is a GEP with a non-zero index of the other
1299 // pointer, we know they cannot alias.
1300
1301 // If both accesses are unknown size, we can't do anything useful here.
1302 if (V1Size == MemoryLocation::UnknownSize &&
1303 V2Size == MemoryLocation::UnknownSize)
1304 return MayAlias;
1305
1306 AliasResult R = aliasCheck(UnderlyingV1, MemoryLocation::UnknownSize,
1307 AAMDNodes(), V2, MemoryLocation::UnknownSize,
1308 V2AAInfo, nullptr, UnderlyingV2);

On line 1306, we can see the V1Size and V2Size are set up with "MemoryLocation::UnknownSize" and it causes may-alias. On line 1302 and 1033, there are checking code for "MemoryLocation::UnknownSize"... I am not sure whether it is bug or not... Can someone let me know why it passes "MemoryLocation::UnknownSize" please? If I missed something, please let me know.

Thanks,

JinGu Kang

On BasicAAResult::aliasGEP()

1297 // Check to see if these two pointers are related by the getelementptr
1298 // instruction. If one pointer is a GEP with a non-zero index of the other
1299 // pointer, we know they cannot alias.
1300
1301 // If both accesses are unknown size, we can't do anything useful here.
1302 if (V1Size == MemoryLocation::UnknownSize &&
1303 V2Size == MemoryLocation::UnknownSize)
1304 return MayAlias;

The check for UnknownSize is here because it's impossible to disambiguate the offsets of two pointers which both have unknown size, so the rest of BasicAAResult::aliasGEP can't do anything useful.

1305
1306 AliasResult R = aliasCheck(UnderlyingV1, MemoryLocation::UnknownSize,
1307 AAMDNodes(), V2, MemoryLocation::UnknownSize,
1308 V2AAInfo, nullptr, UnderlyingV2);

On line 1306, we can see the V1Size and V2Size are set up with "MemoryLocation::UnknownSize" and it causes may-alias. On line 1302 and 1033, there are checking code for "MemoryLocation::UnknownSize"... I am not sure whether it is bug or not... Can someone let me know why it passes "MemoryLocation::UnknownSize" please? If I missed something, please let me know.

UnderlyingV1 is the base of the GEP. So this is asking whether the base of the GEP aliases V2. It uses UnknownSize because the offset is unknown; it can't tell which bytes are accessed, so it's using an approximation instead.

We could theoretically try to be a bit more clever here if the size is known and the GEP offset is constant: we could instead pass V1Size+GEP1BaseOffset. Not sure that would have much effect in practice, though.

-Eli

Hi Eli,

I appreciate your good comments. It is very helpful.

Thanks again,

JinGu Kang

Ah, the recursive phi and gep caused the may-alias... Eli made my brain re-work. Thank you very much! To be sure, If we add "-basicaa-recphi" option, the example does not generate may-alias with restrict keyword.