Hi,
I've prepared patches for both LLVM and Clang to change the
datalayout for R600. This may seem like a bold move, but I think it is
warranted. R600/SI is a strange architecture in that it uses 64bit
pointers but does not support 64 bit arithmetic except for load/store
operations that roughly map onto getelementptr.
The current datalayout for r600 includes n32:64, which is odd
because r600 cannot actually do any 64bit arith natively. This causes
particular problems in the optimizer with the following kernel:
__kernel void if_eq(__global unsigned long* out, unsigned arg0)
{
int i=0;
for(i = 0; i < arg0; i++){
out[i] = i;
}
}
Clang decides that instead of adding a sext i32 %i to i64 before
getelementptr, it would be best to just go ahead and promote the
variable i to i64. Which would be all good if i64 was actually a
native type.
By changing the native types to n32 *only*. clang emits better code
for r600, such as
; Function Attrs: nounwind
define void @if_eq(i64 addrspace(1)* nocapture %out, i32 %arg0) #0 {
entry:
%cmp4 = icmp eq i32 %arg0, 0
br i1 %cmp4, label %for.end, label %for.body
for.body: ; preds = %for.body, %entry
%i.05 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
%conv = sext i32 %i.05 to i64
%arrayidx = getelementptr inbounds i64 addrspace(1)* %out, i64 %conv
store i64 %conv, i64 addrspace(1)* %arrayidx, align 8, !tbaa !2
%inc = add nsw i32 %i.05, 1
%exitcond = icmp eq i32 %inc, %arg0
br i1 %exitcond, label %for.end, label %for.body
for.end: ; preds = %for.body, %entry
ret void
}
Another upside to this is that i64 addition on r600 can even be
enabled as it the lowering code need only be called for arithmetic and
not pointers, for which it actually works. In the future, r600 will
require more patches wrt this issue as the old IR generated by clang
was perfectly valid but crashes llc.
For now, I think this patch is a good solution because it makes better
code and allows me to compile programs much longer than my previous 3
line record Not so much a bandaid as a something else that needed
to be done.
Regards,
Jon Pry
jonpry@gmail.com
0001-R600-Fix-pointer-arithmetic.txt (1.48 KB)
0001-R600-Change-datalayout.txt (2.84 KB)