The LLVA and LLVM papers motivate the GetElementPtr instruction by arguing that it abstracts implementation details, in particular pointer size, from the compiler. While it does this fine for pointer addresses, it does not manage it for address offsets. Consider the following code:
$ cat test.c
int main() {
int *x[2];
int **y = &x[1];
return (y - x);
}
$ llvm-gcc -O3 -c test.c -emit-llvm -o - | llvm-dis
; ModuleID = ''
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i686-pc-linux-gnu"
define i32 @main() nounwind {
entry:
%x = alloca [2 x i32*] ; <[2 x i32*]*> [#uses=2]
%tmp1 = getelementptr [2 x i32*]* %x, i32 0, i32 1 ; <i32**> [#uses=1]
%tmp23 = ptrtoint i32** %tmp1 to i32 ; <i32> [#uses=1]
%x45 = ptrtoint [2 x i32*]* %x to i32 ; <i32> [#uses=1]
%tmp6 = sub i32 %tmp23, %x45 ; <i32> [#uses=1]
%tmp7 = ashr i32 %tmp6, 2 ; <i32> [#uses=1]
ret i32 %tmp7
}
The return value is 1. The ashr exposes the pointer size: it shifts the 4-byte distance right by 2, which is a division by the 4-byte pointer size.
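As a rough illustration in C (not taken from the LLVM sources), the sketch below computes the same distance three ways: with a hard-coded shift like the ashr above, with a division by sizeof, and with plain pointer subtraction. The function names are made up for this example, and it assumes that converting a pointer to intptr_t yields its byte address, which holds on common flat-memory targets but is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the IR above: subtract the integer addresses, then shift by 2.
   Only correct when pointers are 4 bytes wide (assumption baked in). */
ptrdiff_t distance_via_fixed_shift(int **base, int **elem) {
    intptr_t byte_diff = (intptr_t)elem - (intptr_t)base;
    return (ptrdiff_t)(byte_diff >> 2);
}

/* Same subtraction, but the element size comes from sizeof, so the
   source does not hard-code the pointer width. */
ptrdiff_t distance_via_sizeof(int **base, int **elem) {
    intptr_t byte_diff = (intptr_t)elem - (intptr_t)base;
    return (ptrdiff_t)(byte_diff / (intptr_t)sizeof(int *));
}

/* What the C source in test.c actually says: typed pointer subtraction. */
ptrdiff_t distance_via_pointers(int **base, int **elem) {
    return elem - base;
}

/* Hypothetical driver reproducing test.c; returns the element distance. */
ptrdiff_t demo_distance(void) {
    int *x[2] = {0, 0};
    int **y = &x[1];
    assert(distance_via_sizeof(x, y) == distance_via_pointers(x, y));
    return distance_via_sizeof(x, y);
}
```

On a target with 8-byte pointers, distance_via_fixed_shift would return 2 for the same input, which is the kind of pointer-size leak being described.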
For the analysis that I am doing, it would be nice to have an instruction that explicitly performs this distance calculation in a type-safe manner, irrespective of pointer size. Something like this:
define i32 @main() nounwind {
entry:
%x = alloca [2 x i32*] ; <[2 x i32*]*> [#uses=2]
%tmp1 = getelementptr [2 x i32*]* %x, i32 0, i32 1 ; <i32**> [#uses=1]
%tmp2 = getdistance i32** %x, %tmp1 ; <i32> [#uses=1]
ret i32 %tmp2
}
I'm not really a compiler person, so I'm wondering whether a need for such an instruction ever arises in more compiler-oriented situations, such as optimization or portability. Does the fact that pointer size is never completely hidden ever cause problems? Is GetElementPtr generally "good enough"? It would be nice to have a complete solution though, wouldn't it? Thoughts?
Marc
I didn’t realize this before, but perhaps the fact that llvm-gcc was unable to optimize out the offset calculation at -O3 is sufficient evidence for supporting such an instruction. 
Marc
define i32 @main() nounwind {
entry:
%x = alloca [2 x i32*] ; <[2 x i32*]*> [#uses=2]
%tmp1 = getelementptr [2 x i32*]* %x, i32 0, i32 1 ; <i32**> [#uses=1]
%tmp23 = ptrtoint i32** %tmp1 to i32 ; <i32> [#uses=1]
%x45 = ptrtoint [2 x i32*]* %x to i32 ; <i32> [#uses=1]
%tmp6 = sub i32 %tmp23, %x45 ; <i32> [#uses=1]
%size = getelementptr i32** null, i32 1 ; <i32**> [#uses=1]
%sizeI = ptrtoint i32** %size to i32 ; <i32> [#uses=1]
%tmp7 = sdiv i32 %tmp6, %sizeI ; <i32> [#uses=1]
ret i32 %tmp7
}
There, pointer size independent. The problem you see arises because you
are using a frontend targeting a specific platform, so the pointer size is
known (see the target datalayout line).
Andrew
I should say that a nice instruction for type safety would be
%type x = getcontainerptr %pointer, %type, gep indexes
where %x is the beginning of the structure/array of which %pointer is
the member at the offset that would be calculated by a gep.
%x = alloca [2 x i32*]
%tmp1 = getelementptr [2 x i32*]* %x, i32 0, i32 1
%tmp2 = getcontainerptr i32** %tmp1, [2 x i32*]*, i32 0, i32 1
Then %x == %tmp2.
Such an instruction would let you backtrack in a structure without
casts and pointer arithmetic.
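For comparison, the closest thing in C source is the familiar container-of idiom, which needs exactly the casts and pointer arithmetic mentioned above. This is only a sketch: CONTAINER_OF, struct pair, and the helper functions are names invented for this example, not part of LLVM or of the proposed instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical aggregate used only for this illustration. */
struct pair {
    int first;
    int second;
};

/* Step back from a member pointer to the enclosing object by subtracting
   the member's byte offset. The cast through char * is the untyped
   arithmetic a getcontainerptr-style instruction would hide. */
#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the struct from a pointer to its 'second' field. */
struct pair *pair_from_second(int *second_ptr) {
    return CONTAINER_OF(second_ptr, struct pair, second);
}

/* Array analogue: recover the base of the array from element 'index'. */
int **array_base_from_element(int **elem, size_t index) {
    return elem - index;
}

/* Returns 1 when both round trips land back on the original object. */
int demo_container(void) {
    struct pair p = {10, 20};
    int *x[2] = {0, 0};
    return pair_from_second(&p.second) == &p
        && array_base_from_element(&x[1], 1) == x;
}
```

The array case corresponds to the [2 x i32*] example: starting from the address of element 1 and undoing the index yields the address of x.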
Andrew
> The LLVA and LLVM papers motivate the GetElementPtr instruction by arguing
> that it abstracts implementation details, in particular pointer size, from
> the compiler. While it does this fine for pointer addresses, it does not
> manage it for address offsets. Consider the following code:
> $ cat test.c
> int main() {
> int *x[2];
> int **y = &x[1];
> return (y - x);
> }
> The return value is 1. The ashr exposes the pointer size by shifting the 4
> byte distance over by 2.
Right. A related issue is:
http://llvm.org/bugs/show_bug.cgi?id=2247
> I didn't realize this before, but perhaps the fact that llvm-gcc was unable to optimize out the offset calculation at -O3 is sufficient evidence for supporting such an instruction.
Sure it does:
$ llvm-gcc t.c -S -o - -O3 -fomit-frame-pointer
_main:
subl $8, %esp
movl $1, %eax
addl $8, %esp
ret
I agree that optimizing it before codegen time would be preferable, but adding a new instruction (by itself) doesn't handle this. It would be easy to add this to the current optimizer if we cared *shrug*.
-Chris