GEP index type

Hi,

I have a question regarding the LLVM IR that clang generates. When I compile the code below on a 64-bit system, clang generates a GEP instruction with a 64-bit index.

int demo(int *A, short i) { return A[i]; }

becomes:

%4 = sext i16 %3 to i64
%6 = getelementptr inbounds i32* %5, i64 %4

I would expect to see an i16 index, since the index is known to fit in 16 bits. Why does clang widen the index to a 64-bit value?

Thanks,
Nadav

Primarily because this exposes the sign extension to the optimizer,
which is generally good; IIRC, instcombine does the same thing.

Also, in some cases we have to do the extension explicitly to preserve
the IR semantics: for example, suppose "i" was declared as an unsigned
short.
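
To make the unsigned case concrete, here is a minimal C sketch. It assumes
that GEP treats its indices as signed values, which is why a narrow unsigned
index has to be zero-extended before it is used. The function names are
hypothetical and exist only for this illustration; they are not part of clang
or LLVM.

```c
#include <stdint.h>

/* Widening of a signed 16-bit subscript: sign extension (sext in the IR). */
int64_t widen_signed(short i) {
    return (int64_t)i;
}

/* Widening of an unsigned 16-bit subscript: zero extension (zext). */
int64_t widen_unsigned(unsigned short i) {
    return (int64_t)i;
}

/* Hypothetical: what the same 16 bits would mean if they were passed on
 * without an explicit zext and then interpreted as a signed index. */
int64_t misread_as_signed(unsigned short i) {
    return i >= 0x8000 ? (int64_t)i - 0x10000 : (int64_t)i;
}
```

With i = 0xFFFF, the zero-extended index is 65535, while the same bits read
as a signed value give -1, which would address the element before A[0].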

-Eli

Eli,

I understand the need to widen unsigned types. However, I ran into a problem with the way clang currently lowers subscripts to GEPs.

AVX2 gather instructions rely on a 64-bit base pointer and a vector of 32-bit indices. Usually, when vectorizing programs, it is possible to detect that the GEP base pointer is uniform and that the index is variant (and needs to be vectorized). This works really well for 32-bit programs, but on 64-bit systems clang drops the information that the index is a 32-bit value. If I had this information, I would be able to use the AVX2 instruction in many cases.
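
As a rough illustration of the access pattern described above, here is a
scalar C model of a gather: one uniform base pointer plus eight 32-bit
indices. It is only a sketch. The name gather_i32_model and the LANES
constant are made up for this example, and the model assumes the indices are
signed and are widened to pointer width per lane; it does not use the actual
AVX2 instruction or intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LANES = 8 };  /* eight 32-bit lanes, as in a 256-bit vector */

/* Scalar model of a gather: every lane loads base[idx[lane]].
 * The base pointer is uniform; only the 32-bit index varies per lane. */
void gather_i32_model(const int *base, const int32_t idx[LANES],
                      int out[LANES]) {
    assert(base != NULL);
    for (size_t lane = 0; lane < LANES; ++lane) {
        /* Assumption: each index is sign-extended to pointer width. */
        ptrdiff_t offset = (ptrdiff_t)idx[lane];
        out[lane] = base[offset];
    }
}
```

The point of the model is that the indices can stay 32 bits wide in the
vector; the widening happens per lane as part of the address computation.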

Do you see a way to overcome this problem? Since InstCombiner already has this optimization, can we talk about dropping it from clang?

Thanks,
Nadav

Hi Nadav,

I think this is an LLVM optimization: GEP indices are extended to an index type
with the same size as a pointer.

Ciao, Duncan.

Duncan,

Eli said that Clang also has this optimization. Also, try compiling this code using clang without optimizations:

int demo(int *A, short i) { return A[i]; }

The generated GEP will have i64 indices.

Thanks,
Nadav