Missed opportunity in the midend, unsigned comparison

Hi everybody, I see a missed optimization opportunity in LLVM that GCC catches, and I’d love to hear the community’s input.
Here’s the original C code:

1 char arr[2];
2 char *get(unsigned ind) {
3   if (ind >= 1) {
4     return 0;
5   }
6   return &(arr[ind]);
7 }

The variable ind is unsigned, so, based on the comparison, if it is not greater than or equal to one, then it must be equal to zero. GCC understands that ind equals zero at line 6 and generates something like the following (in pseudocode, the x86 assembly is in the footnotes):

ret = 0
if ind == 0:
    ret = arr
return ret
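
For reference, here is a minimal C sketch of the same folding done by hand. The name get_folded is hypothetical and only for illustration; it is not what either compiler emits literally, just the source-level equivalent of the pseudocode above:

```c
#include <stddef.h>

char arr[2];

/* Original function from the post. */
char *get(unsigned ind) {
  if (ind >= 1) {
    return 0;
  }
  return &(arr[ind]);
}

/* Hypothetical hand-folded equivalent: ind is unsigned, so
   !(ind >= 1) implies ind == 0, and &arr[ind] is simply arr. */
char *get_folded(unsigned ind) {
  return ind == 0 ? arr : NULL;
}
```

Both functions return arr for ind == 0 and a null pointer for every other value, so the variable index is never needed.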

On the other hand, the development version of LLVM produces the following IR:

; Function Attrs: nounwind
define i8* @get(i32 %ind) local_unnamed_addr #0 {
  %cmp = icmp eq i32 %ind, 0
  %arrayidx = getelementptr inbounds [2 x i8], [2 x i8]* @arr, i32 0, i32 %ind
  %retval.0 = select i1 %cmp, i8* %arrayidx, i8* null
  ret i8* %retval.0
}

The value %arrayidx is always computed from the variable index %ind, even though we could simply return arr, because the select only uses it when ind equals zero.

Is this kind of optimization already implemented somewhere in LLVM? If not, where would be the best place to implement it? Thank you very much in advance.



For whatever it may be worth, __builtin_assume() helps generate the right code, so it seems like this optimization is within reach:

char arr[2];
char *get(unsigned ind) {
  if (ind >= 1) {
    return 0;
  }
  __builtin_assume(ind < 1);
  return &(arr[ind]);
}
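
Note that __builtin_assume() is a Clang builtin. If you want the same snippet to build with both compilers for a side-by-side comparison, something like the sketch below should work. The ASSUME macro and the name get_with_assume are hypothetical, and I have not checked what effect the __builtin_unreachable() fallback has on the generated code:

```c
#include <stddef.h>

/* Hypothetical helper: Clang's __builtin_assume where available,
   otherwise the usual __builtin_unreachable() idiom (GCC). */
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#else
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#endif

char arr[2];

char *get_with_assume(unsigned ind) {
  if (ind >= 1) {
    return 0;
  }
  /* Always true here: the early return handled every ind >= 1. */
  ASSUME(ind < 1);
  return &(arr[ind]);
}
```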