Missed optimization opportunity

I recently downloaded LLVM 2.8 and started playing with the optimizations a bit.
I saw something curious while trying the following function:

int g(unsigned int a) {
  unsigned int c[100];
  c[10] = a;
  c[11] = a;
  unsigned int b = c[10] + c[11];

  if(b > a*2) a = 4;
  else a = 8;
  return a + 7;
}

The generated code, with -O3 activated, is

define i32 @g(i32 %a) nounwind readnone {
  %add = shl i32 %a, 1
  %mul = shl i32 %a, 1
  %cmp = icmp ugt i32 %add, %mul
  %a.addr.0 = select i1 %cmp, i32 11, i32 15
  ret i32 %a.addr.0
}

I find it strange that it hasn't noticed that %add and %mul have the same value: %cmp would then be false, so the select would fold away and the function would simply return 15. If 'a' is replaced by a constant, it does fold.

I'm also curious which pass detects that c[10] and c[11] are both 'a' in 'b = c[10] + c[11]' (it isn't instcombine; at -O1 the getelementptr/load instructions are still there).

I find it strange that it hasn't noticed that %add and %mul have the same value: %cmp would then be false, so the select would fold away and the function would simply return 15. If 'a' is replaced by a constant, it does fold.

You're right, that is a missed optimization. I added it to the missed optimization notes in r122603. Did this come from a larger example, or was this just a test?

I'm also curious which pass detects that c[10] and c[11] are both 'a' in 'b = c[10] + c[11]' (it isn't instcombine; at -O1 the getelementptr/load instructions are still there).

There are several capable of picking this up, but GVN+MemDep is probably what you want.
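As a rough source-level picture of what that forwarding amounts to (a sketch only: GVN works on LLVM IR, not on C, and the function names below are made up for illustration), the two loads are replaced by the values that were just stored to the same slots:

```c
#include <assert.h>

/* What the source computes: two stores followed by two loads. */
unsigned int sum_via_array(unsigned int a) {
  unsigned int c[100];
  c[10] = a;
  c[11] = a;
  return c[10] + c[11];
}

/* Hand-written equivalent once each stored value is forwarded to its load.
   This is an illustration, not actual compiler output. */
unsigned int sum_forwarded(unsigned int a) {
  return a + a;
}

/* Hypothetical self-check: both versions agree on a few inputs. */
void check_forwarding(void) {
  assert(sum_via_array(0u) == sum_forwarded(0u));
  assert(sum_via_array(21u) == 42u);
  assert(sum_forwarded(21u) == 42u);
}
```

Once the loads are gone, nothing reads the array any more, so the stores and the array itself are dead and can be dropped.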

-Chris

Just as a note: if you also run this example through the opt tool, the output is the expected "ret i32 15".

  > clang -O3 -S -emit-llvm -o - example.c |opt -std-compile-opts -o - |llvm-dis
  
  target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:128:128-n8:16:32"
  target triple = "i386-apple-darwin10.0.0"
  
  define i32 @g(i32 %a) nounwind readnone ssp {
  entry:
    ret i32 15
  }

Right, GVN catches this one. The problem is that, on the original testcase, this optimization opportunity only becomes available after gvn has run.

before gvn:
  %tmp5 = load i32* %arrayidx4, align 8
  %add = add i32 %tmp5, %a
  %mul = shl i32 %a, 1
  %cmp = icmp ugt i32 %add, %mul

gvn figures out that the load is equal to %a:
  %add = add i32 %a, %a
  %mul = shl i32 %a, 1
  %cmp = icmp ugt i32 %add, %mul

instcombine then canonicalizes the add into a shl, too late for GVN to CSE it.
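To make the expected result concrete, here is a small C check (my own sketch, not part of the original testcase; check_g_folds_to_15 is a hypothetical helper). For unsigned int, a + a, a * 2 and a << 1 all wrap modulo 2^N and so are the same value, which means 'b > a*2' can never be true and g should return 15 for every input:

```c
#include <assert.h>
#include <limits.h>

/* The function from the original post. */
int g(unsigned int a) {
  unsigned int c[100];
  c[10] = a;
  c[11] = a;
  unsigned int b = c[10] + c[11];

  if (b > a * 2) a = 4;
  else a = 8;
  return a + 7;
}

/* Hypothetical helper: spot-check a few inputs, including ones where
   the doubling wraps around. */
void check_g_folds_to_15(void) {
  const unsigned int samples[] = {
    0u, 1u, 2u, UINT_MAX / 2u, UINT_MAX / 2u + 1u, UINT_MAX
  };
  unsigned int i;
  for (i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
    unsigned int a = samples[i];
    /* The three spellings of "twice a" agree, even on overflow. */
    assert(a + a == a * 2u);
    assert(a * 2u == (a << 1));
    /* So the comparison is always false and the result is 8 + 7. */
    assert(g(a) == 15);
  }
}
```

This only samples a handful of values, so it is a sanity check of the reasoning rather than a proof.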