Startling optimization

$ cat test.c
int foo( int n )
{
  int a = 0, b = 0, c = 0, j;

  for ( j = 0; j < n; j++ )
  {
    a++;
    b = a | c;
    c = a;
  }
  return b;
}

$ clang -O -S test.c

_foo:
  pushq %rbp
  movq %rsp, %rbp
  testl %edi, %edi
  jle LBB0_2
  leal -1(%rdi), %eax
  orl %edi, %eax
  popq %rbp
  ret
LBB0_2:
  xorl %eax, %eax
  popq %rbp
  ret

Which parts of clang/llvm source should I read to learn how the loop is magically optimized away?

Robert P.

you may try the following passes that will optimize the loop away.

clang -emit-llvm -S -o - test.c | opt -mem2reg -loop-rotate -instcombine -indvars -loop-deletion -S

- xi

Look at LLVM's ScalarEvolution analysis. It's invoked by the loop optimizer passes and performs the "magic" computations that allow static execution of loops.

Hi Robert,

Benjamin Kramer and Xi Wang are both right. The magic trick requires a few steps:

1. -mem2reg and -loop-rotate put the code in a canonical form for optimization
2. -instcombine moves 'b = a | c' below the loop, since 'b' has no use in the loop
3. -indvars recognizes 'a' and 'c' as recurrences within the loop (ScalarEvolution), identifies the loop exit condition, and substitutes each use of a recurrence outside the loop with its value at loop exit.
4. -loop-deletion recognizes that the loop has no side effects and no variables defined in the loop are used outside the loop.

-Andy

Andrew Trick wrote:

$ cat test.c
int foo( int n )
{
int a = 0, b = 0, c = 0, j;

for ( j = 0; j < n; j++ )
{
  a++;
  b = a | c;
  c = a;
}
return b;
}

$ clang -O -S test.c

_foo:
  pushq %rbp
  movq %rsp, %rbp
  testl %edi, %edi
  jle LBB0_2
  leal -1(%rdi), %eax
  orl %edi, %eax
  popq %rbp
  ret
LBB0_2:
  xorl %eax, %eax
  popq %rbp
  ret

Which parts of clang/llvm source should I read to learn how the loop is magically optimized away?

Benjamin Kramer and Xi Wang are both right. The magic trick requires a few steps:

1. -mem2reg and -loop-rotate put the code in a canonical form for optimization
2. -instcombine moves 'b = a | c' below the loop, since 'b' has no use in the loop
3. -indvars recognizes 'a' and 'c' as recurrences within the loop (ScalarEvolution), identifies the loop exit condition, and substitutes each use of a recurrence outside the loop with its value at loop exit.
4. -loop-deletion recognizes that the loop has no side effects and no variables defined in the loop are used outside the loop.

Thank you. I want to make a list of "clang beats gcc" examples, useful for persuading reluctant adopters to try clang. This will be the first entry:-)

Robert P.

Hi,

i wonder if there is a way to retrieve the return type of a template
function. The function clang_getResultType works for cursors, whose
'kind' property is CXCursor_FunctionDecl. For a
CXCursor_FunctionTemplate it says, 'CXType_Invalid'. Can someone tell me
the reason for this behavior?

Thanks,
Tobias