Hello,

I noticed that feeding 'clang -O3' with functions like:

int sum1(int x) {

int ret = 0;

for(int i = 0; i < x; i++)

ret += i;

return ret;

}

int sum2(int x) {

int ret = 0;

for(int i = 0; i < x; i++)

ret += i*i;

return ret;

}

...

int sum20(int x) {

int ret = 0;

for(int i = 0; i < x; i++)

ret += i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i;

return ret;

}

etc.

Makes LLVM unroll all those loops into constant time expressions!

A sum \sum_{i=0}^{n} i^k can be derived into a formula which is a polynomial of degree k+1, but it is not a trivial derivation (I wouldn't do it by hand for k > 2) and I thought that someone hardcoded such a case into llvm loop optimizer, and I've been looking through the loop analysis and loop passes code, but couldn't find it anywhere...

And what llvm generates doesn't look like hard-coded formulas for sums of powers, because it doesn't do it as optimally as it could, e.g. sum20(x) could be expressed as a 21-degree polynomial with constant coefficients, so it could be compiled to about 21 muls, 21 adds and 1 div. Yet, llvm generates about 300 instructions -- but no looping, constant time regardless of x!

So I suppose the unrolling of this loop is made by a combination of simpler optimizations... which is just... "wow".

I've been looking through the analyses and passes code and have no clue what may be able to make such great optimizations -- I am really impressed by the derivation power of this.

Any clue what can trigger such optimizations (I am totally new to llvm code...)?

Cheers,

Juliusz Sompolski