# Unrolling power sum calculations into constant time expressions

Hello,

I noticed that feeding 'clang -O3' with functions like:
int sum1(int x) {
int ret = 0;
for(int i = 0; i < x; i++)
ret += i;
return ret;
}
int sum2(int x) {
int ret = 0;
for(int i = 0; i < x; i++)
ret += i*i;
return ret;
}
...
int sum20(int x) {
int ret = 0;
for(int i = 0; i < x; i++)
ret += i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i;
return ret;
}
etc.
Makes LLVM unroll all those loops into constant time expressions!

A sum \sum_{i=0}^{n} i^k can be derived into a formula which is a polynomial of degree k+1, but it is not a trivial derivation (I wouldn't do it by hand for k > 2) and I thought that someone hardcoded such a case into llvm loop optimizer, and I've been looking through the loop analysis and loop passes code, but couldn't find it anywhere...
And what llvm generates doesn't look like hard-coded formulas for sums of powers, because it doesn't do it as optimally as it could, e.g. sum20(x) could be expressed as a 21-degree polynomial with constant coefficients, so it could be compiled to about 21 muls, 21 adds and 1 div. Yet, llvm generates about 300 instructions -- but no looping, constant time regardless of x!

So I suppose the unrolling of this loop is made by a combination of simpler optimizations... which is just... "wow".
I've been looking through the analyses and passes code and have no clue what may be able to make such great optimizations -- I am really impressed by the derivation power of this.
Any clue what can trigger such optimizations (I am totally new to llvm code...)?

Cheers,
Juliusz Sompolski

My guess is something based on Scalar Evolution.

Juliusz Sompolski wrote:

Hello,

I noticed that feeding 'clang -O3' with functions like:
int sum1(int x) {
int ret = 0;
for(int i = 0; i< x; i++)
ret += i;
return ret;
}
int sum2(int x) {
int ret = 0;
for(int i = 0; i< x; i++)
ret += i*i;
return ret;
}
...
int sum20(int x) {
int ret = 0;
for(int i = 0; i< x; i++)
ret += i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i*i;
return ret;
}
etc.
Makes LLVM unroll all those loops into constant time expressions!

A sum \sum_{i=0}^{n} i^k can be derived into a formula which is a
polynomial of degree k+1, but it is not a trivial derivation (I wouldn't
do it by hand for k> 2)

See lib/Analysis/ScalarEvolution.cpp's BinomialCoefficient(). Notably the comment near the top of the function. LLVM IR permits arbitrary bitwidth integer arithmetic which the backends lower to math that is actually supported on the target.

Nick

and I thought that someone hardcoded such a