I was hoping to get some clarification on the aim of the GVN pass. I have been reading a bit about global value numbering, and from what I understand, there are polynomial time algorithms for removing duplicate computations from loops.
I have an example program which computes the sum of an array twice, in two separate accumulators.
Here, sum0 and sum1 are both sums of the array A.
void b01_sums(int size, int* A)
int sum0 = 0;
int sum1 = 0;
for (int i = 0; i != size; ++i)
sum0 = sum0 + A[i];
sum1 = sum1 + A[i];
printf(“sum0: %d\nsum1: %d\n”, sum0, sum1);
I would have expected global value numbering to see that sum0 and sum1 are initialised and updated with the same value, and so merge them into a single accumulator. However, the assembly output still contains both accumulators:
$ gcc sums.c -c -O3 -save-temps
. . .
movl (%r15), %eax
addl %eax, %ebx
addl %eax, %r13d
I have tried a few variations on the C code: pulling out the array reads into a single binding makes no difference, while moving the printf outside of the loop allows the loop to be vectorised, but there are still two accumulators.
I am wondering whether I am missing some particular invocation (perhaps to enable fixpoints over loops in the GVN analysis?), or otherwise, if there is some reason that means this sort of analysis is impractical for real programs.
I am using the Apple clang 6, but also tried with llvm opt 3.6:
$ gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
Thread model: posix
$ /usr/local/opt/llvm/bin/opt --version
LLVM version 3.6.2
Built Apr 10 2016 (02:50:09).
Default target: x86_64-apple-darwin13.4.0
Host CPU: core-avx2
: Gulwani & Necula: A Polynomial-Time Algorithm for Global Value Numbering http://www.eecs.berkeley.edu/~necula/Papers/gvndet-journal06.pdf
: Example code and assembly output https://gist.github.com/amosr/06938ee5becaf5249de5634499bc1add