Confusion about excessive reads in the PowerPC assembly generated by LLVM

I was trying to do liveness analysis on some code and I found that almost no variables were ever live on exit from any basic block.

Looking into the matter, I saw that values almost always get stored at the end of a block and loaded again at the beginning of the next block. This occurs both in the intermediate representation and in the final compiled code (I am using the PowerPC backend).

Are these unnecessary loads a “feature” of LLVM? Or should I be compiling with some flags to make these spurious loads go away?


As an example of what I am talking about, I contrived a test case with 8 live variables between basic blocks:
#include <stdio.h>
main()
{ int s,t,u,v,w,x,y,z;
  for (z=0; z<200; z++)
    s += ((((t*z+u)*z+v)*z+w)*z+x)*z+y; // loopbody1
  for (z=10; z<20; z++)
    s += ((((t*z-u)*z-v)*z-w)*z-x)*z-y; // loopbody2
}

Now, the assembly that is produced has 21 loads, which suggests that no effort is made to reuse values already held in registers.
In particular, the two basic blocks that correspond to loopbody1 and to loopbody2 are identical (except that the adds become subfs).
Being identical, both of these generated basic blocks have 8 loads in their first 11 instructions. These 8 loads will fetch the eight variables.
But, of course, by the time loopbody2 is reached, some of these variables are already in registers and so they did not need to be reloaded.
Moreover, there is no effort to perform loop invariant code motion to push the loads up.

All of this makes me think that I am just plain not using the compiler right. Here is how I compiled it:
…/llvm-gcc -emit-llvm -c test.c -o test.bc
./llc -march=ppc32 -mattr=altivec -O3 test.bc -o test.s

Thank you for your time.

You are not running the mid-level optimizers. Either run the bitcode through opt or pass -O3 to llvm-gcc.

/jakob