Xinliang David Li wrote:
As simple as
void foo (int n, double *p, int *q)
{
  for (int i = 0; i < n; i++)
    *p += *q;
}
clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c
llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc
There are a couple of things interacting here:
* clang -fstrict-aliasing -O2 does generate the TBAA info, but it
runs the optimizers without enabling the -enable-tbaa flag, so the
optimizers never look at it. Oops.
* clang -fstrict-aliasing -O0 does *not* generate the TBAA info in
the resulting .bc file. This is probably intended to speed up -O0
builds even if -fstrict-aliasing is set, but is annoying for
debugging what's going on under the hood.
* If clang -O2 worked by running 'opt' and 'llc' under the hood,
we could tell it to pass a flag along to them, but it doesn't. As it
stands, you can't turn -enable-tbaa on when running clang.
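To see why the type information matters for this loop at all, here's a hand-written sketch. It's mine, not compiler output, and the names add_loop_int and add_loop_int_hoisted are made up for illustration. When both pointers are int *, they may legally point at the same object, and loading *q once up front changes the answer. With double * and int *, strict aliasing says they can't overlap, and that's the fact the TBAA info hands to the optimizers.

```c
/* Same shape as foo, but both pointers are int *, so they may alias. */
void add_loop_int(int n, int *p, int *q)
{
    for (int i = 0; i < n; i++)
        *p += *q;       /* *q must be reloaded: the store to *p may change it */
}

/* What hoisting the load of *q would look like.
   Only equivalent to the above if p and q never overlap. */
void add_loop_int_hoisted(int n, int *p, int *q)
{
    int addend = *q;    /* loaded once, before the loop */
    for (int i = 0; i < n; i++)
        *p += addend;
}
```

Call both with p == q, *p == 1 and n == 3: the first leaves 8 behind, the second leaves 4. Without alias information the optimizer has to assume that case can happen.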
So, putting that together, one way to do it is:
clang -O2 -fstrict-aliasing foo.c -flto -c -o foo.bc
opt -O2 -enable-tbaa foo.bc -o foo2.bc
llc -O2 -enable-tbaa foo2.bc -o foo2.s
at which point the opt run will hoist the loads into a loop
preheader. Sadly this runs the LLVM optimizers twice (once in clang
-O2 and once in opt) which could skew results.
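If it helps to picture the result, this is roughly what the hoisting amounts to, written back out as C by hand. Treat it as a sketch: the name foo_hoisted and the local variables are mine, and the code opt really emits will differ in detail.

```c
/* Hand-written approximation of foo after the loads are hoisted.
   Assumes p and q don't overlap, which strict aliasing guarantees
   for double * vs. int *. Not actual LLVM output. */
void foo_hoisted(int n, double *p, int *q)
{
    if (n <= 0)
        return;             /* loop never runs, so nothing is loaded */

    /* "Preheader": runs once, just before the loop is entered. */
    double addend = *q;     /* load of *q and int-to-double conversion */
    double sum = *p;        /* load of *p */

    for (int i = 0; i < n; i++) {
        sum += addend;      /* no loads left in the loop body */
        *p = sum;           /* the store stays where it was */
    }
}
```

The additions happen in the same order as in the original loop, so for non-overlapping p and q the result in *p is the same.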
Yes, I verified these steps work, but my head is spinning:
1) does -flto have the same effect as -emit-llvm? That is, the FE emits llvm
bitcode and exits without invoking the llvm backend?
Yes, -flto and -emit-llvm are synonyms.
2) why do you need to invoke both opt and llc -- I verified invoking
just llc is also fine.
"llc" is really just codegen; the only optimizations it does are ones that are naturally part of lowering from llvm IR to assembly. For example, that includes another run of loop invariant code motion because some loads may have been added -- such as a load of the GOT pointer -- which weren't there in the IR to be hoisted.
"opt" runs any IR pass. You can ask it to run a single optimization, for example "opt -licm", or you can run an analysis pass like scalar evolution with "opt -analyze -scalar-evolution". This is where the bulk of LLVM's optimizations live.
3) more general question -- is opt just a bare-bones llc without invoking
any llvm passes? So why is there a need for two drivers?
I think of it as opt transforms .bc -> .bc and llc transforms .bc -> .s.
Nick