How to use LLVM optimizations with clang

Hello everyone

I am trying to use some LLVM optimizations like -die or -adce. Is it
possible to use them along with clang?

Or is there a way to pass these optimizations on to the "opt"
tool through clang, if opt is being used by clang behind the scenes?

Thanks a lot

Regards

Shahzad

No, opt only works on LLVM IR/bitcode. You can generate it like this:
clang -c foo.c -emit-llvm -o foo.bc
or
clang -S foo.c -emit-llvm -o foo.ll

Then you can run the optimization(s):
opt -adce foo.bc -o foo-adce.bc

Then you can compile the result using clang:
clang -c foo-adce.bc -o foo-adce.o

Chad
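
The three steps above can be wrapped in a small shell function. This is only a sketch: the function name `optimize_one` and the `CLANG`/`OPT` variables are made up for illustration, and it assumes clang and opt are on the PATH. The pass flag is passed through unchanged, so on newer LLVM releases that use the new pass manager you would likely pass something like `-passes=adce` instead of `-adce`.

```shell
# Tool names can be overridden from the environment (illustrative variables).
CLANG=${CLANG:-clang}
OPT=${OPT:-opt}

# optimize_one SRC PASS
# Emit bitcode for one C file, run a single opt pass over it,
# then compile the optimized bitcode to an object file.
optimize_one() {
    src=$1              # e.g. foo.c
    pass=$2             # e.g. -adce
    base=${src%.c}      # strip the .c suffix
    "$CLANG" -c "$src" -emit-llvm -o "$base.bc" &&
    "$OPT" "$pass" "$base.bc" -o "$base-opt.bc" &&
    "$CLANG" -c "$base-opt.bc" -o "$base-opt.o"
}
```

For example, `optimize_one foo.c -adce` would leave foo.bc, foo-opt.bc and foo-opt.o in the current directory.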

Thanks a lot, Chad, for the quick response. Does this mean that we cannot
use LLVM optimizations other than O1, O2, O3, O4 and unroll-loops with
clang?

One more thing I would like to know: if I want to process multiple
modules with opt at the same time, like

opt -adce *.bc

how can I do that with opt in one go, if I process all the
bitcode files within a Makefile?

Thanks.

Shahzad

Thanks a lot, Chad, for the quick response. Does this mean that we cannot
use LLVM optimizations other than O1, O2, O3, O4 and unroll-loops with
clang?

Try using the -debug-pass=Arguments option to see what passes are being run at each optimization level.

E.g.,
clang -O[0-3] -mllvm -debug-pass=Arguments foo.c

One more thing I would like to know: if I want to process multiple
modules with opt at the same time, like

opt -adce *.bc

I don't think this will work.

how can I do that with opt in one go, if I process all the
bitcode files within a Makefile?

You should be able to define a rule in the Makefile to compile your bitcode/IR files.

Chad
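
Since opt takes one input module per invocation, the per-file processing can be sketched as a shell loop (a Makefile rule would run the same opt command once per file). The function name `optimize_each` and the `OPT` variable are hypothetical, and the `-opt.bc` output suffix is just a naming choice for this example.

```shell
# opt binary to use; can be overridden from the environment (illustrative).
OPT=${OPT:-opt}

# optimize_each PASS FILE.bc...
# Run the same pass over every bitcode file, one opt invocation per file,
# writing NAME-opt.bc next to each NAME.bc. Stops at the first failure.
optimize_each() {
    pass=$1
    shift
    for bc in "$@"; do
        "$OPT" "$pass" "$bc" -o "${bc%.bc}-opt.bc" || return 1
    done
}
```

It could be called as `optimize_each -adce *.bc`. Note that on a second run the glob would also match the previously generated `-opt.bc` files, so a separate output directory may be the safer choice.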

Hello Duncan

Is it possible to use LLVM optimizations besides O1, O2, O3
along with the dragonegg plugin?

Regards

Shahzad

Hi Shahzad,

Is it possible to use LLVM optimizations besides O1, O2, O3
along with the dragonegg plugin?

sure, try this:

   gcc -fplugin=path/dragonegg.so ...other_options_here... -S -o - -fplugin-arg-dragonegg-emit-ir -fplugin-arg-dragonegg-llvm-ir-optimize=0 | opt -pass1 -pass2 ...

Here -fplugin-arg-dragonegg-emit-ir tells it to output LLVM IR rather than
target assembler. You can also use -flto here.

-fplugin-arg-dragonegg-llvm-ir-optimize=0 disables the standard set of LLVM
optimizations.

In general, if a front-end can produce LLVM IR then you can do this, by
outputting the IR and passing it to "opt".

Ciao, Duncan.

Hello Duncan

I tried your method and it works fine. What would be the next step to
produce the final executable? I have tried the following but it is
producing an error

$ gcc -fplugin=/path/to/dragonegg.so -S *.c -fplugin-arg-dragonegg-emit-ir | opt -adce

$ clang *.s

Regards

Shahzad

Hi Shahzad,

I tried your method and it works fine. What would be the next step to
produce the final executable? I have tried the following but it is
producing an error

$ gcc -fplugin=/path/to/dragonegg.so -S *.c -fplugin-arg-dragonegg-emit-ir | opt -adce

this won't work because you aren't passing the IR to opt (you need -o - for
that if using a pipe) and you aren't doing anything with opt output. What's
more, you are trying to compile multiple files at once. Probably something
like this would work:

   for F in *.c ; do B=`basename $F .c` ; gcc -fplugin=/path/to/dragonegg.so -S -o - $F -fplugin-arg-dragonegg-emit-ir | opt -adce -o $B.ll ; done
   clang *.ll

Ciao, Duncan.
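
The loop above can also be written as a function with quoted variables, which keeps working when file names contain spaces. This is a sketch only: `dragonegg_opt`, `GCC`, `OPT` and `PLUGIN` are illustrative names, the plugin path is the placeholder used in the thread, and `-S` is added to opt here so that the `.ll` files hold textual IR (the original loop writes bitcode into them).

```shell
# Tools and plugin path; all overridable from the environment (illustrative).
GCC=${GCC:-gcc}
OPT=${OPT:-opt}
PLUGIN=${PLUGIN:-/path/to/dragonegg.so}

# dragonegg_opt PASS FILE.c...
# For each C file, let gcc + dragonegg emit LLVM IR on stdout,
# pipe it through opt with the given pass, and write NAME.ll
# into the current directory.
dragonegg_opt() {
    pass=$1
    shift
    for src in "$@"; do
        base=$(basename "$src" .c)
        # Only the exit status of opt (the last pipeline stage) is checked.
        "$GCC" -fplugin="$PLUGIN" -S -o - "$src" -fplugin-arg-dragonegg-emit-ir |
            "$OPT" "$pass" -S -o "$base.ll" || return 1
    done
}
```

A call such as `dragonegg_opt -adce *.c` followed by `clang *.ll` would then mirror the two commands above.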

Hello Duncan

I tried it with -o - but it's producing an error

gcc: fatal error: cannot specify -o with -c, -S or -E with multiple files

What do you suggest?

Regards

Abdul

Hi,

I tried it with -o - but it's producing an error

gcc: fatal error: cannot specify -o with -c, -S or -E with multiple files

What do you suggest?

what I wrote:

  for F in *.c ; do B=`basename $F .c` ; gcc -fplugin=/path/to/dragonegg.so -S -o - $F -fplugin-arg-dragonegg-emit-ir | opt -adce -o $B.ll ; done
  clang *.ll

Thanks to the for loop and passing $F to gcc, you are no longer using gcc with
multiple files. So if you are getting that message then you aren't doing what I
suggested.

Ciao, Duncan.

Hello Duncan

Sorry for the mistake. Actually that error occurred when I was
compiling all the files at once, NOT in for loop.

The for loop is working perfectly as it is dealing with individual
files. I now have one new issue. Let me describe it briefly.

If I compile the program using the following command line i.e.

$ clang -O3 -lm *.c

then

$ time ./a.out

real 0m2.606s
user 0m2.584s
sys 0m0.012s

BUT, if I use all the optimizations enabled with -O3 but specify them
explicitly i.e.

for F in *.c ; do B=`basename $F .c` ; gcc \
-fplugin=/damm/compilers/dragonegg-3.1.src/dragonegg.so -S -o - $F \
-fplugin-arg-dragonegg-emit-ir | opt -targetlibinfo -no-aa -tbaa \
-basicaa -globalopt -ipsccp -deadargelim -instcombine -simplifycfg \
-basiccg -prune-eh -inline -functionattrs -argpromotion \
-scalarrepl-ssa -domtree -early-cse -simplify-libcalls \
-lazy-value-info -jump-threading -correlated-propagation -simplifycfg \
-instcombine -tailcallelim -simplifycfg -reassociate -domtree -loops \
-loop-simplify -lcssa -loop-rotate -licm -lcssa -loop-unswitch \
-instcombine -scalar-evolution -loop-simplify -lcssa -indvars \
-loop-idiom -loop-deletion -loop-unroll -memdep -gvn -memdep \
-memcpyopt -sccp -instcombine -lazy-value-info -jump-threading \
-correlated-propagation -domtree -memdep -dse -adce -simplifycfg \
-instcombine -strip-dead-prototypes -globaldce -constmerge -preverify \
-domtree -verify -o $B.ll ; done

$ clang *.ll

then

time ./a.out

real 0m7.791s
user 0m7.760s
sys 0m0.008s

Am I missing something here?

Though directly compiling files i.e.

$ clang *.c

results in

time ./a.out

real 0m10.167s
user 0m10.121s
sys 0m0.016s

Regards

Shahzad

Hi,

If I compile the program using the following command line i.e.

$ clang -O3 -lm *.c

this may be doing link time optimization.

then

$ time ./a.out

real 0m2.606s
user 0m2.584s
sys 0m0.012s

BUT, if I use all the optimizations enabled with -O3 but specify them
explicitly i.e.

you can just use "opt -O3" here.

for F in *.c ; do B=`basename $F .c` ; gcc \
-fplugin=/damm/compilers/dragonegg-3.1.src/dragonegg.so -S -o - $F \
-fplugin-arg-dragonegg-emit-ir | opt -targetlibinfo -no-aa -tbaa \
-basicaa -globalopt -ipsccp -deadargelim -instcombine -simplifycfg \
-basiccg -prune-eh -inline -functionattrs -argpromotion \
-scalarrepl-ssa -domtree -early-cse -simplify-libcalls \
-lazy-value-info -jump-threading -correlated-propagation -simplifycfg \
-instcombine -tailcallelim -simplifycfg -reassociate -domtree -loops \
-loop-simplify -lcssa -loop-rotate -licm -lcssa -loop-unswitch \
-instcombine -scalar-evolution -loop-simplify -lcssa -indvars \
-loop-idiom -loop-deletion -loop-unroll -memdep -gvn -memdep \
-memcpyopt -sccp -instcombine -lazy-value-info -jump-threading \
-correlated-propagation -domtree -memdep -dse -adce -simplifycfg \
-instcombine -strip-dead-prototypes -globaldce -constmerge -preverify \
-domtree -verify -o $B.ll ; done

$ clang *.ll

Try clang -O3 *.ll

If it makes a difference then that means that clang is linking all these files
together to form one mega file which it is then optimizing.

Ciao, Duncan.

Thanks Duncan

It was really helpful.

Regards

Abdul

Hello

I need some help here please.

If we compile source files directly into native code:
$ clang -O3 -lm *.c

then the runtime is as follows:

real 0m2.807s
user 0m2.784s
sys 0m0.012s

and if we emit LLVM bitcode and apply optimizations

$ clang -O3 -c -emit-llvm *.c
$ llvm-link *.o -o comb.ll
$ time lli ./comb.ll

then the runtime is

real 0m2.671s
user 0m2.640s
sys 0m0.020s

But if I convert this same file, comb.ll, into a native binary

$ clang comb.ll

and execute it, then the runtime increases a lot

$ time ./a.out

real 0m8.008s
user 0m7.964s

The binary generated directly by clang has a runtime of around 2
seconds, while the one generated from the LLVM IR takes around 8 seconds.
Yet the same LLVM bitcode, with the same optimizations, runs in around
2 seconds when executed with lli.

What steps exactly does clang take to produce the binary that
achieves the runtime of around 2 seconds? If it is a question of link
time optimization, then how can we achieve that?

Best Regards

Shahzad

Hi, is the comb.ll used here:

$ time lli ./comb.ll

then the runtime is

real 0m2.671s
user 0m2.640s
sys 0m0.020s

But if I convert this same file, comb.ll, into a native binary

the same as the comb.ll used here:

$ clang comb.ll

?

Ciao, Duncan.

Hi

Yes, they both are exactly the same.

Regards

Shahzad

Hi,

Yes, they both are exactly the same.

then I don't know what is going on. I suggest you send a copy of comb.ll to the
list so that we can see for ourselves.

Ciao, Duncan.

Sure. The comb.ll and data files are attached and can be invoked as
follows:

$ lli comb.ll data -c

Regards

Shahzad

comb.ll (90.4 KB)

data (264 KB)

Hi, the reason is that lli does optimized code generation by default, while
clang does unoptimized codegen by default. Use this
   clang -O2 comb.ll
instead.

Ciao, Duncan.
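
The split between IR-level optimization and optimized code generation can also be made explicit by running opt and llc as separate steps. The following is a rough sketch, not a statement of what clang does internally: `build_native` and the `OPT`/`LLC`/`CLANG` variables are illustrative names, the default of `-O3` for opt and `-O2` for llc is an assumption chosen to match the commands in this thread, and `-lm` is carried over from the earlier clang command line.

```shell
# Tool names can be overridden from the environment (illustrative variables).
OPT=${OPT:-opt}
LLC=${LLC:-llc}
CLANG=${CLANG:-clang}

# build_native FILE.ll [OPT_FLAGS...]
# 1. run IR-level passes with opt (defaults to -O3 if no flags are given),
# 2. generate optimized assembly with llc,
# 3. let clang assemble and link the result.
build_native() {
    ir=$1                          # e.g. comb.ll
    shift
    [ $# -gt 0 ] || set -- -O3     # no pass flags given: fall back to -O3
    base=${ir%.ll}
    "$OPT" "$@" "$ir" -o "$base.opt.bc" &&
    "$LLC" -O2 "$base.opt.bc" -o "$base.s" &&
    "$CLANG" "$base.s" -lm -o "$base"
}
```

With this, `build_native comb.ll` would use the standard -O3 pipeline, while `build_native comb.ll -adce` would run only the listed pass before optimized code generation.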

Hi

Yes. But how exactly can optimized code generation be done without
clang? Is it possible to specify those optimizations
(individual ones instead of standard ones like -O3) somehow when
generating code, as is done by clang or llc?

Regards

Shahzad