Question on flag/pass selection to improve code size

Hi,

I am looking to obtain smaller code sizes for ARM cores. So far, I am
trying a genetic algorithm to find the best combination of opt passes and
clang/llc flags.
I am looking at which flags and passes -Oz and the other optimization levels
enable and use. Having done something similar for gcc, I was looking for a
similar approach.
However, I was not able to find many optimization flags for clang or llc,
which made me think that in clang/LLVM the optimization changes are made
mostly by selecting or removing opt passes (flags).

In order to run only the opt passes I select, I split the compilation
process into:

clang CLANG_FLAGS -emit-llvm mysource1.c -c -o mysource1.bc
opt OPT_FLAGS mysource1.bc -o mysource1.ll
llc LLC_FLAGS mysource1.ll -filetype=obj -o mysource1.o

I have also seen that, for -Oz for example, "Pass Arguments" appears
multiple times. Does this mean that opt is run multiple times with different
pass options?

Now, my general direction questions are:

Am I on the right track with this?
Do you have any pointers or advice on this?

Thank you,
Robert

I am looking to obtain smaller code sizes for ARM cores. So far, I am
trying a genetic algorithm to find the best combination of opt passes and
clang/llc flags.

Nice.

I am looking at which flags and passes -Oz and the other optimization levels
enable and use. Having done something similar for gcc, I was looking for a
similar approach.
However, I was not able to find many optimization flags for clang or llc,
which made me think that in clang/LLVM the optimization changes are made
mostly by selecting or removing opt passes (flags).

I'm unfamiliar with how gcc is structured; clang does not actually
run `opt` or `llc`. These three tools are all independent clients
of the optimization and code-generation libraries. Clang builds
its pipelines with comparatively little control from the command
line; opt and llc, which are intended to be testing tools for LLVM
developers, provide more control.

As a rule of thumb, `opt` runs IR optimization passes (aka the
"middle end"), while `llc` primarily runs lower-level "machine"
oriented passes (aka the "back end"). Both opt and llc will run
both target-independent and target-specific passes, so the set of
passes you will be looking at will be at least somewhat influenced
by which target you select.

In order to run only the opt passes I select, I split the compilation
process into:

clang CLANG_FLAGS -emit-llvm mysource1.c -c -o mysource1.bc
opt OPT_FLAGS mysource1.bc -o mysource1.ll
llc LLC_FLAGS mysource1.ll -filetype=obj -o mysource1.o

That looks quite reasonable. In order to have Clang produce IR
that is optimizable, without running any optimizations itself, you
would want CLANG_FLAGS to include the following:
    -Xclang -disable-llvm-passes -Xclang -disable-O0-optnone
and then opt and llc can operate on the IR files as you would like.
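
For a search driver, the three steps with these flags could be generated
roughly as follows. This is only a sketch: `build_commands`,
`UNOPTIMIZED_IR_FLAGS` and the `.opt.bc` intermediate file name are
hypothetical names chosen for illustration, and the functions only build
argument lists; nothing is executed.

```python
from pathlib import Path

# Flags suggested above so that clang emits optimizable IR without running
# any LLVM passes itself.
UNOPTIMIZED_IR_FLAGS = [
    "-Xclang", "-disable-llvm-passes",
    "-Xclang", "-disable-O0-optnone",
]


def build_commands(source, clang_flags=(), opt_flags=(), llc_flags=()):
    """Return the clang, opt and llc argument lists for one source file."""
    stem = Path(source).with_suffix("")
    unopt_bc = f"{stem}.bc"      # IR straight from clang (hypothetical name)
    opt_bc = f"{stem}.opt.bc"    # IR after the selected opt passes
    obj = f"{stem}.o"
    return [
        ["clang", *clang_flags, *UNOPTIMIZED_IR_FLAGS,
         "-emit-llvm", "-c", source, "-o", unopt_bc],
        ["opt", *opt_flags, unopt_bc, "-o", opt_bc],
        ["llc", *llc_flags, opt_bc, "-filetype=obj", "-o", obj],
    ]


if __name__ == "__main__":
    import shlex
    for cmd in build_commands("mysource1.c"):
        print(shlex.join(cmd))
```

Each list can then be handed to `subprocess.run(cmd, check=True)`, and the
size of the resulting object file used as the fitness value.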

I have also seen that, for -Oz for example, "Pass Arguments" appears
multiple times. Does this mean that opt is run multiple times with different
pass options?

I'm not clear what you are asking about here. Note that clang does
not run opt; are you getting some kind of dump output from clang?

Now, my general direction questions are:

Am I on the right track with this?
Do you have any pointers or advice on this?

Well, what you're doing (with the tweaks mentioned above) is
probably how I would go about doing the same thing. Once you
have determined what a good -Oz pipeline looks like for your
examples and target, you could bring that back to the list as a
proposal for how Clang should build its -Oz pipeline. I'm sure
you will get plenty of feedback on the exact set of passes!

--paulr

I am looking at which flags and passes -Oz and the other optimization levels
enable and use. Having done something similar for gcc, I was looking for a
similar approach.
However, I was not able to find many optimization flags for clang or llc,
which made me think that in clang/LLVM the optimization changes are made
mostly by selecting or removing opt passes (flags).

Most passes expose some flags affecting their cost modeling and other heuristics. These are not intended to be exposed to clang users, but you can play with them in opt, and a genetic algorithm would likely try to tune these as well. For example, here is one such option: https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/LoopRotation.cpp#L32
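
One way a genetic algorithm could cover both the pass selection and such heuristic flags is to put them in the same genome. The sketch below is only illustrative: the pass names use the legacy `opt` spelling, the two tunables (`-rotation-max-header-size`, `-inline-threshold`) and their candidate values are examples I picked, and all of them should be checked against `opt --help-hidden` for the LLVM version in use. The function names are hypothetical.

```python
import random

# Illustrative search space; verify every name against your LLVM version.
PASSES = ["-instcombine", "-simplifycfg", "-loop-rotate", "-gvn", "-inline"]
TUNABLES = {
    "-rotation-max-header-size": [0, 2, 4, 8, 16],
    "-inline-threshold": [0, 25, 75, 225],
}


def random_genome(rng, length=6):
    """A genome is an ordered pass list plus one value per tunable flag."""
    return {
        "passes": [rng.choice(PASSES) for _ in range(length)],
        "tunables": {name: rng.choice(vals) for name, vals in TUNABLES.items()},
    }


def mutate(genome, rng):
    """Return a copy with either one pass or one tunable value replaced."""
    child = {
        "passes": list(genome["passes"]),
        "tunables": dict(genome["tunables"]),
    }
    if rng.random() < 0.5:
        position = rng.randrange(len(child["passes"]))
        child["passes"][position] = rng.choice(PASSES)
    else:
        name = rng.choice(sorted(TUNABLES))
        child["tunables"][name] = rng.choice(TUNABLES[name])
    return child


def to_opt_flags(genome):
    """Render a genome as the OPT_FLAGS portion of an opt command line."""
    flags = list(genome["passes"])
    flags += [f"{name}={value}"
              for name, value in sorted(genome["tunables"].items())]
    return flags
```

The fitness of a genome would then be the size of the object file produced when `to_opt_flags(genome)` is passed to opt.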

In order to run only the opt passes I select, I split the compilation
process into:

clang CLANG_FLAGS -emit-llvm mysource1.c -c -o mysource1.bc
opt OPT_FLAGS mysource1.bc -o mysource1.ll
llc LLC_FLAGS mysource1.ll -filetype=obj -o mysource1.o

That looks quite reasonable. In order to have Clang produce IR
that is optimizable, without running any optimizations itself, you
would want CLANG_FLAGS to include the following:
    -Xclang -disable-llvm-passes -Xclang -disable-O0-optnone
and then opt and llc can operate on the IR files as you would like.

Actually, I would likely replace -Xclang -disable-O0-optnone with -Oz: clang will then insert an attribute on functions that steers some pass heuristics toward optimizing for size.
(It is possible that -Oz will also change some heuristics in the backend when llc is invoked.)
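
To confirm that the attribute actually made it into the IR, the textual IR (for example from `clang -S -emit-llvm`) can be scanned for it. As far as I know the attribute in question is `minsize`, emitted together with `optsize`; the helper below is a hypothetical sketch that assumes the usual `attributes #N = { ... }` group syntax, and `SAMPLE_IR` is hand-written for illustration, not real compiler output.

```python
import re

# Matches attribute-group definitions such as: attributes #0 = { minsize ... }
ATTR_GROUP = re.compile(r"^attributes #(\d+) = \{(.*)\}\s*$", re.MULTILINE)


def groups_with_minsize(ir_text):
    """Return the attribute-group numbers whose attribute list has minsize."""
    found = []
    for number, body in ATTR_GROUP.findall(ir_text):
        if "minsize" in body.split():
            found.append(int(number))
    return found


# Hand-written sample, trimmed to the parts the helper looks at.
SAMPLE_IR = """\
define i32 @f(i32 %x) #0 {
  ret i32 %x
}
attributes #0 = { minsize nounwind optsize }
attributes #1 = { nounwind }
"""

if __name__ == "__main__":
    print(groups_with_minsize(SAMPLE_IR))
```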

I have also seen that, for -Oz for example, "Pass Arguments" appears
multiple times. Does this mean that opt is run multiple times with different
pass options?

I’m not clear what you are asking about here. Note that clang does
not run opt; are you getting some kind of dump output from clang?

This may refer to the fact that Clang creates two pass pipelines with the PassManagerBuilder, so the debug output prints two invocations.
opt does the same when used with the -Ox levels: https://github.com/llvm/llvm-project/blob/master/llvm/tools/opt/opt.cpp#L437-L438
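
If the goal is to recover the pass lists from that output (assuming it comes from `-debug-pass=Arguments`), each "Pass Arguments:" line can be treated as one pipeline. A minimal sketch follows; `split_pipelines` is a hypothetical helper, and `SAMPLE` is a shortened, hand-written stand-in for the real output, which is much longer.

```python
def split_pipelines(debug_output):
    """Return one list of pass arguments per 'Pass Arguments:' line."""
    pipelines = []
    for line in debug_output.splitlines():
        prefix, marker, rest = line.partition("Pass Arguments:")
        # Only accept lines where the marker is the first thing on the line.
        if marker and not prefix.strip():
            pipelines.append(rest.split())
    return pipelines


# Abbreviated, hand-written sample of the two printed pipelines.
SAMPLE = """\
Pass Arguments:  -tti -tbaa -scoped-noalias -simplifycfg -sroa -early-cse
Pass Arguments:  -tti -tbaa -scoped-noalias -ipsccp -globalopt -instcombine
"""

if __name__ == "__main__":
    for number, passes in enumerate(split_pipelines(SAMPLE), start=1):
        print(number, " ".join(passes))
```

Note that the lines describe separate pipelines built in one process, not separate runs of opt.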