Documentation of flags affecting the performance of generated binaries


In 20543 [0], I requested documentation of the clang front-end flags that might affect program performance, since these are not covered in the docs [1]. Following a request in [0], I would like to start a discussion about:

What to include in, and exclude from, such a documentation page.

Once this is decided, the documentation can be written and submitted for further discussion and revision. It could go in [1] under “Language and Target-Independent Features” in the subsection “Controlling code generation”, or in a different subsection (options like -g, which might affect the performance of a generated binary, are already documented in their own subsections there).

Documentation that is currently missing in [1] and that in my opinion is relevant:

(All these points are debatable, and some debate has already happened in bugzilla; this is just the best starting point for a discussion that I could come up with.)

  1. Optimization levels: What optimization levels are there (O0-4, Ofast, Os, …)? What is their intended usage? Typically they are equivalent to passing a series of command line arguments to the compiler; which arguments does each optimization level pass?

  2. Description of the command line arguments that might affect the performance of the generated binary such as -flto, -fstrict-aliasing, -ffast-math, -fvectorize, -fvectorize-aggressive as well as those passed by the optimization levels. In particular:

  • What does each of these commands do?
  • Are they enabled by an optimization level? If so, by which ones?

I would also like to have documentation here for options that can affect performance but don’t affect the optimizer per se, such as -fassume-sane-operator-new or even -DNDEBUG (debatable, but it is a macro of special significance that controls, among other things, the behavior of assert, although it has nothing to do with the compiler itself).

  3. Controlling target-specific code generation: -march=native/-mtune=native and related options. I don’t know much about these and can’t find documentation about them anywhere in the clang docs.

  4. A mention of the existence of the LLVM-Opt, with a link to its documentation, a description of how to pass options to it directly, and a warning that doing this is not recommended and that these options are not officially supported by clang and can change at any time. It is debatable whether some of these deserve special mention, e.g., sadly -mllvm -inline-threshold=value can have a huge impact on performance.

  5. Document those “performance-related” flags that do nothing but are silently accepted to achieve, e.g., gcc/msvc compatibility.