Context
llvm-libc is intended to support a wide range of scenarios, from embedded devices where resources are scarce to demanding applications running on high-end servers where performance is paramount. The philosophy behind llvm-libc is that you can tailor it to your needs.
This document explores how the tuning can be implemented, what the pitfalls are, and how to mitigate them.
This post is really to get the conversation started around the topic of “tuning llvm libc”. Please don’t hesitate to share your thoughts / advice as this will help shape our guidelines.
1. Build Systems
llvm-libc currently supports building via CMake and Bazel. Customizing the build involves passing specific flags on the build system command line.
CMake
For CMake this is done by using the -D command line option. Project maintainers can add custom options with the option keyword. Some options prefixed by CMAKE_ have a special meaning and are targeted to CMake itself (e.g. -DCMAKE_BUILD_TYPE=Release).
To communicate these values down to the compiler, the maintainer can either explicitly add preprocessor definitions (target_compile_definitions) or auto-generate a header file containing the options they want to export (configure_file).
Bazel
The user can specify custom command line options (e.g., --@llvm-project//libc:mpfr=system). The presence of these flags can activate config_settings that are in turn used to modify the compilation process. The select keyword can be used to customize the target and change compilation flags, preprocessor definitions, target dependencies, etc.
2. The build options themselves
2.1 Declaration and naming
Build options have to be declared in the build configuration file.
The options should be scoped and named consistently. The relevant scope will usually be the C function name (e.g., printf), but if a feature is shared across a range of functions the scope can be a name precisely identifying this group (e.g. SINE_COSINE or MATH_TRIG).
- For CMake, feature configuration would be done by using the following template: LIBC_<FUNCTION>__<FEATURE>__<VERB>
  e.g. LIBC_PRINTF__INDEX_MODE__DISABLE
- For Bazel a similar scheme can be used.
  e.g., --@llvm-project//libc:printf__index_mode=disable
2.2 Orthogonality
Enabling a feature should not disable another one or lead to surprising behaviors. If two features are incompatible with each other, the compilation should stop early with a clear error.
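One way to enforce the "stop early with a clear error" rule is a preprocessor guard placed where the options are consumed. The following is a minimal sketch, not llvm-libc code: the two option names and the helper function are invented for illustration, and the defaults exist only so the snippet compiles on its own.

```cpp
// Hypothetical option names, invented for this sketch. In a real build they
// would be set by the build system on the compiler command line; the
// defaults below only keep the example self-contained.
#ifndef LLVMLIBC_SKETCH_OPTIMIZE_FOR_SIZE
#define LLVMLIBC_SKETCH_OPTIMIZE_FOR_SIZE 1
#endif
#ifndef LLVMLIBC_SKETCH_OPTIMIZE_FOR_SPEED
#define LLVMLIBC_SKETCH_OPTIMIZE_FOR_SPEED 0
#endif

// Incompatible features: stop the compilation early with a clear message
// instead of silently letting one option win over the other.
#if LLVMLIBC_SKETCH_OPTIMIZE_FOR_SIZE && LLVMLIBC_SKETCH_OPTIMIZE_FOR_SPEED
#error "LLVMLIBC_SKETCH_OPTIMIZE_FOR_SIZE and LLVMLIBC_SKETCH_OPTIMIZE_FOR_SPEED are mutually exclusive"
#endif

// Reports which profile was selected once the options passed the guard.
constexpr const char *sketch_selected_profile() {
  return LLVMLIBC_SKETCH_OPTIMIZE_FOR_SIZE    ? "size"
         : LLVMLIBC_SKETCH_OPTIMIZE_FOR_SPEED ? "speed"
                                              : "default";
}
```

The same check could equally live in the build system; doing it in the source has the advantage that it does not need to be replicated for CMake and Bazel.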
3. The Tuning
Tuning itself can be done at the build system level or within the compiler at the source code level.
3.1 At build system level
Based on user-provided options, the build system can modify the compiler command line to:
- change implementations by picking another source file,
- enable / disable target specific features (e.g. -mno-fma disables support for FMA),
- change optimization profile (for speed, for size),
- add preprocessor definitions and delegate customization to the compiler (see 3.2 below).
This is powerful but difficult to keep in sync between CMake and Bazel.
3.2 At source code level via the preprocessor
In section 2 above we focused on build system options; in this section we are talking about compiler preprocessor definitions. Although the CMake command line uses the -D syntax to set build options, these are not to be confused with the clang -D syntax.
To prevent conflating the two we suggest that preprocessor definitions start with LLVMLIBC_ instead of LIBC_.
A note on interactions between compiler flags and preprocessor
Some preprocessor definitions are provided by the build system; others are set by the compiler itself as a consequence of compiler options.
For instance, if the build system adds -mfma on the clang command line, clang will automatically define the __FMA__ preprocessor definition. Similarly, using the -DCMAKE_BUILD_TYPE=MinSizeRel CMake option will append the -Os flag that, in turn, will define the __OPTIMIZE_SIZE__ preprocessor definition.
We can arrange these compiler generated preprocessor definitions in two categories:
- The semantic of the flag precisely matches the semantic for llvm-libc (e.g. __FMA__ means “target cpu supports fma instructions”).
- The semantic of the flag is imprecise or does not fully represent the intent of the llvm-libc option (e.g. __OPTIMIZE_SIZE__ does not discriminate between scenarios like “optimize for size at all cost” and “optimize for size with reasonable speed”).
If the semantic is perfectly represented by the compiler generated preprocessor definition we can use it to perform conditional compilation. If not, the build system is responsible for setting additional preprocessor definitions with precise meaning and the conditional compilation should use these instead of the imprecise one.
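The two categories could be handled as in the sketch below. It is only an illustration under assumptions: LLVMLIBC_SKETCH_SIZE_POLICY, its three values, and every identifier prefixed with Sketch or sketch_ are hypothetical, and the default value is there only to keep the snippet self-contained.

```cpp
// Category 1: __FMA__ means exactly what the library needs to know, so it
// can be tested directly.
#if defined(__FMA__)
constexpr bool kSketchTargetHasFma = true;
#else
constexpr bool kSketchTargetHasFma = false;
#endif

constexpr const char *sketch_fma_strategy() {
  return kSketchTargetHasFma ? "hardware fma" : "software fallback";
}

// Category 2: __OPTIMIZE_SIZE__ is too coarse, so the build system sets a
// dedicated (hypothetical) definition with a precise meaning:
//   0 = favor speed, 1 = size with reasonable speed, 2 = size at all cost.
#ifndef LLVMLIBC_SKETCH_SIZE_POLICY
#define LLVMLIBC_SKETCH_SIZE_POLICY 0 // default only for this sketch
#endif

#if LLVMLIBC_SKETCH_SIZE_POLICY < 0 || LLVMLIBC_SKETCH_SIZE_POLICY > 2
#error "LLVMLIBC_SKETCH_SIZE_POLICY must be 0, 1 or 2"
#endif

enum class SketchSizePolicy {
  FavorSpeed = 0,
  SizeWithReasonableSpeed = 1,
  SizeAtAllCost = 2
};

constexpr SketchSizePolicy kSketchSizePolicy =
    static_cast<SketchSizePolicy>(LLVMLIBC_SKETCH_SIZE_POLICY);

// Example consumer: pick a loop unroll factor from the precise policy
// rather than from __OPTIMIZE_SIZE__.
constexpr int sketch_unroll_factor() {
  return kSketchSizePolicy == SketchSizePolicy::SizeAtAllCost             ? 1
         : kSketchSizePolicy == SketchSizePolicy::SizeWithReasonableSpeed ? 2
                                                                          : 8;
}
```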
A note on pitfalls of preprocessor
Conditional compilation based on preprocessor definitions is quite standard:
#ifdef LLVMLIBC_ABC
// This branch is compiled when LLVMLIBC_ABC is defined.
#else
// This branch is compiled when LLVMLIBC_ABC is undefined.
#endif
Unfortunately this can be brittle and hard to maintain. For instance, forgetting to rename one instance when refactoring, or making a typo, can lead to the wrong branch being compiled:
// LLVMLIBC_ABC renamed to LLVMLIBC_XXX in the codebase but this instance was forgotten.
#ifdef LLVMLIBC_ABC
// This branch is not compiled anymore as LLVMLIBC_ABC is undefined.
#else
// This code is compiled instead and may compile fine.
#endif
Also, as far as the preprocessor is concerned, an undefined identifier evaluates to 0 inside #if, so it compares equal to 0.
For example, the following code compiles just fine but the behavior is unexpected:
#if LLVMLIBC_FOO==0
// Taken even if LLVMLIBC_FOO is not set on the compiler's command line.
#elif LLVMLIBC_FOO==1
// Not taken as expected.
#endif
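The pitfall can be reproduced in a few lines. This is a minimal sketch in which LLVMLIBC_SKETCH_FOO is a hypothetical name that is deliberately never defined; the constant names are invented for illustration as well.

```cpp
// LLVMLIBC_SKETCH_FOO is intentionally left undefined in this sketch.
#if LLVMLIBC_SKETCH_FOO == 0
// Taken: inside #if, an undefined identifier is replaced by 0.
constexpr bool kSketchZeroBranchTaken = true;
#else
constexpr bool kSketchZeroBranchTaken = false;
#endif

// defined() is the only way to tell "undefined" apart from "defined to 0".
#if defined(LLVMLIBC_SKETCH_FOO)
constexpr bool kSketchFooDefined = true;
#else
constexpr bool kSketchFooDefined = false;
#endif

// The "== 0" branch was compiled even though the macro does not exist.
static_assert(kSketchZeroBranchTaken && !kSketchFooDefined,
              "an undefined macro compares equal to 0 in #if");
```

GCC and Clang both provide the -Wundef warning, which flags an undefined identifier evaluated in #if. It can catch this class of mistake, although it does not help with the #ifdef form shown earlier.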
The fact that the preprocessor runs before compilation makes it difficult to mitigate these problems.
- Solution 1: Mitigating with preprocessor only
Preprocessor definition checking within the preprocessor is quite verbose and really adds visual clutter.
#if defined(LLVMLIBC_PRINTF_DISABLE_INDEX_MODE) && (LLVMLIBC_PRINTF_DISABLE_INDEX_MODE == 0)
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is defined to 0
#elif defined(LLVMLIBC_PRINTF_DISABLE_INDEX_MODE) && (LLVMLIBC_PRINTF_DISABLE_INDEX_MODE == 1)
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is defined to 1
#else
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is undefined
#endif
- Solution 2: Mitigating with preprocessor only, two-steps solution
#ifndef LLVMLIBC_PRINTF_DISABLE_INDEX_MODE
#error "LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is undefined"
#endif
#if LLVMLIBC_PRINTF_DISABLE_INDEX_MODE == 0
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is defined to 0
#elif LLVMLIBC_PRINTF_DISABLE_INDEX_MODE == 1
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is defined to 1
#else
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is defined to something other than 0 or 1
#endif
- Solution 3: Mitigating with a mix of preprocessor and C++
We can define a consteval function to check that a string is exactly "0" or "1" and then evaluate the stringized version of the preprocessor definition.
LIBC_VALIDATE_BOOL_ENV(LLVMLIBC_PRINTF_DISABLE_INDEX_MODE);
#if LLVMLIBC_PRINTF_DISABLE_INDEX_MODE
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is defined to 1
#else
// LLVMLIBC_PRINTF_DISABLE_INDEX_MODE is defined to 0
#endif
Same for an integer preprocessor definition
LIBC_VALIDATE_UINT_ENV(LLVMLIBC_FILE_BUFFER_SIZE);
Note: there is still the possibility of a typo on the #if line, but we expect that modern editors’ syntax highlighting will help catch such errors, e.g.,
LIBC_VALIDATE_BOOL_ENV(ABC);
#if ABCD
...
This is the preferred solution as the check stays next to the usage. It also reduces visual clutter to a minimum. The only caveat here is that it requires C++20.
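The definition of LIBC_VALIDATE_BOOL_ENV is not shown in this post, so here is one possible shape for it, written as a sketch under assumptions. All SKETCH_ and sketch_ names are hypothetical. The sketch swaps consteval for a constexpr function checked through static_assert, which gives the same compile-time failure without depending on C++20. The default value stands in for a -D flag on the compiler command line.

```cpp
// Two-level stringization so that the macro argument is expanded first.
#define SKETCH_STRINGIZE_IMPL(x) #x
#define SKETCH_STRINGIZE(x) SKETCH_STRINGIZE_IMPL(x)

// True only for the exact strings "0" and "1".
constexpr bool sketch_is_bool_literal(const char *s) {
  return (s[0] == '0' || s[0] == '1') && s[1] == '\0';
}

// An undefined macro stringizes to its own name, an empty one to "", and
// both fail the check, so the build stops with a readable message.
#define SKETCH_VALIDATE_BOOL_ENV(name)                                         \
  static_assert(sketch_is_bool_literal(SKETCH_STRINGIZE(name)),                \
                #name " must be defined to exactly 0 or 1")

// Stand-in for -DLLVMLIBC_PRINTF_DISABLE_INDEX_MODE=1 on the command line.
#ifndef LLVMLIBC_PRINTF_DISABLE_INDEX_MODE
#define LLVMLIBC_PRINTF_DISABLE_INDEX_MODE 1
#endif

SKETCH_VALIDATE_BOOL_ENV(LLVMLIBC_PRINTF_DISABLE_INDEX_MODE);

#if LLVMLIBC_PRINTF_DISABLE_INDEX_MODE
constexpr bool kSketchIndexModeEnabled = false;
#else
constexpr bool kSketchIndexModeEnabled = true;
#endif
```

Because the check is a plain declaration, it can sit directly above the #if that consumes the definition, which is the property the preferred solution relies on.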
3.2.1 Setting constants
A preprocessor definition can be used to define constant libc quantities like thread stack size or file buffer size. e.g.,
clang -DLLVMLIBC_THREAD__STACK_SIZE=4096
The developer should use the LIBC_VALIDATE_UINT_ENV macro to make sure that the preprocessor definition is set and valid before using it.
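As with the boolean case, the post does not show how LIBC_VALIDATE_UINT_ENV is defined. The sketch below is one hypothetical way to do it: every SKETCH_ and sketch_ name is invented, the check accepts decimal digits only (no sign, suffix, or hexadecimal form), and the default value stands in for the -D flag shown above.

```cpp
#include <cstddef>

// Two-level stringization so that the macro argument is expanded first.
#define SKETCH_UINT_STR_IMPL(x) #x
#define SKETCH_UINT_STR(x) SKETCH_UINT_STR_IMPL(x)

// True if every character up to the terminator is a decimal digit.
constexpr bool sketch_all_digits(const char *s) {
  return *s == '\0' ? true
                    : (*s >= '0' && *s <= '9' && sketch_all_digits(s + 1));
}

// True for a non-empty string made of decimal digits only.
constexpr bool sketch_is_uint_literal(const char *s) {
  return *s != '\0' && sketch_all_digits(s);
}

#define SKETCH_VALIDATE_UINT_ENV(name)                                         \
  static_assert(sketch_is_uint_literal(SKETCH_UINT_STR(name)),                 \
                #name " must be defined to an unsigned decimal integer")

// Stand-in for -DLLVMLIBC_THREAD__STACK_SIZE=4096 on the command line.
#ifndef LLVMLIBC_THREAD__STACK_SIZE
#define LLVMLIBC_THREAD__STACK_SIZE 4096
#endif

SKETCH_VALIDATE_UINT_ENV(LLVMLIBC_THREAD__STACK_SIZE);

// Only after validation is the definition turned into a typed constant.
constexpr std::size_t kSketchThreadStackSize = LLVMLIBC_THREAD__STACK_SIZE;
```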
3.2.2 Conditional code
The intent here is to allow some features to be disabled or some implementations to be replaced by alternatives by using the preprocessor directives (#if, #else, #endif, etc.). This is best done by using boolean preprocessor definitions.
The developer should use the LIBC_VALIDATE_BOOL_ENV macro to make sure that the preprocessor definition is set and valid.
3.2.3 Conditional file inclusion
There are several ways of performing conditional file inclusion using a combination of build system and preprocessor directives.
It is unclear which one stands out, so we list them here in arbitrary order with their pros and cons.
- Selection with the build system
  The build system language is leveraged to select the file to compile for a certain set of constraints. It seems like the appropriate approach for completely different implementations where there are no obvious customization points and a straightforward selection logic.
  e.g. Pick between generic/sqrt.cpp or x86_64/sqrt.cpp or aarch64/sqrt.cpp depending on target architecture.
  Pros
  - No #if, #else, #endif in the source file.
  - The dispatch logic can use high level build options that are not visible at source code level.
  Cons
  - The same selection logic should be replicated and kept in sync between the build systems.
  - The selection logic is not visible in the source code; this raises the bar for contributors and maintainers, who would also need to understand both build systems.
- Selection with conditional include
  Here the preprocessor will pull the relevant code for the compiler based on preprocessor definitions.
  LIBC_VALIDATE_BOOL_ENV(LLVMLIBC_MATH_TRIG__SMALL_AND_IMPRECISE);
  #if LLVMLIBC_MATH_TRIG__SMALL_AND_IMPRECISE
  #include "src/math/trig_imprecise.inl"
  #else
  #include "src/math/trig_precise.inl"
  #endif
  Pros
  - The selection logic is visible in the source code.
  Cons
  - For build hermeticity, the build system may need to know about the file dependencies; if so, the logic will have to be replicated in the build system.
- Selection with conditional code
  Another possibility is to include all alternatives in the main source file and do the selection using the preprocessor at the implementation site.
  In the main source file, all alternatives are listed.
  #include "src/math/trig_imprecise.inl"
  #include "src/math/trig_precise.inl"
  In the implementation file the content is selectively enabled / disabled.
  Content of src/math/trig_imprecise.inl
  #include "src/__support/common.h" // LIBC_VALIDATE_BOOL_ENV
  LIBC_VALIDATE_BOOL_ENV(LLVMLIBC_MATH_TRIG__SMALL_AND_IMPRECISE);
  #if LLVMLIBC_MATH_TRIG__SMALL_AND_IMPRECISE==1
  // Implementation of small but imprecise trigonometric functions goes here
  #endif
  Content of src/math/trig_precise.inl
  #include "src/__support/common.h" // LIBC_VALIDATE_BOOL_ENV
  LIBC_VALIDATE_BOOL_ENV(LLVMLIBC_MATH_TRIG__SMALL_AND_IMPRECISE);
  #if LLVMLIBC_MATH_TRIG__SMALL_AND_IMPRECISE==0
  // Implementation of correct trigonometric functions goes here
  #endif
  Pros
  - All logic is in the code and all implementations are listed in the build system regardless of the selection mechanism.
  Cons
  - The selection logic is spread amongst implementations, which is hard to read and maintain.
Thank you for reading so far. Please let me know what you think.