Introduction.
Floating-point computations have several mathematical properties that can produce unexpected results even when the code does nothing wrong. Examples include loss of precision, accumulated rounding errors, and numerical artifacts related to special values (NaNs, infinities).
The RFC D97854, "[RFC][nsan] A Floating-point numerical sanitizer" (Clement Courbet), introduced a relatively simple compiler-based instrumentation (arXiv:2102.12782, "NSan: A Floating-Point Numerical Sanitizer") that automatically tracks precision by shadowing computations with higher-precision types. The technique has been demonstrated on several examples and tests.
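To make the shadowing idea concrete, here is a minimal hand-written sketch. It is not how NSan is implemented (the sanitizer instruments code in the compiler rather than requiring a wrapper type); the `Shadowed` type and the `MakeShadowed`, `RelativeError` and `Diverged` helpers are hypothetical names introduced only for illustration. Each `float` value carries a `double` shadow that undergoes the same operations, and the two are compared at a check point.

```cpp
#include <cmath>
#include <limits>

// Hypothetical illustration type: a float paired with a double "shadow"
// that mirrors every operation applied to the float.
struct Shadowed {
  float native;
  double shadow;
};

inline Shadowed MakeShadowed(float v) {
  return {v, static_cast<double>(v)};
}

// Each arithmetic operation is performed twice: once in float, once in double.
inline Shadowed operator+(Shadowed a, Shadowed b) {
  return {a.native + b.native, a.shadow + b.shadow};
}

inline Shadowed operator-(Shadowed a, Shadowed b) {
  return {a.native - b.native, a.shadow - b.shadow};
}

// Relative distance between the float result and its shadow.
inline double RelativeError(Shadowed v) {
  if (v.shadow == 0.0) {
    return v.native == 0.0f ? 0.0 : std::numeric_limits<double>::infinity();
  }
  return std::fabs(static_cast<double>(v.native) - v.shadow) /
         std::fabs(v.shadow);
}

// A check point: true if the float result drifted beyond `tolerance`.
// The tolerance is an assumption of this sketch, chosen by the caller.
inline bool Diverged(Shadowed v, double tolerance) {
  return RelativeError(v) > tolerance;
}
```

For instance, `(MakeShadowed(16777216.0f) + MakeShadowed(1.0f)) - MakeShadowed(16777216.0f)` evaluates to 0 in `float` but to 1 in the shadow, so `Diverged` flags it for any tolerance below 1.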
Example.
Consider the classical summation problem:

    #include <cstdint>
    #include <cstdio>
    #include <random>
    #include <vector>

    // Naive summation: rounding errors accumulate with every addition.
    template <typename T>
    T NaiveSum(const std::vector<T>& values) {
      T sum = 0;
      for (T v : values) {
        sum += v;
      }
      return sum;
    }

    // Kahan's (compensated) summation.
    // https://en.wikipedia.org/wiki/Kahan_summation_algorithm
    template <typename T>
    T KahanSum(const std::vector<T>& values) {
      T sum = 0;
      T c = 0;  // Running compensation for lost low-order bits.
      for (T v : values) {
        T y = v - c;
        T t = sum + y;
        c = (t - sum) - y;
        sum = t;
      }
      return sum;
    }

    int main() {
      using FLT = float;
      std::vector<FLT> values;
      constexpr int kNumValues = 1000000;
      values.reserve(kNumValues);
      // Using a fixed seed to avoid flakiness.
      constexpr std::uint32_t kSeed = 0x123456;
      std::mt19937 gen(kSeed);
      std::uniform_real_distribution<FLT> dis(0.0f, 1000.0f);
      for (int i = 0; i < kNumValues; ++i) {
        values.push_back(dis(gen));
      }
      const auto trueSum = KahanSum(values);
      std::printf("true sum: %.8f\n", trueSum);
      const auto sum = NaiveSum(values);
      std::printf("sum: %.8f\n", sum);
      return 0;
    }

Running this program under the sanitizer reports the precision loss in NaiveSum:

    WARNING: NumericalStabilitySanitizer: inconsistent shadow results while checking return value
    float precision (native): dec: 500093664.00000000000000000000 hex: 0x1.dced2e00000000000000p+28
    double precision (shadow): dec: 500119719.80826514959335327148 hex: 0x1.dcf38a7ceea770000000p+28
    shadow truncated to float : dec: 500119719.80826514959335327148 hex: 0x1.dcf38a7ceea770000000p+28
    Relative error: 0.00520991419317334923% (2^9 epsilons)
    Absolute error: 0x1.971f3ba9dc0000000000p+14
    (814 ULPs == 2.9 digits == 9.7 bits)
    #0 0x55ab59fe0b48 in float NaiveSum<float>(std::vector<float, std::allocator<float> > const&) (llvm-project/build/sum.exe+0x3cb48)
    #1 0x55ab59fdf915 in main (llvm-project/build/sum.exe+0x3b915)
    #2 0x7f0db56216c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #3 0x7f0db5621784 in __libc_start_main csu/../csu/libc-start.c:360:3
    #4 0x55ab59fad5c0 in _start
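
The error figures in the report can be reproduced from the native and shadow values alone. The sketch below is an illustration, not the sanitizer's own code: `ErrorReport` and `CompareToShadow` are hypothetical names, and it assumes that ULPs are measured as the spacing between adjacent `float` values at the magnitude of the shadow value, and that the shadow is finite and nonzero.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not part of NSan): how far a native float result
// is from its higher-precision shadow.
struct ErrorReport {
  double absolute;          // |shadow - native|
  double relative_percent;  // absolute / |shadow|, in percent
  double ulps;              // absolute error in float ULPs near the shadow
  double bits;              // log2(ulps)
  double digits;            // log10(ulps)
};

// Assumes `shadow` is finite and nonzero. If the error is zero, `bits` and
// `digits` come out as negative infinity.
inline ErrorReport CompareToShadow(float native, double shadow) {
  ErrorReport r;
  r.absolute = std::fabs(shadow - static_cast<double>(native));
  r.relative_percent = 100.0 * r.absolute / std::fabs(shadow);
  // Spacing between adjacent floats near `shadow`: 2^(exponent - 23).
  const int mantissa_bits = std::numeric_limits<float>::digits - 1;  // 23
  const double ulp = std::ldexp(1.0, std::ilogb(shadow) - mantissa_bits);
  r.ulps = r.absolute / ulp;
  r.bits = std::log2(r.ulps);
  r.digits = std::log10(r.ulps);
  return r;
}
```

With the values from the report, `CompareToShadow(500093664.0f, 500119719.80826514959335327148)` gives an absolute error of about 26055.8, which is about 0.00521% of the shadow value. Adjacent floats in the range [2^28, 2^29) are 32 apart, so the error spans about 814 ULPs, or roughly 9.7 bits and 2.9 decimal digits.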
Proposition.
https://github.com/llvm/llvm-project/pull/85916 reintroduces D97854 ("[RFC][nsan] A Floating-point numerical sanitizer"), largely following the practices of the existing sanitizers. Applications of this tool include testing specialized software that implements numerical algorithms and debugging numerical computations. It might also help trace portability issues between different platforms.
Implementation.
The implementation naturally follows the existing framework for sanitizers and includes:
- LLVM: instrumentation pass.
- compiler-rt: new helper routines that perform checks / expose shadow values.
- Clang: minimal codegen changes to add a function attribute and minimal driver changes.
Potential extensions.
As with the thread sanitizer, the proposed instrumentation is not specific to C/C++ and can be integrated with other LLVM-based toolchains (e.g. Flang, Swift, Rust).
cc: @vitalybuka, @legrosbuffle, @echristo, @arsenm, @andykaylor, @efriedma-quic