Why is the output different for the program below when it is compiled using clang with fast-math optimization?
#include<stdio.h>
int main() {
double d = 1.0;
double max = 1.79769e+308;
d /= max;
printf("d:%e:\n", d);
d *= max;
printf("d:%e:\n", d);
return 0;
}
It prints 0 with fast math but 1 without fast math.
Please do not ask llvm-dev to do your homework. If this is genuinely not a school assignment, reply to me off-list and I’ll help you understand what’s happening here.
– Steve
This is of course not homework. I am trying to understand how fast-math optimizations work in LLVM. When I compared the IR for both builds, the only difference I noticed is that fdiv and fmul are replaced with fdiv fast and fmul fast. I am not sure what fdiv fast and fmul fast actually do.
My guess is that d/max is a really small number, and fast-math does not care about small numbers and treats them as zero, but that seems so incorrect.
[llvm-dev to BCC]
What transformations are licensed by fast-math? Specifically, what is LLVM allowed to do with fdiv fast?
– Steve
Hi,
> My guess is that d/max is a really small number, and fast-math does not care about small numbers and treats them as zero, but that seems so incorrect.
Which version of Clang are you using? I see 1.0 in all cases I've
tested, which is not terribly surprising. There isn't really much for
fast-math to latch onto in the example, all the compiler might do is
evaluate the constant result, and there's no reason to do it
imprecisely.
Cheers.
Tim.
$ clang -v
clang version 7.0.0 (trunk 336308)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ cat fmath.c
#include<stdio.h>
int main() {
double d = 1.0;
double max = 1.79769e+308;
d /= max;
d *= max;
printf("d:%e:\n", d);
return 0;
}
$ clang fmath.c
$ ./a.out
d:1.000000e+00:
$ clang -ffast-math fmath.c
$ ./a.out
d:0.000000e+00:
The generated code doesn't actually change between the two variants in this case. The actual difference here happens during the link step.
Specifying "-ffast-math" on the clang invocation that spawns the linker causes "crtfastmath.o" to get linked (Clang ToolChain::AddFastMathRuntimeIfAvailable).
crtfastmath.o puts the CPU in FTZ+DAZ mode (FTZ: subnormal results are flushed to zero; DAZ: subnormal inputs are treated as zero).
1.0 / max in your program is a subnormal value, which hence gets flushed to zero.
If you don't want this to happen, don't specify -ffast-math on the clang invocation that calls the linker:
[fabiang@fabiang-pc-i7 flttest]$ clang -ffast-math fmath.c && ./a.out
d:0.000000e+00:
[fabiang@fabiang-pc-i7 flttest]$ clang -c -ffast-math fmath.c -o fmath.o && clang fmath.o && ./a.out
d:1.000000e+00:
More generally, if you want special values such as infinities, NaNs, and subnormals, as well as underflow/overflow conditions, to be handled precisely (with full IEEE compliance), _don't use -ffast-math_.
You're explicitly opting out of compliance here, and yes, sometimes the differences are substantial.
-Fabian