pow operator on Windows

I have a very simple test case on Windows that shows some surprising behavior. This doesn't seem to be a problem on Linux.

The example is:
#include <stdio.h>
#include <math.h>
double heat(double Pr) {
  return pow(Pr, 0.33);
}
int main(int argc, char **argv) {
  double Nu = heat(291.00606180486119);
  printf("%.20f\n", Nu);
  return 0;
}

I've tested with MinGW's gcc.exe 3.4.5, the Visual Studio 2008 C++ compiler, and LLVM 2.7 and 2.8.
With "gcc test.c; ./a.exe", the printed result is 6.50260946378542390000
With "clang -emit-llvm -c test.c -o test.bc; lli test.bc", the printed result is 6.50260946378542480000

The difference in the last 2 digits is significant in the scientific software I'm working on. Does anyone have an explanation for why they differ, or how to make them behave the same?

Thanks,
Michael Smith

Both answers are within one double-precision ulp of the true answer,
which is roughly 6.5026094637854243. So both answers are acceptable
return values for pow(); normal implementations do not guarantee
half-ulp accuracy for transcendental functions like pow(). My guess
is that lli and the gcc-compiled program are using different versions
of the C runtime library, and therefore different pow()
implementations. If you really care about floating-point calculations
being precisely reproducible across platforms, I would suggest using a
library like mpfr instead of the compiler's floating-point
implementation.
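
To make the "one ulp" claim concrete, here is a small check (my own addition, not part of the original exchange) that the two printed values are adjacent doubles; nextafter() returns the neighbouring representable value:

#include <stdio.h>
#include <math.h>
int main(void) {
  double gcc_result = 6.5026094637854239; /* printed by the gcc build */
  double lli_result = 6.5026094637854248; /* printed by lli */
  /* If lli_result is the very next representable double above
     gcc_result, the two results differ by exactly one ulp. */
  printf("one ulp apart: %s\n",
         nextafter(gcc_result, INFINITY) == lli_result ? "yes" : "no");
  printf("ulp size here: %g\n",
         nextafter(gcc_result, INFINITY) - gcc_result);
  return 0;
}

And for the mpfr suggestion, a minimal sketch (the 200-bit working precision and the variable names are my choices; link with -lmpfr -lgmp):

#include <stdio.h>
#include <gmp.h>
#include <mpfr.h>
int main(void) {
  mpfr_t base, exponent, result;
  mpfr_inits2(200, base, exponent, result, (mpfr_ptr) 0);
  /* Start from the same double inputs the C test case uses... */
  mpfr_set_d(base, 291.00606180486119, MPFR_RNDN);
  mpfr_set_d(exponent, 0.33, MPFR_RNDN);
  /* ...and compute pow correctly rounded to 200 bits, which gives the
     same answer on every platform, independent of the C runtime. */
  mpfr_pow(result, base, exponent, MPFR_RNDN);
  mpfr_printf("%.20Rf\n", result);
  mpfr_clears(base, exponent, result, (mpfr_ptr) 0);
  return 0;
}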

-Eli

[...]

With "gcc test.c; ./a.exe", the printed result is 6.50260946378542390000
With "clang -emit-llvm -c test.c -o test.bc; lli test.bc", the printed result is 6.50260946378542480000

The x86 has several different FPU instruction sets, which don't all
work at the same precision. In particular, the 387 (x87) instructions
keep temporaries in 80-bit floats, while SSE instructions use 64-bit
floats.

gcc's default choice on 32-bit x86 is 387, so if clang is defaulting
to SSE (as most compilers do these days, because the 387 instructions
are horrible), this might explain the results you get.
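
A small experiment (my own illustration, not from the thread) that can expose the difference in intermediate precision:

#include <stdio.h>
int main(void) {
  double a = 1.0e16, b = 1.0;
  /* 1.0e16 + 1.0 is not representable as a 64-bit double (the spacing
     between doubles at this magnitude is 2), but it is exact in an
     80-bit x87 register. If the temporary a + b is kept at 80 bits,
     c is 1.0; if it is rounded to 64 bits first (SSE, or a spill to
     memory), c is 0.0. */
  double c = (a + b) - a;
  printf("%g\n", c);
  return 0;
}

Whether "gcc -mfpmath=387" and "gcc -msse2 -mfpmath=sse" actually print different values here depends on the optimization level, on register spills, and on the x87 control word (the Windows CRT sets the x87 to 53-bit precision by default), so treat this as a probe rather than a guarantee.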

You might want to look at the generated machine code to see how they
differ. If this *is* the problem, you can tell gcc to use a particular
instruction set with -mfpmath=386 or -mfpmath=sse.

...

You might want to look at the generated machine code to see how they
differ. If this *is* the problem, you can tell gcc to use a particular
instruction set with -mfpmath=386 or -mfpmath=sse.

I think you mean -mfpmath=387 instead. :-)

Btw, this option is also not supported by clang... any idea how it could
be implemented, if at all?

Shouldn't be that hard for 32-bit x86, since -mattr=-sse already works
when passed to llc (but not clang), so you should be able to just copy
that code.
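
For example, something like "llc -march=x86 -mattr=-sse test.bc -o test.s" should emit x87-only code for the 32-bit target (the flag spellings here are from memory and vary across LLVM versions, so check llc -help).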

For x86-64 it's trickier, as the calling convention uses SSE registers
so -mattr=-sse ICEs when returning floats etc. ("LLVM ERROR: SSE
register return with SSE disabled").
This would need either
a) a more selective option to disable SSE *math* only (but allow use
of SSE registers for parameters and return values), or
b) change the calling convention to use x87 registers instead, which
would require recompiling anything that accepts or returns
floating-point numbers, including printf(), sqrt() and friends.

This doesn't look related to llc's instruction selection. When I generate assembly using either llc or clang, they both call _pow; if I then compile that assembly with gcc, the results are identical to compiling the C source with gcc.

So Eli's response that lli is using a different library than gcc and Visual Studio looks correct. The only option then is to use another library instead of llvm.pow.f64 - in my case this is easy, because I'm generating LLVM bitcode directly rather than using clang.
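
A rough sketch of that substitution in LLVM assembly (my own illustration, not from the thread), calling an explicitly chosen function instead of llvm.pow.f64; @my_pow is a hypothetical reproducible pow implementation you would provide and link in, and the hex constant is the IEEE-754 double encoding of 0.33:

declare double @my_pow(double, double)   ; hypothetical: your own reproducible pow

define double @heat(double %Pr) {
entry:
  ; Calling a named external function instead of the llvm.pow.f64
  ; intrinsic pins the computation to one implementation, regardless
  ; of which C runtime lli or the native binary happens to use.
  %Nu = call double @my_pow(double %Pr, double 0x3FD51EB851EB851F) ; 0.33
  ret double %Nu
}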