and compiling it with “clang -O3 …”, I was trying to determine what it would take to get the X86 code generator to replace the call to sqrt with an inline sqrtsd instruction.
It turns out that it could do exactly that, were it not for the fact that in the function visitUnaryFloatCall() at line 5514 in SelectionDAGBuilder.cpp, the result of
!I.onlyReadsMemory()
is true, so the code is unable to replace the function call with an ISD::FSQRT SDNode. If I remove the above test, then the compiler will emit a sqrtsd instruction.
I am hoping that someone might be able to comment on what onlyReadsMemory is supposed to be doing in general and why it is returning false in this case.
Yes, in both GCC and Clang. Clang does have some annoying logic bugs
surrounding this flag though. For example, setting -fno-fast-math would
imply no-math-errno, overriding the Linux default. Quite weird. I've
cleaned this up some and added clearer tests in r182203.
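For readers following along, here is a rough C sketch of why the errno setting matters for the onlyReadsMemory() check. When math-errno is in effect, sqrt may write errno on a domain error, so the call cannot be treated as one that only reads memory. The names hw_sqrt and sqrt_with_errno are hypothetical, the inline asm assumes x86-64 with GCC or Clang, and this is not how libm actually implements sqrt.

```c
#include <errno.h>

/* Hypothetical helper: the bare hardware square root.
 * On x86-64 this is a single sqrtsd instruction. */
static double hw_sqrt(double x) {
#if defined(__x86_64__)
    double r;
    __asm__("sqrtsd %1, %0" : "=x"(r) : "x"(x));
    return r;
#else
    /* Fallback for other targets; may still end up calling libm. */
    return __builtin_sqrt(x);
#endif
}

/* Hypothetical model of an errno-setting sqrt: the store to errno is
 * the side effect that stops the call from being "read only". */
double sqrt_with_errno(double x) {
    if (x < 0.0)
        errno = EDOM;   /* memory write on a domain error */
    return hw_sqrt(x);  /* NaN for negative inputs */
}
```

With -fno-math-errno the compiler is allowed to assume the errno store never needs to happen, which is what lets the whole call collapse to the single instruction.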
However, there still seems to be a problem in that if you pass -ffast-math to clang, then clang changes “sqrt” to be “__sqrt_finite”. LLVM cannot then change the function call into an x86 sqrt instruction, even with -fno-math-errno set.
Can you suggest where I might look in the clang code to find the place where “sqrt” is converted to “__sqrt_finite” and/or the best way to solve this problem?
This sounds like your system headers are trying to outsmart the compiler; clang doesn't generate calls to __sqrt_finite anywhere. We may have to recognize the pattern in LLVM or clang if we want to inline calls to sqrt. A first step would be to figure out where the headers are doing this and whether there's a way to disable it.
If I do not use -ffast-math, then the generated code calls "sqrt". If I do use -ffast-math, then the code calls __sqrt_finite.
The use of -ffast-math seems to result in gcc's math.h including /usr/include/x86_64-linux-gnu/bits/math-finite.h, which (I am guessing!) redefines sqrt as "__sqrt_finite".
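A header can rename a function like this without any help from the compiler proper by attaching an asm label to the declaration. As far as I understand it, that is the mechanism math-finite.h relies on, but treat the following as a minimal sketch only. It assumes an ELF target such as Linux (no symbol prefix) and GCC or Clang. All demo_* names, finite_calls and caller are made up for illustration, and the stand-in body does not compute a square root.

```c
/* Counts calls so the redirect is observable. */
int finite_calls = 0;

/* Hypothetical stand-in for a "finite only" libm entry point. */
double demo_sqrt_finite(double x) {
    finite_calls++;
    return x;  /* placeholder result, not a real square root */
}

/* The source-level name is demo_sqrt, but the asm label tells the
 * compiler to emit references to the symbol demo_sqrt_finite. */
extern double demo_sqrt(double) __asm__("demo_sqrt_finite");

/* Looks like a call to demo_sqrt; the object code calls
 * demo_sqrt_finite instead. */
double caller(double x) {
    return demo_sqrt(x);
}
```

Because the rename happens at the declaration, the frontend only ever sees a call to the redirected symbol, which would explain why the optimizer no longer recognizes it as sqrt.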