The problem is that this *does* break existing code.
While I initially flagged the problem when I hit ambiguities with '__fp16', that was simply because every 6 months we align our out-of-tree implementation with the forthcoming formal numbered branch. In this case I was updating from v3.7.1 to v3.8.
It is our intention to start tracking head, but we're not there yet, or I would have caught this a lot earlier in its development.
When I am stabilising after one of these big-bang updates, I first address all the compile-time regressions in our test-suites. The '__fp16' ambiguity simply revealed to me that the ISO C names were now overloaded at the global namespace, even though '__fp16' is not meaningful to ISO C++.
But since then, I have been debugging runtime regressions, and these are far more difficult to debug.
A lot of the real-world application of our processor is math intensive, so the math libraries get exercised heavily. But performance is also critical, and programmers trade off between 'float' and 'double' quite often for a variety of reasons. Many of these math applications started off as FORTRAN and were then carefully ported and tuned to C. It is with great difficulty that we try to get programmers to use C++, and when they do, it is usually so that they can get better abstraction for their existing code (encapsulation and template collections mostly).
The main reasons that programmers chose between 'float' and 'double' (assume 'float' is IEEE FP32 and 'double' is IEEE FP64) are:
o Precision - when the problem needs to be more precise, they use 'double'
o Dynamic Range - when the numbers are very large, they use 'double'
o Space - large data sets take up a lot of memory, and programmers often
choose to compromise precision for space by using 'float'
o Performance - on systems where FP32 arithmetic is more performant than
FP64 arithmetic, programmers will sometimes sacrifice the precision of
'double' for the performance of 'float'
I don't see a lot of programmers using FP128 'long double'. I am not sure why, but I guess the loss of performance versus the gain in precision and dynamic range is not a viable trade-off for their programs. We also have extensive use of FP16, where precision and dynamic range take a back seat to raw "good enough" performance, and probably space too.
But I will illustrate the kind of real-world code (RWC) that is broken by overloading the ISO C names in the global namespace. The following is a reduction of a real example that regressed at runtime with this change; it is from a modelling library and is optimised for FP32. I have culled the actual code and left just the control-flow logic:
// Inputs: float inputA; float inputB;
double upperRange = pow(inputA, inputB);
if (isnan(upperRange))
  handleNanErrors(inputA, inputB);
else if (isinf(upperRange))
  handleOutOfRangeErrors(inputA, inputB);
else if ((upperRange <= __FLT_MAX__) && (upperRange >= __FLT_MIN__))
  useFP32OptimisedModel(upperRange, inputA, inputB);
else
  useHighDynamicRangeModel(upperRange, inputA, inputB);
The observed problem is that the "useHighDynamicRangeModel" implementation was never being executed, and instead "handleOutOfRangeErrors" was being called far more often than previously.
When I analysed this, the problem came down to the overloading of '::pow'. With ISO C, the function 'pow' takes two 'double' arguments and returns a value of type 'double'. The two input operands are 'float', so they are first promoted to 'double', and then 'pow(double, double)' is called, which yields a value of type 'double'.
With the overloading at the global namespace, the function '::pow(float,float)' is called, which in turn calls '::powf(float,float)' yielding a 'float' value which is then promoted to 'double' to initialise 'upperRange'.
But 'float' does not have the dynamic range of 'double', and many pairs of input operands of type 'float' yield results that exceed the dynamic range of 'float', so 'pow(float,float)' returns INFINITY, which is retained on promotion to 'double'. This in turn makes 'isinf(upperRange)' evaluate to true when the value exceeds the dynamic range of FP32, and not when it exceeds the dynamic range of FP64 as was intended and expected by the program.
The original code was compliant ISO C that was then migrated to C++, but as with the majority of legacy C programs, the changes made to operate with C++ were very minimal.
You could say "just put a cast to 'double' before the input operands" - and this will of course work in this case. But it took me a few hours to find this problem, buried as it was in the actual code, and of course I was blaming our target lowering as the probable cause until I realised that it was not a code-generation bug. And I expect that related problems are going to be liberally strewn throughout legacy C code that is migrated or partially migrated to C++, in ways that will be very difficult to detect.
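For what it's worth, the cast workaround applied to the reduced example would look like this ('computeUpperRange' is a hypothetical wrapper name for illustration, not from the original code):

```cpp
#include <cmath>

// Restore the ISO C behaviour by promoting explicitly, so that
// overload resolution can only select pow(double, double).
double computeUpperRange(float inputA, float inputB) {
    return std::pow(static_cast<double>(inputA),
                    static_cast<double>(inputB));
}
```

This keeps the computation in FP64, so 'isinf' once again reports overflow of the FP64 range rather than the FP32 range - but finding every affected call site in a large legacy code base is the hard part.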
In my opinion, this was never the intent of the C++ Standard Committee, and I think that it is also an undesirable interpretation of the words in the Standard that will only serve to further alienate C programmers from using C++; and this is already a difficult task in the embedded programming space, where the majority of C programmers already hear enough FUD about C++ and performance.
I am attaching a patch at this stage because I think that this is a significant enough issue for LibC++ v3.8 that it warrants urgent attention. I am also aware that it will probably not get resolved unless somebody does some work to make it happen, and I have done some of that work, which may serve as a starting point for others (the LibC++ maintainers) to complete. I am not experiencing failures in the LibC++ test-suite from these changes, with the exception of the 'depr.c.headers' test that I already mentioned, and I have also not seen any failures in modules.
And these changes have addressed the regressions in the RWC that I use for verification.
While I admit the changes are not the ideal solution, I think that they represent an adequate, minimalistic change to the sources that will offset the majority of the problems. I am already working on a rewrite of '<cmath>'/'<math.h>' that I think will produce a better solution, one that addresses the objectives of C++ while retaining the intentions of ISO C compatibility - and of course the other ISO C wrapper headers too. But I won't have this done in time for the v3.8 code freeze.
All the best,
MartinO