Hi Richard,
I think you're seeing a change in the backend (in combination with an older frontend change) -- perhaps the backend change is when we added fast-math flags to fcmp?
Here's what happens:
Given some code like this:
$ cat /tmp/c.c
typedef _Complex double dc;
dc foo(dc a, dc b) {
  return a*b;
}
Compiling it produces IR that looks like this:
define { double, double } @foo(double %a.coerce0, double %a.coerce1, double %b.coerce0, double %b.coerce1) #0 {
entry:
...
[perform the fast code]
%isnan_cmp = fcmp fast uno double %mul_r, %mul_r
br i1 %isnan_cmp, label %complex_mul_imag_nan, label %complex_mul_cont, !prof !1
complex_mul_imag_nan: ; preds = %entry
%isnan_cmp1 = fcmp fast uno double %mul_i, %mul_i
br i1 %isnan_cmp1, label %complex_mul_libcall, label %complex_mul_cont, !prof !1
complex_mul_libcall: ; preds = %complex_mul_imag_nan
%call = call { double, double } @__muldc3(double %a.real, double %a.imag, double %b.real, double %b.imag) #1
%4 = extractvalue { double, double } %call, 0
%5 = extractvalue { double, double } %call, 1
br label %complex_mul_cont
complex_mul_cont: ; preds = %complex_mul_libcall, %complex_mul_imag_nan,
...
So we always do the fast calculation first, and fall back to the slow calculation (the __muldc3 libcall) only if both the real and imaginary parts of the result come out as NaN. But because of the fast-math flags on the fcmp instructions, the backend is allowed to assume there are no NaNs, so it can constant fold those comparisons and eliminate that entire set of branches, leaving only the fast code.
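To make the control flow concrete, here is a rough C rendering of what that IR does. This is only a sketch: mul_fast_then_check is a hypothetical name I made up for illustration, and the plain a * b on the slow path stands in for the __muldc3 call that the compiler emits itself.

```c
#include <complex.h>
#include <string.h>

typedef _Complex double dc;

/* Hypothetical helper, not part of clang or compiler-rt: a C rendering of
   the control flow in the IR above. */
dc mul_fast_then_check(dc a, dc b) {
    double ar = creal(a), ai = cimag(a);
    double br = creal(b), bi = cimag(b);

    /* Fast path: the textbook formula, always computed first. */
    double mul_r = ar * br - ai * bi;
    double mul_i = ar * bi + ai * br;

    /* x != x is true only for NaN; it plays the role of "fcmp uno x, x".
       The slow path is taken only when both parts are NaN. */
    if (mul_r != mul_r && mul_i != mul_i) {
        /* Stand-in for the libcall: without fast-math the compiler lowers
           this multiply to code that can end up calling __muldc3. */
        return a * b;
    }

    /* Assemble the result from its two parts; a complex double has the
       same layout as an array of two doubles. */
    double parts[2] = { mul_r, mul_i };
    dc result;
    memcpy(&result, parts, sizeof result);
    return result;
}
```

Once the compiler may assume that NaNs never occur, both x != x tests are known to be false, so the if and the slow path behind it are dead code. That is the same folding the backend does on the fcmp instructions.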
-Hal