Clang not generating pow_finite with -ffast-math


I am trying to make clang generate code similar to gcc's for the following function with the -ffast-math option.
#include <cmath>
double foo(double val, double i) {
     double t = log(exp(val));
     return pow(t, i);
}

There are a few issues here.

· Firstly, clang is not optimizing log(exp(val)) away into val.

· Secondly, clang is choosing the pow intrinsic instead of a library call to @__pow_finite:

  %0 = tail call double @llvm.pow.f64(double %call1, double %i)
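
To make the first point concrete, here is a minimal sketch of the function next to the hand-simplified form I would expect the optimizer to reach. The names foo_simplified and rel_diff are mine, for illustration only. Built without -ffast-math, the two agree for moderate inputs but differ once exp(val) overflows, which is presumably why the fold is only legal under fast-math assumptions.

```cpp
#include <cmath>

// Original form from the question.
double foo(double val, double i) {
    double t = std::log(std::exp(val));
    return std::pow(t, i);
}

// Hypothetical hand-simplified form: log(exp(val)) replaced by val.
double foo_simplified(double val, double i) {
    return std::pow(val, i);
}

// Illustrative helper: relative difference between two nonzero results.
double rel_diff(double a, double b) {
    return std::fabs(a - b) / std::fabs(b);
}
```

With strict IEEE semantics, foo(1000.0, 1.0) goes through exp(1000.0), which overflows to infinity, so the result is infinite, while foo_simplified(1000.0, 1.0) simply returns 1000.0. For small inputs such as (2.0, 3.0) the two results agree to within rounding error.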

I am wondering whether there is any performance benefit in using @llvm.pow.f64, because in the end LLVM generates the following x86 code, which tail-calls plain pow:
               .cfi_def_cfa_offset 16
               movsd %xmm1, (%rsp) # 8-byte Spill
               callq __exp_finite
               callq __log_finite
               movsd (%rsp), %xmm1 # 8-byte Reload
               popq %rax
               jmp pow

I believe __pow_finite is faster than pow. Please correct me if I am wrong.



I don't know why we do the pow intrinsic replacement; the semantics of the intrinsic are defined to be identical to the libm function call. It might just be something that has been that way for a long time, and now we should do something else.

This also brings up another issue: on the LLVM side of things, should we default the library-function intrinsic expansions to the _finite versions in fast-math mode?