[PATCH 1/1] math: Don't use llvm intrinsic for pow

The intrinsic does not work for fp64.
amdgpu targets expand the fp32 intrinsic into exp2(y * log2(x)) anyway.

Fixes a crash in pow(double, double).
The fp32 version still hits the same precision failures in CTS as the
intrinsic implementation.

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

The intrinsic does not work for fp64.
amdgpu targets expand the fp32 intrinsic into exp2(y * log2(x)) anyway.

v2: drop leftover development code

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

The intrinsic does not work for fp64.
amdgpu targets expand the fp32 intrinsic into exp2(y * log2(x)) anyway.

v2: drop leftover development code
v3: enable cl_khr_fp64 in gentype.inc

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>

Looks ok to me. I verified that the same list of inputs fails, although
we now tend to generate positive NaN instead of negative NaN for
these invalid results... But given that they're invalid anyway, I
can't say that this is a bad thing.
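
To illustrate why the sign of the NaN is not significant, here is a small C
sketch of a comparison that treats any two NaNs as equivalent. The helper
results_match is hypothetical and is not claimed to be how the CTS compares
results; it only shows that isnan() is true for a NaN regardless of its sign
bit.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical comparison helper: two results match if both are NaN
 * (sign bit ignored) or if they compare equal as ordinary values.
 */
static bool results_match(float expected, float actual)
{
    if (isnan(expected) || isnan(actual))
        return isnan(expected) && isnan(actual);
    return expected == actual;
}
```

With such a comparison, copysignf(NAN, 1.0f) and copysignf(NAN, -1.0f) are
treated as the same outcome, while a NaN never matches a finite value.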

I verified that the double version generates code that doesn't crash
LLVM, and there's roughly the same number of test failures in the
double variant due to invalid results (probably the same corner cases
being tested for double-precision inputs).

--Aaron