`pow(x, -1.0)` => `1/x` strength-reduction is enabled by default, but doesn't round the same

It appears that SimplifyLibCalls unconditionally replaces pow(x, -1.0) with 1/x:

  if (match(Expo, m_SpecificFP(-1.0)))
    return B.CreateFDiv(ConstantFP::get(Ty, 1.0), Base, "reciprocal");

But libm’s pow does not necessarily round the same as 1/x. Here’s one example out of many:

>>> import math
>>> x = 262145.0
>>> math.pow(x, -1.0)
3.8146827137652824e-06
>>> 1/x
3.8146827137652828e-06

The langref’s semantics for pow are

Return the same value as a corresponding libm ‘pow’ function but without trapping or setting errno.

so it seems to me that introducing even a 1ulp difference is not legal if the afn flag is not set. Is this correct or am I missing something?

This probably should have a flag. I’m assuming the fdiv is more correct (in which case you could arguably do this with contract?). The area where the specification breaks down the most is defining what the “libm pow” function really means. We inconsistently care about exact host platform differences. The llvm-libc implementation is supposed to be correctly rounded. What answer does that give for your sample?

One thing llvm-libc might want to consider is explicitly giving pow a branch for integer exponents (that turns it into powi). Julia does this and it’s a pretty nice performance improvement (there are a lot of cases where x^p is actually just x^2 even if the compiler can’t prove it).

I’m assuming the fdiv is more correct (in which case you could arguably do this with contract?).

I also assume fdiv is more correct. However, at least in the ML space, silently increasing the precision of operations is problematic. I guess in this case it’s not nearly as bad because (a) this is presumably a 1ULP difference, way less than e.g. silently increasing f16 → f32, and (b) different libm’s (may) have different behavior so it’s not like there was a “right” answer in the first place.

Still, seems sus that you could have a situation where y=-1.0 but pow(x, -1) != pow(x, y).

The llvm-libc implementation is supposed to be correctly rounded. What answer does that give for your sample?

I tried to build llvm-libc on my mac and ran into a bunch of problems that codex couldn’t immediately work around. Is it supported? I don’t have a Linux machine handy.

If someone wants to test it here’s a reproducer that codex wrote for me.


#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static float float_from_bits(uint32_t bits) {
  float f;
  memcpy(&f, &bits, sizeof(f));
  return f;
}

static uint32_t bits_from_float(float f) {
  uint32_t bits;
  memcpy(&bits, &f, sizeof(bits));
  return bits;
}

__attribute__((noinline)) static float libm_powf(float x, float y) {
  // Call through a volatile function pointer to prevent builtin folding.
  static float (*volatile powf_ptr)(float, float) = powf;
  return powf_ptr(x, y);
}

int main(void) {
  uint64_t total = 0;
  uint64_t mismatches = 0;
  uint64_t nan_mismatches = 0;
  uint64_t reported = 0;
  const uint64_t max_report = 10;
  const uint32_t progress_mask = (1u << 28) - 1; // Report every 2^28 inputs.

  for (uint32_t bits = 0;; ++bits) {
    float x = float_from_bits(bits);
    float p = libm_powf(x, -1.0f);
    float r = 1.0f / x;
    uint32_t pbits = bits_from_float(p);
    uint32_t rbits = bits_from_float(r);

    if (pbits != rbits) {
      ++mismatches;
      if (isnan(p) && isnan(r))
        ++nan_mismatches;
      if (reported < max_report) {
        printf("mismatch[%llu] bits=0x%08x x=%a powf=%a 1/x=%a\n",
               (unsigned long long)reported, bits, x, p, r);
        ++reported;
      }
    }

    ++total;
    if ((bits & progress_mask) == 0 && bits != 0) {
      printf("progress: bits=0x%08x total=%" PRIu64
             " mismatches=%" PRIu64 "\n",
             bits, total, mismatches);
      fflush(stdout);
    }
    if (bits == UINT32_MAX)
      break;
  }

  printf("Total floats: %" PRIu64 "\n", total);
  printf("Mismatches:  %" PRIu64 "\n", mismatches);
  printf("NaN mismatches: %" PRIu64 "\n", nan_mismatches);
  return 0;
}