Optimize away sqrt in simple cases?

hi!

I’m using LLVM 3.2. Does it optimize away llvm.pow(llvm.sqrt(x), 2) to x with any settings? I tried with llc -O3, but that didn’t do it.

Would be nice to write |v|² in my language (‘v’ being a 2D vector, say, with |…| and …² as two separate “operators”) – then I could compare squared lengths as well as lengths, knowing that the sqrt is optimized away.
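
For concreteness, the IR my frontend would emit looks roughly like this (a sketch; @lensq and the pre-computed %x are made-up names):

declare double @llvm.sqrt.f64(double)
declare double @llvm.pow.f64(double, double)

; |v|² where %x already holds v.x*v.x + v.y*v.y
define double @lensq(double %x) {
  %len = call double @llvm.sqrt.f64(double %x)
  %sq = call double @llvm.pow.f64(double %len, double 2.0)
  ret double %sq
}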

Erkki

That's a pretty seriously unsafe floating point optimization. It could be done in fast-math mode, but I doubt we currently do it.

--Owen

Hi Erkki,

> hi!
>
> I'm using LLVM 3.2. Does it optimize away llvm.pow(llvm.sqrt(x), 2) to x with
> any settings? I tried with llc -O3, but that didn't do it.

the most powerful optimizations are done by the "opt" tool, not llc. That said,
I don't think any pass does this. It would at least require "fast math".
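
For example, assuming the IR is saved in a file lensq.ll (name made up), something like:

opt -O3 -S lensq.ll -o lensq.opt.ll

runs the standard optimization pipeline; -S keeps the output as readable IR.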

Ciao, Duncan.

Hello,

> That's a pretty seriously unsafe floating point optimization. It could be
> done in fast-math mode, but I doubt we currently do it.

I just saw this thread and wonder why it's "seriously" unsafe? I see only one
problematic corner case.

For x >= 0.0 the result cannot lose precision.
For x = NaN the result is the same. sqrt(NaN) = NaN; pow(NaN,2) = NaN
For x = +inf the result is the same. sqrt(inf) = inf; pow(inf,2) = inf
For negative numbers llvm.sqrt(x) by definition has undefined behavior and so
returning x from llvm.pow(llvm.sqrt(x),2) is acceptable.
The same should be true for -inf.

Only the result for -0.0 changes from 0.0 to -0.0.

Is there something I overlooked?
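
A minimal C sketch to check these cases (assuming a C99 libm and IEEE-754 doubles):

#include <math.h>
#include <stdio.h>

int main(void) {
  /* per IEEE-754: sqrt(-0.0) == -0.0, but pow(-0.0, 2.0) == +0.0 */
  double cases[] = { 4.0, NAN, INFINITY, -0.0 };
  for (int i = 0; i < 4; i++) {
    double x = cases[i];
    printf("x = %g, pow(sqrt(x), 2) = %g\n", x, pow(sqrt(x), 2.0));
  }
  return 0;
}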

Christoph

This is not true. The mathematically correct result for sqrt might not be a representable value in floating point, so rounding may occur between the two steps. In that case, pow2(sqrt(x)) != x.
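
A concrete illustration (assuming IEEE-754 doubles): sqrt(2) is irrational, so sqrt must round, and squaring the rounded result does not give back exactly 2:

#include <math.h>
#include <stdio.h>

int main(void) {
  double s = sqrt(2.0);      /* rounded: sqrt(2) is not representable */
  printf("%.17g\n", s * s);  /* prints 2.0000000000000004, not 2 */
  return 0;
}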

--Owen

I think what Christoph is saying is that x will always be at least as
accurate as pow2(sqrt(x)), so it's only unsafe in so far as one's code
is actually depending on an imprecise result.

It's not an "as-if" optimization though, so it definitely would
require "fast math" or something similar; however, it seems like it
would be on the more benign side among optimizations that apply
mathematical identities that are not necessarily identities on floats,
as far as I can tell...

-Stephen

Giving more-than-expected precision can be just as bad for the user as less. It tends to come up in situations where the optimization would break some symmetry, the same way that aggressively forming FMAs can break user code. Consider this example:

/* pow2(x) is shorthand for x*x */
float foo(float a, float b) {
  return pow2(a) - pow2(sqrt(b));
}

float bar(float c) {
  return foo(sqrt(c), c);
}

The author *should* be able to assume that for any positive c, the only possible output values are Inf and zero. However, if we apply the pow2/sqrt peephole, suddenly non-zero finite outputs are possible. The equivalent example for FMA formation is x*x - x*x. If we convert that to "fma x, x, (-x*x)", you can get a non-zero finite result.
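
A self-contained version of the FMA case (C99 fma; the input is chosen so the rounding error of x*x is exactly 0.25):

#include <math.h>
#include <stdio.h>

int main(void) {
  double x = 1e8 + 0.5;
  double r = x * x;              /* x*x, rounded to double */
  /* fma computes x*x - r with a single rounding, recovering the
     rounding error of the product: "x*x - x*x" comes out non-zero */
  printf("%g\n", fma(x, x, -r)); /* prints 0.25 with a correct fma */
  return 0;
}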

It boils down to the fact that giving excess precision in some-places-but-not-others can lead to bad behavior.

--Owen

OK, that makes sense--just clarifying what I thought Christoph meant.
In any case, maybe it ought to at least be an option under "fast math"
or some other setting where mathematical identities that may gain
or lose precision are taken advantage of. (I know GCC's version of
"fast math" allows lots of things that are potentially more unsafe
than this; I'm not familiar with the state of clang's equivalent.)

Stephen

That’s a common misconception. Floating point computations are deterministic and a function only of their inputs (and a couple of modal switches like rounding mode).

--Owen

Not sure exactly what the C/C++ standard says here, but usually it's allowed to give more precision.

It's also allowed for the math to work like real math rather than limited-precision math.

If your code requires otherwise, then therein lies the problem.

Can't arcsin(sqrt(2)/2) == pi/4?

I think so.

When you are debugging a calculator that's what you want.
People spend a lot of time trying to get the math library to give the expected results when known equations combining various operations are used. That is part of the sanity checking that goes into a good math library implementation.

Scientists using the math library want the math to be as close to real math as possible.

That was part of the point of the IEEE floating point standard.

sin(n * pi) == 0, etc.

Back in the day I knew W. Kahan (IEEE FP inventor) and I know that this was his belief.

> When you are debugging a calculator that's what you want.

I thought that most calculators used fixed-point or arbitrary precision
math...

> Scientists using the math library want the math to be as close to real math
> as possible.

Scientists that don't understand FP well write bad scientific code.
Computers can't do math, sorry.

Relying on a supposedly identical sequence of FP operations to generate the
same output is one thing, relying on computers doing abstract math is a bit
of a jump...

I know of a few examples where well written crude models easily outperform
(in speed and quality) in-depth badly written models.

cheers,
--renato

> [...]
> Giving more-than-expected precision can be just as bad for the user as less.
> It tends to come up in situations where the optimization would break some
> symmetry, the same way that aggressively forming FMAs can break user code.
> [...]
>
> It boils down to the fact that giving excess precision in
> some-places-but-not-others can lead to bad behavior.

Ok, I didn't think about excess precision resulting in problems.
Now it's clear to me that fast-math is necessary for this optimization.

BTW: Is there a way to get only the fast-math optimizations that don't change
behavior on NaNs and Infs? I think it would be interesting to allow
arithmetic transformations, but only those that preserve NaN and Inf
behavior, so that errors/corner cases are still caught.

At least the pow(sqrt(x),2)=>x optimization falls into this category as I
wrote before.
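
The recently added per-instruction fast-math flags look like they could express this distinction (a sketch; flag names per LangRef are nnan, ninf, nsz, arcp and fast):

define double @f(double %a, double %b) {
  ; nsz/arcp permit sign-of-zero and reciprocal rewrites, but without
  ; nnan/ninf the optimizer must still preserve NaN and Inf behavior
  %p = fmul nsz arcp double %a, %b
  ret double %p
}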

Christoph

I’d like to second that it would be nice to have an option for fast-math optimizations that don’t change NaN and Inf behavior. When I was working on rendering systems, optimizations that subtly changed precision were fine, but we relied on Inf behavior.

-Warren