constrained cosine rounding mode behavior

Hi:

I am trying to implement interval arithmetic using LLVM, and I have a problem with how the rounding mode is handled by llvm.experimental.constrained.cos.

I have two pieces of code:

; Function Attrs: norecurse nounwind readnone ssp uwtable
define double @cosine_down(double returned) local_unnamed_addr #0 {
; call the llvm intrinsic to perform downward cosine
%2 = call double @llvm.experimental.constrained.cos(double %0, metadata !"round.downward", metadata !"fpexcept.strict")
ret double %2
}

; Function Attrs: norecurse nounwind readnone ssp uwtable
define double @cosine_up(double returned) local_unnamed_addr #0 {
; call the llvm intrinsic to perform upward cosine
%2 = call double @llvm.experimental.constrained.cos(double %0, metadata !"round.upward", metadata !"fpexcept.strict")
ret double %2
}

When I call both functions on the test input 0.79358805865013693, they return the same value: 0.7012920012119437.

Ideally, the two functions would give me lower and upper bounds that enclose the true value, for example 0.7012920012119436 and 0.7012920012119438, but even with the downward and upward rounding modes that is not the case.

I noticed that the page https://llvm.org/docs/LangRef.html#llvm-experimental-constrained-cos-intrinsic says: "This function returns the cosine of the specified operand, returning the same values as the libm cos functions would"

So does this mean that constrained cos does not care about the rounding mode and will simply return the same value as libm's cos?

Thank you
Xuan Tang

Hi Xuan Tang,

The rounding mode argument to the intrinsic is supposed to be a hint to the
compiler to tell us what the rounding mode has been changed to. In theory
we could use this to constant fold the intrinsic with correct rounding but
I don't think much of that is implemented. The compiler does not add any
code to change the rounding mode in the hardware floating point unit in
response to the intrinsic. It's up to the user to call something like
fesetround in libm to change the rounding mode in hardware. Changing the
rounding mode in hardware could have an effect on the result returned by
the cos library function that we will ultimately call, but I don't know for
sure.

~Craig

Hi Craig:

Thanks for the reply. I now set the rounding mode before the corresponding call, but strange things happen.

With the same input, the round-down code returns a larger value from cos than the round-up code.

Initially I thought it might be because the input somehow got rounded down: since the input is between 0 and pi, where cosine is decreasing, rounding the input down would increase the result, hence the error. Then I tested a value larger than pi but smaller than 2pi, where the derivative is positive, but the error still happens. I think there might be something wrong with the cosine intrinsic?

I've attached the code below. Before each call, the rounding mode is reset to the default (round to nearest).

; Function Attrs: ssp uwtable
define double @cosine_down(double returned) local_unnamed_addr #0 {
%2 = call i32 @fesetround(i32 1024)
%3 = call double @llvm.experimental.constrained.cos(double %0, metadata !"round.downward", metadata !"fpexcept.strict")
ret double %3
}

; Function Attrs: ssp uwtable
define double @cosine_up(double returned) local_unnamed_addr #0 {
%2 = call i32 @fesetround(i32 2048)
%3 = call double @llvm.experimental.constrained.cos(double %0, metadata !"round.upward", metadata !"fpexcept.strict")
ret double %3
}

Test result:

Input interval from 0.7935880586501369 to 0.7935880586501369, cosine result from 0.7012920012119436 to 0.7012920012119435
is empty: true

Input interval from 3.17588058650137 to 3.17588058650137, cosine result from -0.9994122264169831 to -0.9994122265888515
is empty: true

Thank you
Xuan Tang

Hi Xuan Tang

I'm not sure how llvm.cos is implemented on your platform, but in most
cases the intrinsic is lowered directly to a math library call, which
computes the result with a multi-step algorithm rather than a single
instruction (for example, see the cos implementation in musl:
https://github.com/bminor/musl/blob/master/src/math/cos.c).

If I understand correctly, the hardware rounding mode specifies how
the immediate result of each individual instruction (fadd, fma, etc.)
is rounded. It can't guarantee that the result of a complicated
calculation is rounded the way you expect.

Regards,
Chaofan

On Thu, Sep 10, 2020 at 3:36 AM, Xuan Tang via llvm-dev <llvm-dev@lists.llvm.org> wrote:

The LangRef does specify that the rounding mode/exception behavior is advisory, not mandatory, although it could probably be spelled out better.

That said, there are some architectures where the rounding mode can be configured on a per-instruction basis (e.g., AVX-512 has embedded rounding control, which goes together with SAE, and I believe the various GPU architectures follow suit). It may be worth having some support for this in LLVM.