I am trying to implement interval arithmetic using LLVM, and I have a problem with the rounding mode of llvm.experimental.constrained.cos.

I have two pieces of code:

; Function Attrs: norecurse nounwind ssp uwtable
define double @cosine_down(double %0) local_unnamed_addr #0 {
  ; call the llvm intrinsic to perform downward cosine
  %2 = call double @llvm.experimental.constrained.cos.f64(double %0, metadata !"round.downward", metadata !"fpexcept.strict")
  ret double %2
}

; Function Attrs: norecurse nounwind ssp uwtable
define double @cosine_up(double %0) local_unnamed_addr #0 {
  ; call the llvm intrinsic to perform upward cosine
  %2 = call double @llvm.experimental.constrained.cos.f64(double %0, metadata !"round.upward", metadata !"fpexcept.strict")
  ret double %2
}

declare double @llvm.experimental.constrained.cos.f64(double, metadata, metadata)

When calling the functions on a test number, 0.79358805865013693, both return the same value: 0.7012920012119437.

Ideally, the two functions would give me lower and upper bounds bracketing the true value, e.g. 0.7012920012119436 and 0.7012920012119438, but it seems that even with the upward and downward rounding modes that's not the case.

The rounding mode argument to the intrinsic is supposed to be a hint to the
compiler to tell us what the rounding mode has been changed to. In theory
we could use this to constant fold the intrinsic with correct rounding but
I don't think much of that is implemented. The compiler does not add any
code to change the rounding mode in the hardware floating point unit in
response to the intrinsic. It's up to the user to call something like
fesetround in libm to change the rounding mode in hardware. Changing the
rounding mode in hardware could have an effect on the result returned by
the cos library function that we will ultimately call, but I don't know for
sure.

Thanks for the reply. I now set the rounding mode before the corresponding call, but strange things happen.

With the same input, the rounding-down code returns a larger value than the rounding-up code for cos.

Initially I thought it might be because the input somehow got rounded down: since the input is between 0 and pi, where cosine has a downward slope, rounding the input down would increase the result, hence the error. Then I tested some values larger than pi but smaller than 2*pi, where the derivative is positive, but the error still happens. I think there might be something wrong with the cosine intrinsic?

I've attached the code at the end, and before each call the rounding mode is set back to the original (round to nearest).

I'm not sure how llvm.cos is implemented on your platform, but in most
cases the intrinsic is lowered directly to a math library call, where
the result is computed by a multi-step software algorithm, not a single
instruction. (For example, see the cos implementation in musl: https://github.com/bminor/musl/blob/master/src/math/cos.c)

If I understand correctly, the hardware rounding mode only specifies
how the intermediate result of each individual instruction (fadd, fma,
etc.) is rounded. It can't guarantee that the result of a complicated
calculation rounds the way you expect.

The LangRef does specify that the rounding mode/exception behavior is advisory, not mandatory, although it could probably be spelled out better.

That said, there are some architectures where the rounding mode can be configured on a per-instruction basis (e.g., AVX-512 has embedded rounding control in the EVEX encoding, alongside the SAE bit, that serves this purpose, and I believe various GPU architectures follow suit). It may be worth having some support for this in LLVM.