LLVM recently adopted a guarantee that if all input NaNs to a float operation are quiet, then any output NaN that is generated will also be quiet. I wonder if it would be possible to strengthen this further? Concretely, I am looking for the following guarantee (inspired by wasm):
- Define a “canonical NaN” to be a NaN with arbitrary sign but a payload whose most significant bit is 1 and whose other bits are all 0. (This assumes that a “1” in the most significant bit indicates a quiet NaN; it needs to be tweaked to support targets where that is not the case. Also, I am aware that LLVM already has `canonicalize` on floats, which refers to a different notion of “canonical”; I am using wasm terminology here.)
- Now we guarantee that if all input NaNs to an operation are canonical, then any output NaN produced is also canonical.
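
To make the definition concrete, here is a minimal Rust sketch of a predicate for it. It assumes IEEE 754 binary32/binary64 layouts where a set most significant payload bit marks a quiet NaN (so it does not cover the targets mentioned above where that is not the case). The function and constant names are made up for illustration and are not part of any existing API.

```rust
// Assumption: the most significant mantissa bit is the "quiet" bit.
// Positive canonical NaN: exponent all ones, quiet bit set, rest of payload zero.
const CANONICAL_NAN_F64: u64 = 0x7ff8_0000_0000_0000;
const SIGN_MASK_F64: u64 = 0x8000_0000_0000_0000;

const CANONICAL_NAN_F32: u32 = 0x7fc0_0000;
const SIGN_MASK_F32: u32 = 0x8000_0000;

/// Returns true if `x` is a canonical NaN in the sense defined above.
/// The sign bit is ignored, since the definition allows an arbitrary sign.
pub fn is_canonical_nan_f64(x: f64) -> bool {
    (x.to_bits() & !SIGN_MASK_F64) == CANONICAL_NAN_F64
}

/// Same check for 32-bit floats.
pub fn is_canonical_nan_f32(x: f32) -> bool {
    (x.to_bits() & !SIGN_MASK_F32) == CANONICAL_NAN_F32
}
```

With such a predicate, the requested guarantee reads: if every NaN input of an operation satisfies `is_canonical_nan_*`, then so does any NaN output.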
A guarantee of this sort is required for code that uses NaN boxing, such as the SpiderMonkey JavaScript engine. If LLVM were to violate this guarantee, there is a chance that compiling SpiderMonkey with clang could lead to wrong results. Luckily, so far it seems like LLVM actually provides this guarantee: its apfloat softfloat library will pick a canonical NaN when producing new NaNs, and will forward one of the input NaN payloads when propagating NaNs. (apfloat’s `FMA(0, inf, NaN)` might return a canonical NaN payload instead of propagating the input NaN payload, but while this doesn’t match hardware, it is still sufficient for this guarantee.) Most hardware currently in operation also matches this spec (at least x86, ARM, RISC-V), and the fact that wasm makes this requirement indicates that it is both useful and satisfied by a wide range of implementations.
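
To illustrate why NaN boxing depends on this, here is a deliberately simplified Rust sketch. It is not SpiderMonkey's actual encoding: the `Boxed` type, the tag bit, and all method names are hypothetical, and it assumes the IEEE 754 binary64 layout with the most significant payload bit as the quiet bit. Integers are hidden in NaN payloads, and every other bit pattern is treated as an ordinary double.

```rust
/// Hypothetical NaN-boxed value: either a real f64, or an i32 stored in a NaN payload.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Boxed(u64);

// Exponent all ones plus the quiet bit (positive canonical NaN pattern).
const QNAN: u64 = 0x7ff8_0000_0000_0000;
// Hypothetical tag: one extra payload bit (bit 48) marks "this is a boxed i32".
const INT_TAG: u64 = 0x0001_0000_0000_0000;

impl Boxed {
    /// Stores a double as-is. This is only sound if arithmetic never hands us
    /// a NaN whose payload happens to contain the tag bit.
    pub fn from_f64(x: f64) -> Boxed {
        Boxed(x.to_bits())
    }

    /// Stores an i32 in the low 32 bits of a tagged quiet NaN.
    pub fn from_i32(i: i32) -> Boxed {
        Boxed(QNAN | INT_TAG | (i as u32 as u64))
    }

    /// True if the bit pattern looks like a boxed integer.
    pub fn is_int(self) -> bool {
        self.0 & (QNAN | INT_TAG) == (QNAN | INT_TAG)
    }

    pub fn as_i32(self) -> Option<i32> {
        if self.is_int() { Some(self.0 as u32 as i32) } else { None }
    }

    pub fn as_f64(self) -> Option<f64> {
        if self.is_int() { None } else { Some(f64::from_bits(self.0)) }
    }
}
```

In this sketch a canonical NaN is never mistaken for an integer, because its payload has no bits set besides the quiet bit. A float operation that returned a NaN with bit 48 set, even though all its NaN inputs were canonical, would be misread as a boxed integer by `from_f64` followed by `is_int`, which is the kind of wrong result described above.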
So I wonder, is there any case where LLVM violates this guarantee? Would it be possible to have LLVM commit to providing this guarantee? In Rust we’d like to provide this guarantee to our users, but we are currently blocked on LLVM not documenting such a guarantee.
(wasm also guarantees that floating point operations never produce a signaling NaN, even if an input is a signaling NaN. That is not guaranteed by LLVM. I am not aware of problems caused by this behavior and I am not asking to change this.)