Floating point semantic modes

Hi all,

I’m trying to put together a set of rules for how the various floating point semantic modes should be handled in clang. A lot of this information will be relevant to other front ends, but the details are necessarily bound to a front end implementation so I’m framing the discussion here in terms of clang. Other front ends can choose to follow clang or not. The existence of this set of semantics is an LLVM property that applies to all front ends, but the front ends will have to do something to initialize them.

I will eventually do something to convert this into an RST document and find a home for it in the clang documentation, but I’d like to start by getting input on whether everyone agrees with my judgment on how these things should work and whether I’ve missed anything.

Here’s what I’ve got.

As far as this part goes, ping on https://reviews.llvm.org/D69978.
The input mode and output modes are really different settings.

-Matt

I don’t know if LLVM supports it, but RISC-V as well as a few other architectures also support to-nearest-ties-to-max-magnitude.

Jacob Lifshay

> I don’t know if LLVM supports it, but RISC-V as well as a few other architectures also support to-nearest-ties-to-max-magnitude.

Yeah, that’s come up a few times in reviews. We should start factoring it into our plans, but from a clang point of view, I’m not sure how you’d get into that mode.

We don’t have it in the list of possible rounding modes for the constrained intrinsics either. We should fix that.

-Andy

Hi,

> I’m trying to put together a set of rules for how the various floating point semantic modes should be handled in clang.

Great to see this. Thanks!

Regarding this aspect:

=========================
Code-visible identifiers
=========================

__FAST_MATH__

This symbol will be defined if and only if all of the following are set (before pragmas are applied):

  except_behavior { ignore }
  fenv_access { off }
  rounding_mode { tonearest }
  contract { fast }
  denormal_fp_math { PreserveSign }
  denormal_fp32_math { PreserveSign }
  support_math_errno { off }
  no_honor_nans { on }
  no_honor_infinities { on }
  no_signed_zeros { on }
  allow_reciprocal { on }
  allow_approximate_fns { on }
  allow_reassociation { on }

This makes logical sense to me. That said, it’s different from what GCC does. GCC defines __FAST_MATH__ irrespective of the settings of:

  contract (-ffp-contract={on|off|fast})
  allow_reassociation (-f[no-]associative-math)
  allow_reciprocal (-f[no-]reciprocal-math)
  rounding_mode (-f[no-]rounding-math)

(See https://reviews.llvm.org/D72675#1829810.)

I had previously thought we should be compatible with GCC (unless we find that GCC’s behavior is viewed by them as a bug). But after reading this, the simple rule of defining __FAST_MATH__ when all of the collective flags that -ffast-math sets are set appropriately makes more sense to me. In any case, I thought we should overtly note the difference here, in case anyone objects.

Thanks,

Nice database, definitely worth placing in the documentation.

FLT_ROUNDS

Should be set to -1 (indeterminable) if rounding_mode is dynamic or 1 (tonearest) if rounding_mode is tonearest. There are values for other rounding modes, but clang offers no way to set those rounding modes.

I would remove this symbol from this database. According to the C11 standard (5.2.4.2.2p8):

Evaluation of FLT_ROUNDS correctly reflects any execution-time change of rounding mode through the function fesetround in <fenv.h>.

So on a target that is capable of reading the rounding mode, FLT_ROUNDS must report the actual rounding mode and never -1. It is effectively a wrapper over fegetround.
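
A minimal sketch of the "wrapper over fegetround" idea, assuming the FLT_ROUNDS encoding from C11 5.2.4.2.2 (0 toward zero, 1 to nearest, 2 toward positive infinity, 3 toward negative infinity, -1 indeterminable). The helper name flt_rounds_from_fe_mode is hypothetical and is not how clang or any libc actually implements the macro.

```c
#include <fenv.h>

/* Hypothetical helper: translate a value returned by fegetround() into the
   FLT_ROUNDS encoding. Each FE_* macro is only defined when the target
   supports that rounding direction, hence the guards. */
int flt_rounds_from_fe_mode(int fe_mode) {
#ifdef FE_TOWARDZERO
    if (fe_mode == FE_TOWARDZERO) return 0;
#endif
#ifdef FE_TONEAREST
    if (fe_mode == FE_TONEAREST) return 1;
#endif
#ifdef FE_UPWARD
    if (fe_mode == FE_UPWARD) return 2;
#endif
#ifdef FE_DOWNWARD
    if (fe_mode == FE_DOWNWARD) return 3;
#endif
    return -1; /* unknown mode: indeterminable */
}
```

A runtime-correct FLT_ROUNDS would then behave like flt_rounds_from_fe_mode(fegetround()), evaluated each time the macro is used.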

* Kaylor, Andrew via llvm-dev <llvm-dev@lists.llvm.org> [2020-01-27 23:24:10 +0000]:

i'm not an llvm/clang dev, i hope this mail won't bounce.

======================
FP semantic modes

except_behavior { ignore, strict, may_trap }
fenv_access { on, off }
rounding_mode { dynamic, tonearest, downward, upward, towardzero }
contract { on, off, fast }
denormal_fp_math { IEEE, PreserveSign, PositiveZero }
denormal_fp32_math { IEEE, PreserveSign, PositiveZero }
support_math_errno { on, off }

note that math errno handling can be

1) errno is set,
2) errno may be set and
3) errno is guaranteed to be untouched

iso c math_errhandling can select between 1 and 2,
(user can or cannot rely on errno) but for optimizing
math calls as side-effect-free pure functions, 3 is
needed.

-f(no-)math-errno selects between 1 and 3.
with 3, moving math calls across errno checks or
calls that set errno can break semantics depending
on how libm is implemented (e.g. glibc will set
errno independently of how you compiled your code).
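
The three errno cases above can be sketched as follows. The enum and function names are hypothetical, chosen only to mirror the numbered list; math_errhandling and MATH_ERRNO are the standard <math.h> macros.

```c
#include <math.h>

/* The three cases from the list above (hypothetical names). */
enum errno_policy {
    ERRNO_IS_SET = 1,     /* errno is set on math errors */
    ERRNO_MAY_BE_SET = 2, /* errno may or may not be set */
    ERRNO_UNTOUCHED = 3   /* errno is guaranteed to be untouched */
};

/* A program may test errno after a math call only in case 1. */
int program_may_rely_on_errno(enum errno_policy p) {
    return p == ERRNO_IS_SET;
}

/* Math calls can be treated as side-effect-free only in case 3. */
int math_calls_are_pure(enum errno_policy p) {
    return p == ERRNO_UNTOUCHED;
}

/* What ISO C exposes: math_errhandling separates case 1 from case 2,
   but has no value that expresses case 3. */
enum errno_policy policy_visible_to_program(void) {
    return (math_errhandling & MATH_ERRNO) ? ERRNO_IS_SET : ERRNO_MAY_BE_SET;
}
```

The gap between what the program can query (1 or 2) and what the optimizer wants (3) is the mismatch being pointed out here.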

no_honor_nans { on, off }

ideally there would be a way to support snan too.
(e.g. isnan(x) cannot be turned into x!=x then)

no_honor_infinities { on, off }
no_signed_zeros { on, off }
allow_reciprocal { on, off }
allow_approximate_fns { on, off }
allow_reassociation { on, off }

excess precision handling is missing from this list
which matters for x87 and m68k fpu support and may
matter for _Float16 implementations that fall back
to _Float32 arithmetic.

the granularity of these knobs is also interesting
(expression, code block, function or translation unit),
iso c pragmas work on code block level.

======================
FP models

-----------------------
precise (default)
-----------------------
except_behavior { ignore }
fenv_access { off }
rounding_mode { tonearest }
contract { on }
denormal_fp_math { IEEE }
denormal_fp32_math { IEEE }
support_math_errno { on }
no_honor_nans { off }
no_honor_infinities { off }
no_signed_zeros { off }
allow_reciprocal { off }
allow_approximate_fns { off }
allow_reassociation { off }

------------------
strict
------------------
except_behavior { strict }
fenv_access { on }
rounding_mode { dynamic }
contract { off }
denormal_fp_math { IEEE }
denormal_fp32_math { IEEE }
support_math_errno { on }
no_honor_nans { off }
no_honor_infinities { off }
no_signed_zeros { off }
allow_reciprocal { off }
allow_approximate_fns { off }
allow_reassociation { off }

------------------
fast
------------------
except_behavior { ignore }
fenv_access { off }
rounding_mode { tonearest }
contract { fast }
denormal_fp_math { PreserveSign }
denormal_fp32_math { PreserveSign }
support_math_errno { off }
no_honor_nans { on }
no_honor_infinities { on }
no_signed_zeros { on }
allow_reciprocal { on }
allow_approximate_fns { on }
allow_reassociation { on }
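
As an illustration of why allow_reassociation is off in the precise and strict models, here is a small sketch (function names are hypothetical) showing that the two groupings of the same sum can differ. It assumes IEEE-754 binary64 double and a build without fast-math flags.

```c
/* Sum in source order: (a + b) + c. */
double sum_left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

/* The same operands after reassociation: a + (b + c). */
double sum_reassociated(double a, double b, double c) {
    return a + (b + c);
}
```

With a = 1e20, b = -1e20, c = 1.0, the first form yields 1.0 and the second yields 0.0, because 1.0 is below half an ulp of 1e20 and is absorbed when added to -1e20 first.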

======================
Command-line options

-ffp-model={precise|strict|fast}
  Sets all semantic modes as described above.

-ffast-math
  Equivalent to -ffp-model=fast. (I'm not sure that's currently true.)

-f[no-]math-errno
-ffp-contract={on|off|fast}
-f[no-]honor-infinities
-f[no-]honor-nans
-f[no-]associative-math
-f[no-]reciprocal-math
-f[no-]signed-zeros
-f[no-]trapping-math
-f[no-]rounding-math
-fdenormal-fp-math={ieee, preservesign, positivezero}
-fdenormal-fp-math-fp32={ieee, preservesign, positivezero}
-ffp-exception-behavior={ignore,maytrap,strict}
  Each of these has a 1-to-1 correspondence to an FP semantic mode.
  (I think several of these should set "except_behavior" to "ignore".)

-ftrapping-math vs -ffp-exception-behavior=maytrap
is unclear.

(-ftrapping-math is weird in gcc, it does not handle
all fp exception cases, not sure what clang plans to
do with that)

-f[no-]finite-math-only
  Controls no_honor_nans and no_honor_infinities.

-f[no-]unsafe-math-optimizations
  Turns no_signed_zeros, allow_reciprocal, allow_approximate_fns, and allow_reassociation on or off.
  Also, sets except_behavior to "on" for -funsafe-math-optimizations.
  (Currently, -f[no-]unsafe-math-optimizations clears except_behavior, but I regard this as a bug.)

All command line options will override any previous values of all settings they control with options taking effect in a left-to-right manner.

======================
pragmas

STDC FENV_ACCESS {ON|OFF}
  Patch in progress. I think ON should force the following:

    except_behavior { strict }
    fenv_access { on }
    rounding_mode { dynamic }
    denormal_fp_math { IEEE }
    denormal_fp32_math { IEEE }
    no_signed_zeros { off }
    allow_reciprocal { off }
    allow_approximate_fns { off }
    allow_reassociation { off }

  And OFF should set fenv_access to off, except_behavior to ignore, and rounding_mode to tonearest. Other modes should be reset to their command line defined settings.

  I don't think this pragma should have any effect on contract, support_math_errno, no_honor_nans, or no_honor_infinities.

STDC FP_CONTRACT {ON|OFF|DEFAULT}
  This pragma controls the contract FP semantic mode. No other FP semantic modes are affected.

float_control ({precise|except}, {on|off}[, push])
float_control (pop)
  Patch in progress. These are tricky.
  I think they should have the following effects:

float_control (precise, on[, push])
  contract { on }
  denormal_fp_math { IEEE }
  denormal_fp32_math { IEEE }
  no_signed_zeros { off }
  allow_reciprocal { off }
  allow_approximate_fns { off }
  allow_reassociation { off }

float_control (precise, off[, push])
  contract { fast }
  denormal_fp_math { PreserveSign }
  denormal_fp32_math { PreserveSign }
  no_signed_zeros { on }
  allow_reciprocal { on }
  allow_approximate_fns { on }
  allow_reassociation { on }

Note, this is less than what the -ffp-model=precise control does. Should this override support_math_errno, no_honor_nans, or no_honor_infinities?

float_control (except, on[, push])
  except_behavior { strict }

float_control (except, off[, push])
  except_behavior { ignore }

The MSVC documentation says you can only use the float_control pragma to turn exception semantics on when precise semantics are enabled. For us, this would mean:
  denormal_fp_math { IEEE }
  denormal_fp32_math { IEEE }
  no_signed_zeros { off }
  allow_reciprocal { off }
  allow_approximate_fns { off }
  allow_reassociation { off }

The MSVC documentation also says you can't use the float_control pragma to turn exception semantics off when precise semantics are enabled, and you can't use the float_control pragma to turn precise off when fenv_access is on.

I believe we should follow the MSVC restrictions.

=========================
Code-visible identifiers

__FAST_MATH__

This symbol will be defined if and only if all of the following are set (before pragmas are applied):
  except_behavior { ignore }
  fenv_access { off }
  rounding_mode { tonearest }
  contract { fast }
  denormal_fp_math { PreserveSign }
  denormal_fp32_math { PreserveSign }
  support_math_errno { off }
  no_honor_nans { on }
  no_honor_infinities { on }
  no_signed_zeros { on }
  allow_reciprocal { on }
  allow_approximate_fns { on }
  allow_reassociation { on }

__FINITE_MATH_ONLY__

This symbol will be defined if and only if all of the following are set (before pragmas are applied):
  no_honor_nans { on }
  no_honor_infinities { on }
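
For reference, a sketch of how user code might consume these two identifiers. The function names are hypothetical. Note one assumption that differs from the wording above: as far as I know, GCC and current clang always define __FINITE_MATH_ONLY__, to 0 or 1, so the check below tests its value rather than whether it is defined.

```c
/* 1 if this translation unit was compiled in full fast-math mode. */
int built_with_fast_math(void) {
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler was told that NaNs and infinities need not be honored. */
int built_with_finite_math_only(void) {
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return 1;
#else
    return 0;
#endif
}
```

Under the rule proposed here, built_with_fast_math() returning 1 implies built_with_finite_math_only() also returns 1, since both no_honor_nans and no_honor_infinities are part of the __FAST_MATH__ condition.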

FLT_ROUNDS

Should be set to -1 (indeterminable) if rounding_mode is dynamic or 1 (tonearest) if rounding_mode is tonearest. There are values for other rounding modes, but clang offers no way to set those rounding modes.

FLT_EVAL_METHOD

Should be set to -1 if any of allow_reciprocal, allow_approximate_fns, or allow_reassociation is set. Should any other flags also make this -1? Otherwise, the setting is target-defined.

math_errhandling

The MATH_ERRNO bit will be set or cleared based on the setting of support_math_errno. Should MATH_ERREXCEPT be set or cleared based on except_behavior?

FLT_ROUNDS, FLT_EVAL_METHOD and math_errhandling
are controlled by the c runtime, so a compiler has no business
changing them, the compiler can define its own __FLT_ROUNDS,
etc macros and the libc may or may not use those, but e.g.
in case of FLT_ROUNDS it makes no sense for the compiler to
try to do anything: the mode changes at runtime, the libc macro
will expand to a function call that determines the current
rounding mode. (same problem arises if you can change the
other modes on a per function or code block granularity.)

and i don't think it's a good idea to change FLT_EVAL_METHOD
with non-precise arithmetic modes, because it is used to decide
if excess range and precision is available, but arithmetic
changes don't affect that. (e.g. float_t is still same as float).
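
A small sketch of the relationship being described, using only the standard FLT_EVAL_METHOD macro from <float.h> and the float_t type from <math.h>; the function names are hypothetical.

```c
#include <float.h>
#include <math.h>

/* Value the implementation advertises for how intermediate results
   are evaluated. */
int advertised_eval_method(void) {
    return FLT_EVAL_METHOD;
}

/* float_t is the type float expressions are evaluated in. If it is wider
   than float, excess range and precision are available. */
int float_has_excess_precision(void) {
    return sizeof(float_t) > sizeof(float);
}
```

When FLT_EVAL_METHOD is 0, the C standard requires float_t to be float, so the second function returns 0. Turning on reassociation or reciprocal math does not change float_t, which is the argument for leaving FLT_EVAL_METHOD alone in those modes.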

About ftrapping-math:
I think we should eliminate -ftrapping-math, a boolean option, because it overlaps with -ffp-exception-behavior, a three-valued option. Alternatively, we can keep it in the clang driver for compatibility, but the driver should rewrite it into -ffp-exception-behavior=ignore or -ffp-exception-behavior=strict. Various fields in llvm and/or clang maintain a Boolean TrappingMath; those should be removed or rewritten.

About ffp-contract:
Currently the clang driver, at its default setting, doesn't specify a value for ffp-contract, so nothing is passed through to llvm in LangOpts or CodegenOpts. Digging around in llvm, I find that the default setting there is "Standard"==on, just like you said. Shall we change clang to always pass through the option values rather than relying on the llvm default? To me it seems more secure to have all settings exactly specified.
    static cl::opt<llvm::FPOpFusion::FPOpFusionMode> FuseFPOps(
    "fp-contract", cl::desc("Enable aggressive formation of fused FP ops"),
    cl::init(FPOpFusion::Standard),
    cl::values(
        clEnumValN(FPOpFusion::Fast, "fast", "Fuse FP ops whenever profitable"),
        clEnumValN(FPOpFusion::Standard, "on", "Only fuse 'blessed' FP ops."),
        clEnumValN(FPOpFusion::Strict, "off",
                   "Only fuse FP ops when the result won't be affected.")));

About math-errno:
There are comments in the clang code that describe the settings for math-errno as toolchain dependent. Here's the initialization:
  // -fmath-errno is the default on some platforms, e.g. BSD-derived OSes.
  bool MathErrno = TC.IsMathErrnoDefault();
Here's another comment:
      // Turning *off* -ffast-math restores the toolchain default.
      MathErrno = TC.IsMathErrnoDefault();

I have a strong preference for the latter. Command-line compatibility
is nice when porting makefiles from GCC.

* Blower, Melanie I <melanie.blower@intel.com> [2020-01-28 19:24:24 +0000]:

> About math-errno:
> There are comments in the clang code that describe the settings for math-errno as toolchain dependent. Here's the initialization:
>   // -fmath-errno is the default on some platforms, e.g. BSD-derived OSes.
>   bool MathErrno = TC.IsMathErrnoDefault();

that's interesting since the bsd libm does not touch errno
(at least on freebsd), so for them this creates significant
overhead (e.g. when sqrt is inlined, the compiler still emits
code to check the input and fall back to the libc sqrt call
just in case it wants to set errno)

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37073

> Here's another comment,
>     // Turning *off* -ffast-math restores the toolchain default.
>     MathErrno = TC.IsMathErrnoDefault();

i guess the default can be tied to the libm behaviour of the
target (but ideally it will be the same everywhere so it's
easier to get portable behaviour)

(i personally would get rid of math errno, it causes problems,
but i think the gcc/glibc position is that c89 only had errno,
no fenv exceptions, so it should be supported for bw compat)

...

I misunderstood the purpose of -ffp-model=fast. I assumed that was a
*trap-safe mode* where we care more about performance than strict
IEEE-754 conformance. I.e., I would like a mode where *hard*
trap-unsafe optimizations (e.g. hoisting) are disabled, but *soft*
trap-unsafe optimizations (e.g. reassociation, optimized math libs)
are performed. The thinking is that Invalid, DivisionByZero, and
Overflow are the most important traps, but I'm willing to give up some
edge cases around Overflow.

I suppose that having orthogonal FMFs is a good start, but that would
require updating Reassociate et al. to handle the constrained
intrinsics. From my off-line discussion with Andy, it doesn't sound
like that was a consideration.

Would anyone else like to see a heavily optimized (risky) trap-safe
mode like this?

* Blower, Melanie I <melanie.blower@intel.com> [2020-01-28 19:24:24 +0000]:
>> About math-errno:
>> There are comments in the clang code that describe the settings for math-errno as toolchain dependent. Here's the initialization:
>> // -fmath-errno is the default on some platforms, e.g. BSD-derived OSes.
>> bool MathErrno = TC.IsMathErrnoDefault();
>
> that's interesting since the bsd libm does not touch errno
> (at least on freebsd), so for them this creates significant
> overhead (e.g. when sqrt is inlined, the compiler still emits
> code to check the input and fall back to the libc sqrt call
> just in case it wants to set errno)
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37073
>
>> Here's another comment,
>> // Turning *off* -ffast-math restores the toolchain default.
>> MathErrno = TC.IsMathErrnoDefault();
>
> i guess the default can be tied to the libm behaviour of the
> target (but ideally it will be the same everywhere so it's
> easier to get portable behaviour)
>
> (i personally would get rid of math errno, it causes problems,

Agreed. Worrying about it isn't pragmatic. And we've made it this far
without proper error handling...

STDC FENV_ACCESS {ON|OFF}

Yes, you’re probably right about this. I was originally thinking of FENV_ACCESS as a fully strict mode of operation, but what you’re suggesting aligns with what Cameron suggested and even some of my own reasoning on other points. So, let me amend my previous proposal to say:

STDC FENV_ACCESS {ON|OFF}
Patch in progress. I think ON should force the following:

except_behavior { strict }
fenv_access { on }
rounding_mode { dynamic }
Other modes should be unchanged.

Thanks,

Andy

> Yes, you’re probably right about this. I was originally thinking of FENV_ACCESS as a fully strict mode of operation, but what you’re suggesting aligns with what Cameron suggested and even some of my own reasoning on other points. So, let me amend my previous proposal to say:
>
> STDC FENV_ACCESS {ON|OFF}
>   Patch in progress. I think ON should force the following:
>
>     except_behavior { strict }
>     fenv_access { on }
>     rounding_mode { dynamic }
>   Other modes should be unchanged.

Does that apply to -ffp-model=strict too?

If FMFs are really orthogonal to -ffp-model=, then we shouldn't be
setting default values for the FMFs.

No, the fp-model options are intentionally umbrella/convenience options that set everything. So, -ffp-model=strict is intended to provide exactly the "fully strict" mode that I was incorrectly associating with FENV_ACCESS. That is, the FMFs are orthogonal to fenv_access, but no fp modes are orthogonal to fp-model.

You can, of course, override individual settings from the fp-model. For instance, "-ffp-model=strict -fassociative-math" could be allowed.
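
A rough sketch of the umbrella-plus-override behavior, combined with the left-to-right rule from the proposal. Everything here is hypothetical: the struct is a small subset of the semantic modes and is not clang's internal representation.

```c
#include <stdbool.h>

/* Hypothetical subset of the FP semantic modes from the proposal. */
enum except_behavior { EXCEPT_IGNORE, EXCEPT_MAY_TRAP, EXCEPT_STRICT };

struct fp_modes {
    enum except_behavior except_behavior;
    bool fenv_access;
    bool dynamic_rounding;
    bool allow_reassociation;
};

/* -ffp-model=precise: the default starting point. */
struct fp_modes fp_model_precise(void) {
    struct fp_modes m = { EXCEPT_IGNORE, false, false, false };
    return m;
}

/* -ffp-model=strict: an umbrella option, so it overwrites every mode. */
struct fp_modes apply_fp_model_strict(struct fp_modes m) {
    m.except_behavior = EXCEPT_STRICT;
    m.fenv_access = true;
    m.dynamic_rounding = true;
    m.allow_reassociation = false;
    return m;
}

/* -f[no-]associative-math: touches only the one mode it controls. */
struct fp_modes apply_associative_math(struct fp_modes m, bool on) {
    m.allow_reassociation = on;
    return m;
}
```

Applying the options in command-line order, "-ffp-model=strict -fassociative-math" ends with strict exception behavior and reassociation on, while the reverse order ends with reassociation off because the umbrella option comes last.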

Ok, that's good enough. Thanks for the clarification.

... math errno ...

I wouldn't recommend to anyone that they should rely on math errno (because I don't trust libraries to correctly support it). My goal here was to incorporate our existing support for it into the rest of what I'm trying to document.

My understanding is that for clang this primarily controls whether or not we feel free to substitute intrinsics for recognized math library calls. I don't know if we have any code in the optimizer that introduces access to or modification of errno in the user's program. The library calls that test should always act as barriers to one another.

> ideally there would be a way to support snan too. (e.g. isnan(x) cannot be turned into x!=x then)

The except_behavior mode is supposed to handle this. The LLVM support for constrained intrinsics is considering all manner of FP exceptions that could be raised, including the distinction between QNaN and SNaN. The default LLVM IR definition does not support this distinction.

We seem to have an issue with isnan() in clang though. If you call isnan() you get a call to __isnan() which should be fine (assuming the library does the right thing), but we're translating __builtin_isnan() to x!=x. That's not what we should be doing if except_behavior isn't "ignore".
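
To make the difference concrete, here is a sketch of the two ways to test for NaN. The function names are hypothetical, and the bit-pattern version assumes double is IEEE-754 binary64. This is not the code clang emits, only an illustration of the distinction.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Comparison-based test. Under IEEE-754, comparing a signaling NaN raises
   the invalid exception, so this is only safe when exceptions are ignored. */
int isnan_by_compare(double x) {
    return x != x;
}

/* Bit-pattern test: inspects the encoding without performing any
   floating-point operation, so no exception can be raised. */
int isnan_by_bits(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    uint64_t exponent = (bits >> 52) & 0x7FF;
    uint64_t fraction = bits & 0xFFFFFFFFFFFFFULL;
    return exponent == 0x7FF && fraction != 0;
}
```

Both agree on the answer for every input (for example NAN, INFINITY, and 1.0 from <math.h>); they differ only in whether a signaling NaN input can set the invalid flag.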

> excess precision handling is missing from this list which matters for x87 and m68k fpu support and may matter for _Float16 implementations that fall back to _Float32 arithmetic.

Yeah, we don't currently have any support for controlling that, at least in the x87 case. I think our current strategy is nothing more than setting FLT_EVAL_METHOD to reflect that we might not be using source precision for intermediate results. This is something we should consider adding.

> the granularity of these knobs is also interesting (expression, code block, function or translation unit), iso c pragmas work on code block level.

I'd have to defer to someone more familiar with the front end to say how that is handled.

> -ftrapping-math vs -ffp-exception-behavior=maytrap is unclear.

The "maytrap" setting is supposed to prevent the optimizer from introducing spurious exceptions (e.g. speculative execution) while still allowing it to optimize away potential exceptions (e.g. dce operations whose results are never used).
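
Two small sketches of what that means in source terms. The function names are hypothetical, and the comments describe the intent of "maytrap" as stated above, not verified optimizer behavior.

```c
/* Guarded division. Hoisting x / y above the check would raise a spurious
   divide-by-zero exception when y is 0, so under "maytrap" the optimizer
   must not speculate the division. */
double guarded_divide(double x, double y, double fallback) {
    if (y != 0.0)
        return x / y;
    return fallback;
}

/* Unused result. Under "maytrap" the division may be deleted as dead code
   even though it could have raised an exception; under "strict" it would
   have to be kept. */
double discard_quotient(double x, double y) {
    double dead = x / y;
    (void)dead;
    return x;
}
```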

> FLT_ROUNDS, FLT_EVAL_METHOD and math_errhandling are controlled by the c runtime, so a compiler has no business changing them, the compiler can define its own __FLT_ROUNDS, etc macros and the libc may or may not use those, but e.g. in case of FLT_ROUNDS it makes no sense for the compiler to try to do anything: the mode changes at runtime, the libc macro will expand to a function call that determines the current rounding mode. (same problem arises if you can change the other modes on a per function or code block granularity.)

I don't think I understand what you're saying here. FLT_ROUNDS, in particular, I thought was supposed to be implemented without reference to the runtime library. In clang we're mapping this to an intrinsic that gets a target-specific inline expansion. For example: Compiler Explorer

And FLT_EVAL_METHOD I take to be an indicator of how the compiler is handling intermediate results.

BTW There's no clang option for allow_approximate_fns; there is this option:

def fcuda_approx_transcendentals : Flag<["-"], "fcuda-approx-transcendentals">,
  Flags<[CC1Option]>, HelpText<"Use approximate transcendental functions">;
def fno_cuda_approx_transcendentals : Flag<["-"], "fno-cuda-approx-transcendentals">;

Should this option be added generally for clang?

* Kaylor, Andrew <andrew.kaylor@intel.com> [2020-01-29 23:54:46 +0000]:

>> ideally there would be a way to support snan too. (e.g. isnan(x) cannot be turned into x!=x then)
>
> The except_behavior mode is supposed to handle this. The LLVM support for constrained intrinsics is considering all manner of FP exceptions that could be raised, including the distinction between QNaN and SNaN. The default LLVM IR definition does not support this distinction.
>
> We seem to have an issue with isnan() in clang though. If you call isnan() you get a call to __isnan() which should be fine (assuming the library does the right thing), but we're translating __builtin_isnan() to x!=x. That's not what we should be doing if except_behavior isn't "ignore".

supporting exceptions and snan should be separate
as exception support is required by iso c annex f,
while snan is not (gcc has separate option for snan)

the assumption of the exception support should be
that all nan values are qnan unless snan support is
turned on.

>> FLT_ROUNDS, FLT_EVAL_METHOD and math_errhandling are controlled by the c runtime, so a compiler has no business changing them, the compiler can define its own __FLT_ROUNDS, etc macros and the libc may or may not use those, but e.g. in case of FLT_ROUNDS it makes no sense for the compiler to try to do anything: the mode changes at runtime, the libc macro will expand to a function call that determines the current rounding mode. (same problem arises if you can change the other modes on a per function or code block granularity.)
>
> I don't think I understand what you're saying here. FLT_ROUNDS, in particular, I thought was supposed to be implemented without reference to the runtime library. In clang we're mapping this to an intrinsic that gets a target-specific inline expansion. For example: Compiler Explorer
>
> And FLT_EVAL_METHOD I take to be an indicator of how the compiler is handling intermediate results.

the FLT_ROUNDS value can be inlined if fenv access is
off or changing the rounding mode is not supported,
otherwise it should expand to a runtime check which
can be a libc call (which is what glibc does nowadays).

FLT_EVAL_METHOD is trickier because of ts-18661, a libc
(or platform) may decide e.g. not to support _Float16
independently of what the compiler is doing and that
affects the value of FLT_EVAL_METHOD.

in any case i think it's better if the compiler had
predefined __FLT_EVAL_METHOD etc macros that a libc
may use in float.h if it decides to. (in hosted
mode the compiler float.h may not be used at all)