Denormal-fp-math and fast-math

This has been discussed extensively in PR 80475, but I’d like to have a discussion here to capture what we’ve agreed upon there and clarify a few things I’m not sure about.

We have a function attribute, “denormal-fp-math” (and also “denormal-fp-math-f32” but I’m treating them as the same in this discussion). According to the LLVM Language Reference this attribute “indicates the denormal (subnormal) handling that may be assumed for the default floating-point environment.” My understanding is that this attribute isn’t intended to introduce any explicit actions that change the default denormal handling, such as setting the FTZ and DAZ bits in the MXCSR register on x86-based targets, but is rather a guide to optimizations and instruction selection telling them what transformations are allowed and what instructions may be selected to represent an operation.

In clang, we have a command-line option, -fdenormal-fp-math that controls this directly. There is, however, some ambiguity about how this relates to the various umbrella fast-math options. The Clang User’s Manual is currently unhelpful on this point, and the implementation is similarly sloppy.

A related issue is whether or not we link “crtfastmath.o”, which is what PR 80475 was intended to update. In that PR, I think we had a consensus that whether or not we are linking “crtfastmath.o” should not be connected to how the front end sets the “denormal-fp-math” attribute. Different functions can be compiled with different floating-point controls – even within the same compilation unit – and there is no way to know for any function what the dynamic denormal behavior will be for targets (such as x86) that have a dynamic control mode.

I think we have agreed on the following points:

  1. The default setting for “denormal-fp-math” is target dependent and the driver can’t make any assumptions about it
  2. The options -ffast-math, -fno-fast-math, -funsafe-math-optimizations, -fno-unsafe-math-optimizations, and -ffp-model should not change the “denormal-fp-math” setting
  3. For targets where setting FTZ and DAZ is desirable, the driver should decide whether or not to link “crtfastmath.o” based on the -ffast-math, -fno-fast-math, -funsafe-math-optimizations, -fno-unsafe-math-optimizations, and -ffp-model command-line options
  4. When compiling shared libraries, the driver should never link crtfastmath.o
  5. A new option, -m[no-]daz-ftz, may be used to override the default behavior of linking “crtfastmath.o”
  6. Even when we are linking “crtfastmath.o” the default setting for the “denormal-fp-math” attribute should be “IEEE”
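
To make points 3 through 6 concrete, here is a minimal sketch in C of the decision as I understand it. This is not clang's driver code: the struct, the enum, and both function names are hypothetical, and I am assuming that "never" in point 4 means the shared-library rule wins even over an explicit -mdaz-ftz, which the list above does not actually settle.

```c
#include <stdbool.h>

/* Hypothetical tri-state for the proposed -m[no-]daz-ftz option. */
enum daz_ftz_opt { DAZ_FTZ_UNSET, DAZ_FTZ_ON, DAZ_FTZ_OFF };

/* Hypothetical summary of driver state; not clang's real data structure. */
struct driver_state {
    bool target_has_crtfastmath; /* target ships crtfastmath.o (point 3) */
    bool fast_math_enabled;      /* net effect of the fast-math umbrella options */
    bool building_shared;        /* producing a shared library */
    enum daz_ftz_opt daz_ftz;    /* explicit override, if any */
};

/* Decide whether crtfastmath.o goes on the link line. */
bool should_link_crtfastmath(const struct driver_state *s) {
    if (!s->target_has_crtfastmath)
        return false;                      /* nothing to link on this target */
    if (s->building_shared)
        return false;                      /* point 4 (assumed to beat point 5) */
    if (s->daz_ftz != DAZ_FTZ_UNSET)
        return s->daz_ftz == DAZ_FTZ_ON;   /* point 5: explicit override */
    return s->fast_math_enabled;           /* point 3: follow fast-math options */
}

/* Point 6: the attribute default does not depend on the link decision. */
const char *default_denormal_fp_math(const struct driver_state *s) {
    (void)s; /* deliberately unused: linking crtfastmath.o changes nothing here */
    return "ieee";
}
```

The point of splitting this into two functions is that the attribute default never reads the link decision, which is the decoupling agreed on in PR 80475.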

This last point may not be obvious. The attribute has a “dynamic” setting which isn’t documented as a choice for the corresponding clang command-line option. The “dynamic” setting indicates that the compiler does not know whether denormals will be flushed to zero at execution time. We agreed that the default should be “IEEE” because this may enable some optimizations, and users who care about numeric consistency are likely to prefer “IEEE”. I’m not entirely convinced about this, but I can live with this if it is the consensus.

I’m working on a PR to fix the behavior of the -ffp-model command-line option with regard to denormals and linking “crtfastmath.o”. I’ll update it once PR 80475 lands and tidy up the rest of the fast-math-related handling there too. I just have a couple of questions.

Should -fdenormal-fp-math=preserve-sign cause “crtfastmath.o” to be linked, on platforms where it is available, even if no other fast-math options are used? Should -ffast-math -fdenormal-fp-math=ieee prevent linking “crtfastmath.o”?

Would it be better to insert an intrinsic in main() (or equivalent entry functions) to set FTZ/DAZ rather than linking “crtfastmath.o”?
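
For anyone not familiar with what is being set, here is a rough C sketch of the mode change in question on x86-64: OR the FTZ (bit 15) and DAZ (bit 6) flags into MXCSR. The helper names are mine and purely illustrative, the sketch deliberately handles only x86-64 (where scalar float math uses SSE), and it is not the actual source of “crtfastmath.o” or a proposal for what an inserted intrinsic would lower to.

```c
#include <float.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_getcsr / _mm_setcsr */
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0
#endif

#define MXCSR_DAZ 0x0040u /* bit 6: treat denormal inputs as zero */
#define MXCSR_FTZ 0x8000u /* bit 15: flush denormal results to zero */

/* Pure helper: the MXCSR value with both flush bits set. */
unsigned mxcsr_with_ftz_daz(unsigned csr) {
    return csr | MXCSR_FTZ | MXCSR_DAZ;
}

/* Set FTZ and DAZ for the calling thread; returns false where unsupported. */
bool enable_ftz_daz(void) {
#if HAVE_MXCSR
    _mm_setcsr(mxcsr_with_ftz_daz(_mm_getcsr()));
    return true;
#else
    return false;
#endif
}

/* FLT_MIN * 0.5f is subnormal under IEEE rules; volatile blocks folding. */
float half_of_smallest_normal(void) {
    volatile float x = FLT_MIN;
    volatile float half = 0.5f;
    return x * half;
}
```

Calling half_of_smallest_normal() before and after enable_ftz_daz() shows the problem from the compiler's side: the same compiled function gives a different answer depending on a mode it cannot see.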

Any other comments or corrections on my summary above?

The PR to fix the fp option rendering is now updated: “Clean up denormal handling with -ffp-model, -ffast-math, etc.” (llvm/llvm-project Pull Request #89477).