Hello,
at the recent EuroLLVM developer meeting in Bristol I held a BoF
session on the topic “Towards implementing #pragma STDC FENV_ACCESS”.
I’ve also had a number of follow-on discussions both on-site in
Bristol and online since. This post is intended as a summary of
my current understanding of the set of requirements and implementation
details covering the overall topic.
I’m posting this here in the hope this can serve as a basis for
the various more detailed discussions that are still ongoing
(e.g. in various Phabricator proposals right now). Any comments
are welcome!
Semantics of #pragma STDC FENV_ACCESS
Hi Ulrich,
I am interested in knowing if the current proposals also take into account the FP_CONTRACT pragma and the ability to implement options that imply a specific value for the FLT_EVAL_METHOD macro.
Additionally, I am not aware of the IR being able to represent the potentially deferred loss of precision that the C language semantics provide; in particular, applying such semantics to the existing IR would hit an issue that the limits of such deferment would need an agreed representation.
As for the mixing of strict and non-strict modes, I would be interested in where LLVM is in its handling of non-SSA (pseudo-memory?) dependencies. I have a vague impression that it is very coarse-grained in that respect, but I admit to not being particularly informed in that space. If there is a good model for such dependencies, then I think it could be used to handle the strict/non-strict mixing.
– Hubert Tong, IBM
PS A nitpick on wording: Within the C Standard, the idea of being inside or outside of FENV_ACCESS regions is instead expressed in terms of the state of the FENV_ACCESS pragma.
We should already do this (we turn relevant operations into the @llvm.fmuladd.* intrinsics when FP_CONTRACT is set to on during IR generation).
What do you mean by this?
-Hal
Hi Ulrich,
I am interested in knowing if the current proposals also take into account
the FP_CONTRACT pragma
We should already do this (we turn relevant operations into the
@llvm.fmuladd.* intrinsics when FP_CONTRACT is set to on during IR generation).
I am not sure we have the same interpretation of what the FP_CONTRACT
pragma does. Subclause 6.5 paragraph 8 of C11 implies (for example) that
even where the FENV_ACCESS pragma is "on", folding a constant subexpression
with an exactly representable result on an implementation where
FLT_EVAL_METHOD is 0 is within the range of acceptable
implementation-defined behaviour despite intermediate overflow under
non-contracted evaluation. Which is to say that the current proposal reads
as what needs to be done when FP_CONTRACT is "off" and FENV_ACCESS is "on".
The note from Ulrich implies that the requirements are imposed by the
Standard, but the range of implementation defined behaviour where
FP_CONTRACT is "on" where FENV_ACCESS is also "on" is possibly a discussion
to be had.
and the ability to implement options that imply a specific value for the
FLT_EVAL_METHOD macro.
What do you mean by this?
I admit that modes where FLT_EVAL_METHOD, respectively, is 0 (no extra
range and precision), 1 (float in double range and precision), and 2 (float
and double in long double range and precision) are all straightforward for
the IR producer to implement by fixing the types used in the IR emitted
(implying the value of FLT_EVAL_METHOD is not constant within a program).
So, this is more about implementing meaningful cases of FLT_EVAL_METHOD
being -1. My point below (in my previous note) is that allowing IR passes
or the back-end to choose the range and precision in a manner conforming to
Standard C (for a FLT_EVAL_METHOD of -1)--perhaps for speed where multiple
sets of floating-point operations/registers are available with differing
"preferred types"--appears to be a use case that the IR does not seem to
support well. As for why a FLT_EVAL_METHOD of -1 is on-topic for this
thread: The language semantics allow the case of the constant subexpression
folding I mentioned above even when FP_CONTRACT is "off" and FENV_ACCESS is
"on", because the evaluation format used for the evaluation of that
subexpression can be said to have infinite range and precision.
Thanks for explaining. Yes, I agree, this is certainly worth discussing. Do you have thoughts on what we should do? I think it makes sense to fold where possible, as the user has requested the extra intermediate precision available from FMA formation. Also, to what extent can we change our minds later? For example, with C++/constexpr, etc. does this have ABI implications?
Yes. In the LangRef we do have fpmath metadata, which might be useful in this space, but I don’t think we actually use it for anything.
Ah, interesting. FLT_EVAL_METHOD is a constant chosen (globally) by the implementation, correct? Do you know of platforms that set FLT_EVAL_METHOD to -1?
-Hal
Hi Hubert,
I had not really thought about FP_CONTRACT. As Hal mentioned, LLVM uses FP_CONTRACT only to allow use of floating-point multiply-and-add, and this would remain valid even when FENV_ACCESS is on.
However, it seems that FP_CONTRACT on would also allow constant folding to be performed even when FENV_ACCESS is on. This is what you’re referring to, right? This should certainly be considered as an option when implementing clang front-end support for FENV_ACCESS.
I have thought even less about using different values for FLT_EVAL_METHOD. However, one concern we found with this in the past (on s390x GCC) is that changing that value on an existing target may have ABI impact: glibc header files choose the definition of float_t and double_t based on the current value of FLT_EVAL_METHOD, and changing those types could result in an ABI break for applications using them in interfaces.
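The ABI concern can be sketched with a few lines of C. The struct and function names here are hypothetical; only FLT_EVAL_METHOD, float_t and double_t come from the standard headers.

```c
#include <float.h>   /* FLT_EVAL_METHOD */
#include <math.h>    /* float_t, double_t */
#include <stddef.h>  /* size_t */

/* Hypothetical struct exposed in a library interface. Its layout
   follows whatever float_t and double_t are on this target. */
struct sample {
    float_t  gain;
    double_t offset;
};

/* Size of the interface struct as seen by this translation unit. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}

/* The evaluation method the implementation reports. */
int eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}
```

The C standard ties these types to the macro: with FLT_EVAL_METHOD equal to 0, float_t is float and double_t is double; with 1, both are double; with 2, both are long double. Two objects compiled under different values could therefore disagree on the size and layout of struct sample.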
Finding a good model for non-SSA dependencies is indeed the main problem here. See e.g. the recent discussion here https://reviews.llvm.org/D45576 about possibly modeling FP status flags as “memory” at the MI level. I guess it might be possible to do the same at the IR level, but I’m less familiar with that part of LLVM.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
