I’ve been putting together a proposal on “fixing” the fast-math flags (at least, the flags that aren’t easily formalizable the way nnan/ninf/nsz are), so I don’t want to steer the discussion too much in that direction yet.
That’s more-or-less what I want with ‘nnan’ and ‘ninf’ – I want a weak form of these flags that enables semantic-breaking transformations without introducing any other changes to the semantics of instructions on which they are present.
I understand, appreciate, and sympathize with this ask. However, I am extremely cautious about replacing semantics that are well-defined and can be incorporated into formal semantics tools with semantics that are handwave-y and vague. Prior experience with handwave-y semantics (e.g., pointer provenance) suggests that the end state is a set of optimizations that each make sense individually but combine to produce a clear error. Using reassoc as your example is, I think, an unintentional illustration of the problem: reassoc is one of the existing flags where the optimizations we are adding to it end up being surprising to users.
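To make the reassoc point concrete, here is a minimal IR sketch (illustrative only; the function and value names are mine):

```llvm
define float @sum3(float %a, float %b, float %c) {
  ; 'reassoc' permits reassociating these operations, e.g. rewriting
  ; (%a + %b) + %c into %a + (%b + %c), which can change the rounded result.
  ; Users who expect "a little" imprecision are regularly surprised by how
  ; far chains of such rewrites are allowed to drift.
  %t = fadd reassoc float %a, %b
  %r = fadd reassoc float %t, %c
  ret float %r
}
```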
That said, I am not an expert in formal semantics; I would appreciate feedback from those who are (e.g., RalfJung or Nuno Lopes). From what I can find when searching for formalizations of FP semantics, there isn’t much precedent in the academic literature for formalizing fast-math flags in the vein of reassoc (nnan-like flags, because they specify the results the operation may give, are formalizable). The only example I see is one group that used a rewrite-based semantics for fast-math flags, and I’m not sure how reliable rewrite-based semantics are (especially because I doubt we can ever enumerate the permissible rewrites well enough).
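By contrast, here is roughly what I mean by nnan-like flags being value-level and therefore formalizable (this is my paraphrase of the LangRef rule, nothing new):

```llvm
define float @add_no_nans(float %a, float %b) {
  ; 'nnan' is stated directly in terms of values: if an argument is a NaN,
  ; or the result would be a NaN, the result is poison; otherwise it is the
  ; ordinary IEEE result.  That is a predicate a formal model can state
  ; without appealing to an open-ended set of permitted rewrites.
  %r = fadd nnan float %a, %b
  ret float %r
}
```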
This is going off in yet another direction, but I have a problem with ‘afn’ in that we don’t say anything about just how close the approximation needs to be. Can I replace ‘sin(x)’ with zero? It’s an approximation. I think we’d all agree that it’s not a sufficiently accurate approximation, but I don’t know that we have any rule for how much error is allowed.
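A minimal IR sketch of what bothers me (the intrinsic call is just an example):

```llvm
declare float @llvm.sin.f32(float)

define float @approx_sin(float %x) {
  ; 'afn' allows this call to be replaced by an approximate calculation, but
  ; the flag itself places no bound on how inaccurate the approximation may
  ; be; nothing in the IR rules out an absurdly bad substitute.
  %y = call afn float @llvm.sin.f32(float %x)
  ret float %y
}
```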
Of the existing flags, afn is the one I propose should be scrapped in its current semantics entirely and replaced instead with a flag that generally allows algebraic expression transformations (which would cover most of its existing transformations and provide a better place for optimizations like sin(x)/cos(x) => tan(x)).
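A sketch of the kind of rewrite such a flag would sanction (the flag itself is hypothetical, so the IR below just shows the input pattern using plain libm calls):

```llvm
declare float @sinf(float)
declare float @cosf(float)

define float @tangent(float %x) {
  ; Under a hypothetical "algebraic rewrites" flag, the whole sequence below
  ; could be folded into a single call to tanf; today that fold has to be
  ; justified by some combination of the existing flags.
  %s = call float @sinf(float %x)
  %c = call float @cosf(float %x)
  %r = fdiv float %s, %c
  ret float %r
}
```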
Going back to my original case with reciprocal approximation of division, we’re using Newton-Raphson refinement because just inserting the rcpps instruction isn’t close enough. The ‘arcp’ flag allows the use of a reciprocal, but an unwritten (I think) rule tells us that the answer needs refinement.
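For anyone who hasn’t seen it, the refinement looks roughly like this (the estimate is modeled with a placeholder call; on x86 it would come from rcpps, which is only accurate to roughly 12 bits):

```llvm
; @rcp_estimate is a stand-in for a target reciprocal-estimate instruction;
; it is not a real intrinsic.
declare float @rcp_estimate(float)

define float @div_via_recip(float %x, float %d) {
  %r0 = call float @rcp_estimate(float %d)
  ; One Newton-Raphson step: r1 = r0 * (2 - d*r0), which roughly doubles the
  ; number of correct bits in the estimate.
  %t  = fmul float %d, %r0
  %e  = fsub float 2.0, %t
  %r1 = fmul float %r0, %e
  ; x/d becomes x * (1/d).
  %q  = fmul float %x, %r1
  ret float %q
}
```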
More broadly, I think there are a few issues we’re running into right now:
- We have no general way of identifying acceptable precision loss. There is a “recip-estimates” function attribute (which looks to be designed to be compatible with gcc’s -mrecip flag) that can enable the use of approximate-reciprocal instructions, but it suffers from all the problems of function attributes for fast math.
- The “fast” flag is treated as equivalent to the conjunction of all the individual fast-math flags (see the sketch after this list). This means any optimization you want to shove into the bucket of “trade FP precision for speed” more or less has to be defined as a conjunction of existing semantics, and there are several transformations that do not sit happily in any existing flag.
- We are out of fast-math bits in llvm::Value. FMF sits in SubclassOptionalData, which has but 7 bits, and all 7 bits are defined.
- Existing fast-math flags are probably deviating from user expectations in uncomfortable ways.
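Regarding the “fast”-as-conjunction point above, this is what I mean (the equivalence is how the IR spells it today; LangRef says fast implies all of the other flags):

```llvm
define float @same_semantics(float %a, float %b) {
  ; Spelling out the conjunction: these two instructions carry exactly the
  ; same semantics, because 'fast' implies every individual flag.
  %r1 = fadd fast float %a, %b
  %r2 = fadd reassoc nnan ninf nsz arcp contract afn float %a, %b
  ret float %r1
}
```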