RFC: Adding the AMD/GraphCore/[maybe others] float8 formats to APFloat

AMD and Graphcore have developed 8-bit floating point formats that differ somewhat from the ones already added to APFloat. Since we have hardware (current or, in AMD’s case, upcoming) that supports these formats, we need to represent them in APFloat so that we can add them to MLIR and target our 8-bit float instructions. (For example, an upcoming AMD GPU will have accelerated 8-bit floating point matrix math instructions that use the formats defined here; those instructions are already present in the intrinsics table.)

I have already published D141863, “[llvm][APFloat] Add NaN-in-negative-zero formats by AMD and GraphCore”, to add the formats to APFloat.

We’re posting this RFC to

  • Get an external reviewer on the patch
  • Raise awareness of alternate 8-bit floating point formats in order to prevent premature standardization on Nvidia’s proposal
  • Get comments on naming

The formats

In most respects, the formats we propose - Float8E5M2FNUZ and Float8E4M3FNUZ - are similar to the existing proposed types, most notably Float8E4M3FN.

Like Float8E4M3FN, we have one NaN pattern, which also serves as the infinity value. This means that overflow in either direction goes to NaN.

Unlike Float8E4M3FN, our formats have exactly one NaN value, which is unsigned, and uses the encoding normally used for negative zero. This means that, in our proposed formats, zero (and NaN) are unsigned. That is, all NaN values are represented by 0x80, and all zero values by 0x00.
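To make the encoding concrete, here is a minimal decoding sketch in Python. It is not the APFloat implementation. The function name `decode_fnuz` is hypothetical, and the exponent bias of 2^(exponent bits - 1), i.e. 8 for Float8E4M3FNUZ and 16 for Float8E5M2FNUZ, is an assumption on my part that matches the "minimum exponents one smaller" description in this post.

```python
import math


def decode_fnuz(byte, exp_bits, man_bits):
    """Decode one byte of an FNUZ-style float8 value (illustrative only)."""
    assert exp_bits + man_bits == 7, "one sign bit plus seven other bits"
    # The bit pattern that would normally be negative zero is the only NaN.
    if byte == 0x80:
        return math.nan
    bias = 1 << (exp_bits - 1)  # assumed bias: 8 for E4M3FNUZ, 16 for E5M2FNUZ
    sign = -1.0 if byte & 0x80 else 1.0
    exp_field = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man_field = byte & ((1 << man_bits) - 1)
    if exp_field == 0:
        # Zero and subnormals: no implicit leading one.
        return sign * man_field * 2.0 ** (1 - bias - man_bits)
    # Normal numbers. The all-ones exponent is an ordinary exponent here,
    # because there is no infinity and no IEEE-style NaN space.
    return sign * (1 + man_field / (1 << man_bits)) * 2.0 ** (exp_field - bias)
```

Under those assumptions, `decode_fnuz(0x00, 4, 3)` is zero, `decode_fnuz(0x80, 4, 3)` is NaN, and `decode_fnuz(0x40, 4, 3)` is 1.0.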

This choice allows more of the 256 values an 8-bit float can take on to be used for numbers that are meaningful at these low precisions.
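As a rough illustration of that claim, the sketch below counts how many of the 256 bit patterns encode a finite number under each layout. The function names are hypothetical. The Float8E4M3FN rule (NaN is the all-ones exponent and mantissa, with either sign) and the IEEE-style Float8E5M2 rule (an all-ones exponent means infinity or NaN) are my understanding of the existing formats, not something defined in this post.

```python
def finite_encodings_fnuz():
    # FNUZ formats: only 0x80 is NaN, everything else is a finite number.
    return sum(1 for b in range(256) if b != 0x80)


def finite_encodings_e4m3fn():
    # Assumed Float8E4M3FN rule: the low seven bits all set means NaN.
    return sum(1 for b in range(256) if (b & 0x7F) != 0x7F)


def finite_encodings_e5m2_ieee():
    # Assumed IEEE-style Float8E5M2 rule: all-ones exponent is inf or NaN.
    return sum(1 for b in range(256) if ((b >> 2) & 0x1F) != 0x1F)


print(finite_encodings_fnuz())       # 255
print(finite_encodings_e4m3fn())     # 254 (includes both +0 and -0)
print(finite_encodings_e5m2_ieee())  # 248
```

Note that the 254 finite Float8E4M3FN encodings include two zeros, so the number of distinct numeric values there is one lower still.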

In addition, our formats use minimum exponents one smaller than those used by the existing Float8 formats. In the case of Float8E5M2FNUZ, this was achieved by dropping support for IEEE NaN and infinity, just like in Float8E4M3FN. For Float8E4M3FNUZ, this was a design choice - compared to Float8E4M3, we have a smaller overall range but higher precision.
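The effect on range can be sketched numerically. The table of parameters below is an assumption on my part (biases of 15 and 7 for the existing Float8E5M2 and Float8E4M3FN, 16 and 8 for the FNUZ variants, and the largest exponent/mantissa fields that still encode a finite number); the helper `limits` is hypothetical and only evaluates the usual normal-number formula.

```python
from fractions import Fraction


def limits(man_bits, bias, top_exp_field, top_man_field):
    """Return (smallest normal, largest finite) as exact fractions."""
    min_normal = Fraction(2) ** (1 - bias)
    max_finite = (1 + Fraction(top_man_field, 1 << man_bits)) * Fraction(2) ** (
        top_exp_field - bias
    )
    return min_normal, max_finite


# (mantissa bits, assumed bias, largest finite exponent field, largest finite mantissa field)
FORMATS = {
    "Float8E5M2": (2, 15, 30, 3),      # exponent field 31 is inf/NaN
    "Float8E5M2FNUZ": (2, 16, 31, 3),  # exponent field 31 is an ordinary exponent
    "Float8E4M3FN": (3, 7, 15, 6),     # mantissa 7 at exponent 15 is NaN
    "Float8E4M3FNUZ": (3, 8, 15, 7),   # every mantissa at exponent 15 is finite
}

for name, params in FORMATS.items():
    min_normal, max_finite = limits(*params)
    print(f"{name}: min normal = {min_normal}, max finite = {max_finite}")
```

With these assumed parameters, Float8E5M2FNUZ keeps the same maximum as Float8E5M2 (57344) while gaining one binade at the bottom, and Float8E4M3FNUZ trades a lower maximum (240 versus 448) for a smaller minimum normal value.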


I’ve mostly followed the existing format for float8 types - Float8E[exponent bits]M[mantissa bits][flags]. As in the existing format, the FN suffix is used for “Finite, NaN-only”, and we have added UZ for “Unsigned Zero”.


Graphcore also has hardware support for these formats. The ISA for this is available here.

If I hear no objections, I plan to land this in seven days (Wednesday, February 8, 2023).

I have no objections and thank you for raising visibility on the patch. I think it is useful for APFloat to support all the reasonable formats in the wild, and this certainly qualifies. That said, a time delay threat isn’t a super great way to motivate reviewers to look at the patch :slight_smile:


Thanks, Chris!

I appreciate everyone’s feedback, especially in catching the typos that don’t show up as easily in screen reader output.

On the time delay, I got the idea from PSA: Retire Linalg fusion-on-memrefs over in MLIR, which was an “I’ll be landing this breaking change in N weeks barring objections”. If that sort of thing is bad form over on the LLVM side, my apologies.

I think statements like “I plan to land this in N days (barring objections)” are a very valuable way of communicating. I’m pretty sure it wasn’t intended as a threat, and I don’t think it should be read as one.

Taking a bit of a broader swipe here: LLVM as a software development project is notoriously bad at governance and is pretty much devoid of legible leadership. Like, horribly bad. The number of times people bring up suggestions and there’s discussion and then it just sort of fades out without a clear resolution is way too high. Folks far too often can’t get a clear “Yes / No / Yes with Adjustments / No but take this Route” resolution.

Personally, I read the kind of timing announcement as a well-intentioned way of coordinating and setting expectations in front of this background of bad governance.

The intent was likely to communicate “the patch is accepted and ready to land, but considering the wide impact and the fact that the RFC didn’t get many comments, I’d like to give it another week for people to chime in here if they have an opinion or would like to delay landing this for more consideration. So please voice your interest if you have one!”

Another factor was that, even though I had an “Accepted” on my patch, it was from someone invested in landing the feature who had themselves written an equivalent patch. I posted the RFC to get more attention from independent reviewers who might have an opinion on whether this should be in LLVM, whether the name was good, and so on.

I agree with @nhaehnle that there’s really not a clear answer to “what does it mean if no one says anything on an RFC?”. I’m not sure what the solution is here, but maybe it’d be a good idea to have Rust-style final comment periods as an explicit part of the “I’m looking to land a major chunk of code/break something” process and something less weighty - perhaps some guidance on how long to wait between asking for feedback and landing new features anyway.