Add LLVM type support for fp8 data types (F8E4M3 and F8E5M2)

Following the previous RFC and implementation that added the FP8 data types to APFloat and MLIR, we propose adding supporting code to LLVM to enable a CodeGen path for the two FP8 data types (F8E4M3 and F8E5M2). We have prepared a draft of the approach here: D140088 Add LLVM type support for fp8.


Hardware support for native FP8 conversions is vital for reducing memory bandwidth usage and boosting performance. Currently, native FP8 conversion is supported on NVIDIA GPUs with compute capability SM90 (Hopper GPUs) or higher, resulting in a considerable performance boost and lower memory consumption. This requires special hardware instructions (e.g. cvt.rn.satfinite.e5m2x2.f16x2) and greatly benefits from having FP8 data types in LLVM IR.
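To make the conversion concrete: F8E5M2 has the same sign and exponent layout as IEEE half (1 sign bit, 5 exponent bits, bias 15) and keeps only the top 2 mantissa bits, so an E5M2 byte is the high byte of a half. A minimal Python sketch under that assumption follows. The helper names are made up for illustration and are not LLVM or CUDA APIs, and the narrowing here truncates (rounds toward zero) instead of implementing the round-to-nearest, saturating behavior of cvt.rn.satfinite.

```python
import struct


def e5m2_to_float(code: int) -> float:
    """Decode an F8E5M2 byte by widening it to an IEEE half.

    The byte becomes the high byte of the half; the 8 extra mantissa
    bits are zero, so the widening is exact.
    """
    half_bits = (code & 0xFF) << 8
    return struct.unpack("<e", struct.pack("<H", half_bits))[0]


def float_to_e5m2_truncate(value: float) -> int:
    """Hypothetical narrowing: keep the high byte of the half encoding.

    This rounds toward zero. Values outside the half range raise
    OverflowError from struct.pack, so this is not a saturating convert.
    """
    half_bits = struct.unpack("<H", struct.pack("<e", value))[0]
    return half_bits >> 8
```

For example, `e5m2_to_float(0x3C)` is 1.0 and `float_to_e5m2_truncate(1.3)` gives 0x3D, which decodes to 1.25.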


The proposed patch integrates two FP8 data types into the IR backend, adding the necessary layouts and parser requirements, as well as the necessary bitcode writer and reader pieces. This approach to adding a new floating-point type follows the precedent set when the BFloat16 and half data types were integrated. Also, the addition of the APFloat data types in the preceding RFC leads naturally to the introduction of IR FP8 data types.


The main caveat of adding the new floating-point types this way is the lack of generalization and the extensive code changes required. This has been brought up in the accompanying draft and has been discussed there to some extent. The alternative approach is to consolidate the FP types into a single parametrized one, easing the process of adding new data types and resulting in fewer code changes and a more structured architecture moving forward (assuming yet further proliferation of FP types). That being said, a consolidated FP type would require a much larger overhaul, far beyond the changes proposed here. Thus, we propose proceeding with adding the FP8 types in a way that is congruent with the existing structure, and discussing the consolidated FP type approach in a separate RFC.
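As a rough picture of what "a single parametrized FP type" could mean, here is a small Python sketch in which one decoder is driven by a format description instead of one code path per type. The class and field names are hypothetical and this is not how D140088 or LLVM structures its types; it only assumes the usual bit layouts (sign, biased exponent, mantissa) and that the `fn` variant of E4M3 has no infinities and a single NaN pattern per sign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    """Hypothetical description of a small binary floating-point format."""
    name: str
    exp_bits: int
    man_bits: int
    bias: int
    has_inf: bool  # False: only exponent and mantissa all-ones is NaN

    def decode(self, code: int) -> float:
        sign = -1.0 if (code >> (self.exp_bits + self.man_bits)) & 1 else 1.0
        exp_max = (1 << self.exp_bits) - 1
        man_max = (1 << self.man_bits) - 1
        exp = (code >> self.man_bits) & exp_max
        man = code & man_max
        if exp == exp_max:
            if self.has_inf:
                return sign * float("inf") if man == 0 else float("nan")
            if man == man_max:
                return float("nan")
            # Otherwise fall through: a normal finite value in this format.
        if exp == 0:
            # Zero and subnormals.
            return sign * man * 2.0 ** (1 - self.bias - self.man_bits)
        return sign * (1 + man / (1 << self.man_bits)) * 2.0 ** (exp - self.bias)


# Adding a format is one line of data, not a new code path.
HALF = FloatFormat("half", exp_bits=5, man_bits=10, bias=15, has_inf=True)
F8E5M2 = FloatFormat("f8e5m2", exp_bits=5, man_bits=2, bias=15, has_inf=True)
F8E4M3FN = FloatFormat("f8e4m3fn", exp_bits=4, man_bits=3, bias=7, has_inf=False)
```

With these definitions, `F8E4M3FN.decode(0x7E)` is 448.0 (the largest finite value) while `F8E5M2.decode(0x7C)` is infinity.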


(assuming yet further proliferation of FP types)

It looks like two more floating-point types f8e4m3fzn and f8e5m2fzn are being added in D141432 Add two additional float8 types to MLIR and APFloat, so it seems like further proliferation of FP types is a pretty safe bet.

Edit: With that in mind, your f8e4m3 type should probably be called f8e4m3fn on the IR level as well, as it looks like another variant of the same type is imminent.

My concerns:

  • Do we actually need 8-bit float types in IR? Most of the benefit of having dedicated types in IR (as opposed to just using i8 or a target extension type) comes from having floating-point operations on those types, but as far as I can tell, 8-bit types don’t support floating-point arithmetic.
  • Dealing with the code duplication issues seems more urgent if we’re adding 4 more floating-point types.
  • As mentioned on the review, these would be the first floating-point values in LLVM IR that don’t have infinity; I think we need to address more explicitly what work is required to deal with that.
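To illustrate the last point, the sketch below enumerates every encoding of the E4M3 variant without infinities (the one referred to above as f8e4m3fn), assuming 1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, and NaN only when exponent and mantissa are all ones. The function name is made up for illustration.

```python
import math


def decode_e4m3fn(code: int) -> float:
    """Decode one f8e4m3fn byte under the layout assumed above."""
    sign = -1.0 if code & 0x80 else 1.0
    exp = (code >> 3) & 0xF
    man = code & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan
    if exp == 0:
        return sign * man * 2.0 ** -9  # zero and subnormals
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)


values = [decode_e4m3fn(code) for code in range(256)]
nan_codes = [code for code, v in enumerate(values) if math.isnan(v)]
has_infinity = any(math.isinf(v) for v in values)
max_finite = max(v for v in values if not math.isnan(v))
```

Here `has_infinity` is False, `nan_codes` contains only 0x7F and 0xFF, and `max_finite` is 448.0. Anything in the compiler that can produce an infinity for other types (overflowing conversions, constant folding of a division by zero, and so on) would therefore need an explicit policy for this type, such as returning NaN or saturating to ±448.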

Just realized this is a bit ambiguous. By this, I mean instruction sets that support 8-bit floats don’t have native 8-bit instructions for floating-point add/multiply/etc., only conversions and matrix ops.

I believe Clang calls them storage types. Don’t touch them by hand. You will need intrinsics for all operations.

The native conversion coming with SM90 is the reason behind the propagation. Offloading the conversion to the device kernel helps by (1) reducing the bandwidth by a factor of two and (2) providing a faster native conversion. That’s why we would like to push the data type further down the pipeline.

I used “storage-only types” with this meaning in a conversation about FP8 recently, and was corrected that “if there are intrinsics or these types can be passed by value, then they’re not storage-only”, so I’m not sure where this boundary lies. The passing-by-value is one of the first places that it’s important to handle these types reasonably because that can be ABI-breaking.

Sure. There will be calling conventions for all the FP8 types. My point was that all arithmetic has to go through intrinsics.

If the type is storage-only, then why is a simple i8 not enough?

This is a really good point. When am I allowed to perform arithmetic on i8s? When am I only allowed to use intrinsics?

Or is this the solution for the next 20 ML types:

I think there is a subtle boundary. E.g., on X86, we set __always_inline__ on all intrinsics. So it is not a problem to use “storage-only types” with them, given that they are always inlined.
There is no ABI problem if we don’t really have a function call with the type.
A similar example is __fp16, which is a storage-only type that is explicitly forbidden from being passed by value in the front-end. I don’t know what happens if we declare intrinsics with the type, and I’m not an FE expert, but I think there are ways to discriminate always-inline intrinsics from normal functions.

+1 for this. Back when we added support for __fp16, although we introduced a new IR type half, we treated it as i16 when handling the ABI in backends (we didn’t really have an ABI definition at that time). This turned out to be a disaster when we planned to support the true ABI type _Float16, because they share the same half type in the IR. We had to make an ABI-breaking update to support it.
So if we want to introduce a storage type in C/C++, using i8 in IR is a good choice. If we do want to introduce a new IR type, we’d better declare that its C/C++ type is an ABI type. Similar to _Float16, we can enable it only for the few targets that have defined an ABI for it.

Well, before half was introduced, it was just plain i16 with target-specific intrinsics to convert to / from float, and all operations were lowered by the frontend as operations on the float type. At the time it was introduced there were no platforms that supported “proper” half operations; it was ARM-only and storage-only. So the approach with half was to make it more “target neutral”, but it was not quite complete, yes.

So, I’d say we’d simply use i8 + intrinsics for all storage-only types. It seems to be enough for all purposes.
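A toy model of the “i8 + intrinsics” approach, to show where the boundary sits: the values live in plain bytes, and every arithmetic operation goes through explicit conversions. This is a sketch only. The function names are hypothetical stand-ins for target intrinsics, the format is assumed to be F8E5M2 (the high byte of an IEEE half), and the narrowing picks the nearest finite value with ties going to the even encoding and overflow saturating. NaN inputs are not handled.

```python
import struct


def e5m2_to_float(code: int) -> float:
    """Stand-in for a widening intrinsic: an E5M2 byte is the top byte of a half."""
    return struct.unpack("<e", struct.pack("<H", (code & 0xFF) << 8))[0]


# All finite encodings (exponent field not all ones), with their values.
_FINITE = [(c, e5m2_to_float(c)) for c in range(256) if (c & 0x7C) != 0x7C]


def float_to_e5m2(value: float) -> int:
    """Stand-in for a narrowing intrinsic (nearest, ties to even, saturating)."""
    return min(_FINITE, key=lambda cv: (abs(cv[1] - value), cv[0] & 1))[0]


def add_e5m2(a: bytes, b: bytes) -> bytes:
    """Element-wise add: widen, add in a wider type, narrow back to storage."""
    return bytes(
        float_to_e5m2(e5m2_to_float(x) + e5m2_to_float(y)) for x, y in zip(a, b)
    )


# Integer arithmetic on the storage type is meaningless for these values:
# 0x3C encodes 1.0, but 0x3C + 0x3C = 0x78, which decodes to 32768.0.
naive_sum = e5m2_to_float((0x3C + 0x3C) & 0xFF)
```

For instance, `add_e5m2(bytes([0x3C]), bytes([0x3C]))` yields `bytes([0x40])`, i.e. 1.0 + 1.0 = 2.0, whereas `naive_sum` is 32768.0. With i8 as the storage type, nothing in the IR prevents the second form, which is the ambiguity raised earlier in the thread.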

I would like to add that not all LLVM IR is generated by Clang. There is MLIR in-tree and many language frontends out of tree. To me, the semantics of i8 become ambiguous. What about the FPGA guys who need some F7? I believe that opaque types are a nice extension of LLVM. They support almost any type. With an opaque type, there is no question of whether it is i8 or some custom F8 variant.

The interesting thing is that the suggestion to use i8 is in fact meant to protect all the other frontends rather than Clang. The ABI break in the half example above is a break for all frontends except Clang, because Clang had already banned passing __fp16 by value.
The underlying reason is that the default calling convention in LLVM is the C calling convention. Introducing an LLVM type before its ABI is ready is risky. The ideal way is to ban passing it by value in all frontends. Otherwise, the calling convention is UB or implementation-defined, and no backward compatibility is guaranteed.
I know nothing about opaque types. I guess they may enforce that prohibition as well. In that case, there will be no ABI issue. We don’t need to distinguish i8 from f8 as long as they are not passed by value.

Opaque types are discussed here:

Thanks @tschuett for the link.
I took a quick look at the RFC and implementation. The doc says it can be used as function parameters or arguments, but I didn’t find how it handles the calling convention. I assume it is always passed by pointer. In that case, it is equivalent to prohibiting passing such types (especially non-ABI types) by value, which matches my assumption above.

I guess opaque types are inherently an LLVM IR concept. How you do calling conventions is up to you, or rather your target in the backend. It is just a vehicle for custom types in LLVM IR.

Then it won’t solve the compatibility problem I mentioned above. A target always has its own implementation-defined behavior when lowering arguments. However, the behavior is UB if the type’s ABI hasn’t been settled.
The lowering of the half type on X86 is a typical case where the implementation came ahead of the ABI definition and conflicts with it: Compiler Explorer
When a type is passed by value, the register it is passed in must be determined. However, FEs cannot rely on such implementation-defined behavior, because it might change once the ABI is settled. So FEs must take care with types that have no ABI definition.
Unfortunately, other FEs usually assume all types are defined in Clang’s ABI, which makes them inevitably suffer a loss of backward compatibility when they adopt new types early.

I totally agree with all you said. The issue is that Clang is the reference for ABIs. One of my favourite results of this state: Wrong cast of u16 to usize on aarch64 · Issue #97463 · rust-lang/rust · GitHub