RFC: [C++23] P1467R9 - Extended floating-point types and standard names

I have implemented support for extended floating-points from the C++23 standard in Clang and would like to discuss the changes for upstreaming. As this is my first time contributing to Clang and working with language proposals, please bear with me if there are any mistakes. I will promptly address your feedback and iterate on the changes.

Motivation

Clang currently supports three standard floating-point types: float, double and long double. Every other floating-point type, such as half, _Float16, __bf16, __float128 and __ibm128, is a compiler/language extension and/or a storage-only type (as __bf16 is). The motivation for this change is to extend the __bf16 type in Clang so that it is also an arithmetic type conforming to the C++23 standard, per the accepted proposal P1467R9 - Extended floating-point types and standard names.

Core language changes

This change is specific to C++23, but backward compatibility will be maintained. The initial patch will implement the core language changes of the proposal; a subsequent patch will implement the C++ standard library changes, which mostly involve defining overloads for the new extended floating-point types and creating std typedefs for the extended floating-point types implemented by the compiler (e.g. std::float16_t, std::bfloat16_t).

Below is a distilled version of the proposal together with the proposed Clang code changes. All of the core language changes apply only to interactions between floating-point types as defined by this proposal (i.e. __ibm128 does not count, and neither does __float128 initially) and are guarded by the CPlusPlus2b language option in the compiler, i.e. they apply strictly to C++23 and beyond.

1. Literal suffixes

Add suffixes for extended floating-point literals. By default, floating-point constants are of type double, and this proposal does not allow implicit conversions involving extended floating-point types that would be lossy. The suffixes are therefore needed to allow something like: std::bfloat16_t value = 1.0bf16;

  • Suffix for __bf16 will be added as bf16 and BF16.
  • Suffix for _Float16 already exists as f16 and F16.
  • Suffix for __float128 needs to be added as f128 and F128; the current q/Q suffix will be retained for backward compatibility.

Proposed code changes:

  • Add bool isBFloat16 : 1; to NumericLiteralParser class in /llvm-project/clang/include/clang/Lex/LiteralSupport.h

  • Add relevant code to parse the suffix in NumericLiteralParser::NumericLiteralParser function in /llvm-project/clang/lib/Lex/LiteralSupport.cpp

  • Handle Literal.isBFloat16 in ExprResult Sema::ActOnNumericConstant in /llvm-project/clang/lib/Sema/SemaExpr.cpp and set the type to Ty = Context.BFloat16Ty
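As a rough illustration of the suffix handling described above, the sketch below classifies a literal suffix in isolation. It is not Clang's NumericLiteralParser: the FloatSuffix enum and the classifyFloatSuffix function are hypothetical names, and the spellings assumed are those of P1467R9 (f16, bf16, f128) plus the existing q/Q extension.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the per-literal flags (isFloat16, isBFloat16, ...)
// that a literal parser would record. Not Clang's actual data structure.
enum class FloatSuffix { None, Float, LongDouble, Float16, BFloat16, Float128, Invalid };

// Classify the suffix text that follows the digits of a floating literal.
// Only all-lowercase or all-uppercase spellings are accepted.
inline FloatSuffix classifyFloatSuffix(const std::string &Suffix) {
  if (Suffix.empty())
    return FloatSuffix::None; // unsuffixed literal: type double
  if (Suffix == "f" || Suffix == "F")
    return FloatSuffix::Float;
  if (Suffix == "l" || Suffix == "L")
    return FloatSuffix::LongDouble;
  if (Suffix == "f16" || Suffix == "F16")
    return FloatSuffix::Float16;
  if (Suffix == "bf16" || Suffix == "BF16")
    return FloatSuffix::BFloat16;
  // q/Q is the pre-existing __float128 extension suffix.
  if (Suffix == "f128" || Suffix == "F128" || Suffix == "q" || Suffix == "Q")
    return FloatSuffix::Float128;
  return FloatSuffix::Invalid;
}
```

A real lexer would also have to reject these suffixes on integer and hexadecimal literals where they are not valid; that is left out here.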

2. Extended floating-point types

The below types are collectively known as floating point types in the context of this proposal.

  • Standard floating point types: float, double and long double.

  • Extended floating point types: These are floating point types in addition to the above standard floating-point types that conform to ISO/IEC/IEEE 60559 standard. Currently in the clang codebase we have _Float16, __bf16 and __float128 that conform to this standard. The initial patch will mark _Float16 and __bf16 as “extended floating-point” type and incrementally add support for __float128 in the subsequent patches.

The proposal applies to interaction amongst these types and the restrictions from the proposal don’t apply when “floating point types” interact with other types that may be compiler/language extensions, i.e “Any additional implementation-specific types representing floating-point values that are not defined by the implementation to be extended floating-point types are not considered to be floating-point types, and this document imposes no requirements on them or their interactions with floating-point types.”

Proposed code changes:

  • Add utility function in /llvm-project/clang/lib/AST/Type.cpp to determine if a given Type is a CXX23 Floating Point type which can be used to determine when to apply the rules of the proposal. Return true for float, double, long double, _Float16 and __bf16 and a later patch can include __float128.
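A minimal sketch of such a predicate, written against a simplified stand-in enum rather than clang::BuiltinType::Kind. The names FPKind and isCXX23FloatingPointType are hypothetical, and the set of types mirrors the initial patch described above (no __float128 yet).

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the builtin floating-point kinds.
enum class FPKind { Half, Float16, BFloat16, Float, Double, LongDouble, Float128, Ibm128 };

// True if the P1467R9 rules should apply to this type in the initial patch.
constexpr bool isCXX23FloatingPointType(FPKind K) {
  switch (K) {
  case FPKind::Float:
  case FPKind::Double:
  case FPKind::LongDouble: // standard floating-point types
  case FPKind::Float16:
  case FPKind::BFloat16:   // extended floating-point types
    return true;
  case FPKind::Half:       // storage-only extension
  case FPKind::Float128:   // to be added by a later patch
  case FPKind::Ibm128:     // not covered by the proposal
    return false;
  }
  return false;
}
```

Callers would apply the new conversion rules only when this returns true for both types involved and the language mode is C++23 or later.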

3. Conversion rank

Conversion rank is used to decide whether a given type can represent all of the values of another type without loss of precision; in other words, it orders types. If a type T1 can exactly represent all of the values of another type T2, then the conversion rank of T1 is at least equal to that of T2; if the reverse does not also hold, the rank of T1 is greater than that of T2.

This proposal introduces two new concepts in conversion ranks:

  • Unordered types: “The conversion ranks of floating-point types T1 and T2 are unordered if the set of values of T1 is neither a subset nor a superset of the set of values of T2”. For example, std::float16_t and std::bfloat16_t are unordered: float16 has better precision but a smaller range than bfloat16, so neither type can exactly represent every value of the other. This exists to prevent lossy conversions.

  • Subranks: Two types T1 and T2 may have the same rank, i.e. they can both exactly represent the same set of values (e.g. float and std::float32_t). In this case both types have equal conversion rank but are further ordered by their sub-ranks, where the extended floating-point type has a higher sub-rank than the standard type. Sub-ranks are used in overload resolution, to break ties between conversion sequences with equal conversion ranks, and in the usual arithmetic conversions, where a binary operation has operands of different floating-point types and one operand must be converted to a common type, which is also the type of the result.

Currently in the Clang codebase all floating-point types are ordered and represented by the enum FloatingRank, so comparing the conversion ranks of any two floating-point types has only three possible outcomes. With this proposal there are six: Unordered, Smaller, Larger, Equal, Equal but smaller sub-rank, and Equal but larger sub-rank.

Proposed code changes:

  • Define the below enum FloatingRankCompareResult to represent the new conversion rank results in /llvm-project/clang/include/clang/AST/ASTContext.h:

    enum FloatingRankCompareResult {
        FRCR_Unordered,
        FRCR_Smaller,
        FRCR_Larger,
        FRCR_Equal,
        FRCR_Equal_Smaller_Subrank,
        FRCR_Equal_Larger_Subrank,
    };
    
  • Routines used for comparing conversion ranks, such as getFloatingTypeOrder and getFloatingTypeSemanticOrder, will now return FloatingRankCompareResult instead of int. The existing int results map onto it for backward compatibility, i.e. -1 maps to FRCR_Smaller, 0 maps to FRCR_Equal and 1 maps to FRCR_Larger.

  • The comparison of floating-point conversion ranks can be efficiently represented using a statically defined map for quick look-up, as follows:

      using RankMap = std::unordered_map<clang::BuiltinType::Kind, FloatingRankCompareResult>;
    
      std::unordered_map<clang::BuiltinType::Kind, RankMap>
          CXX23FloatingPointConversionRankMap = {
              ...
              ...
              {clang::BuiltinType::BFloat16,
              {
                  {clang::BuiltinType::Float16,
                  FloatingRankCompareResult::FRCR_Unordered},
                  {clang::BuiltinType::BFloat16,
                  FloatingRankCompareResult::FRCR_Equal},
                  {clang::BuiltinType::Float,
                  FloatingRankCompareResult::FRCR_Smaller},
                  {clang::BuiltinType::Double,
                  FloatingRankCompareResult::FRCR_Smaller},
                  {clang::BuiltinType::LongDouble,
                  FloatingRankCompareResult::FRCR_Smaller},
                  {clang::BuiltinType::Float128,
                  FloatingRankCompareResult::FRCR_Smaller},
              }},
              ...
              ...
          };
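To show where the entries of such a table come from, the sketch below derives the comparison result from the format parameters instead of listing it by hand. It is an illustration under stated assumptions, not the proposed implementation: Format, contains and compareRank are hypothetical names, only IEEE-style binary formats with subnormals are modeled, and the special rule that ranks long double above double even when their value sets coincide is ignored.

```cpp
#include <cassert>

enum FloatingRankCompareResult {
  FRCR_Unordered, FRCR_Smaller, FRCR_Larger,
  FRCR_Equal, FRCR_Equal_Smaller_Subrank, FRCR_Equal_Larger_Subrank,
};

// Hypothetical description of a binary floating-point format.
struct Format {
  int Precision; // significand bits, including the implicit bit
  int EMin;      // minimum normal exponent
  int EMax;      // maximum exponent
  bool Extended; // extended (true) or standard (false) type
};

constexpr Format kFloat16{11, -14, 15, true};
constexpr Format kBFloat16{8, -126, 127, true};
constexpr Format kFloat{24, -126, 127, false};
constexpr Format kFloat32{24, -126, 127, true};
constexpr Format kDouble{53, -1022, 1023, false};

// True if every value of B is exactly representable in A: A needs at least
// as many significand bits, at least as large a maximum exponent, and a
// smallest subnormal that is no larger than B's.
constexpr bool contains(const Format &A, const Format &B) {
  return A.Precision >= B.Precision && A.EMax >= B.EMax &&
         (A.EMin - A.Precision) <= (B.EMin - B.Precision);
}

// Rank of LHS relative to RHS.
constexpr FloatingRankCompareResult compareRank(const Format &LHS, const Format &RHS) {
  const bool Superset = contains(LHS, RHS);
  const bool Subset = contains(RHS, LHS);
  if (Superset && Subset) {
    if (LHS.Extended == RHS.Extended)
      return FRCR_Equal;
    return LHS.Extended ? FRCR_Equal_Larger_Subrank : FRCR_Equal_Smaller_Subrank;
  }
  if (Superset)
    return FRCR_Larger;
  if (Subset)
    return FRCR_Smaller;
  return FRCR_Unordered;
}

static_assert(compareRank(kBFloat16, kFloat16) == FRCR_Unordered, "bf16 vs f16");
static_assert(compareRank(kBFloat16, kFloat) == FRCR_Smaller, "bf16 vs float");
```

A static table like the one above trades this computation for a constant-time lookup; a derivation of this kind could still be used to cross-check the table entries in a test.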
    

4. Implicit conversions

Any implicit conversion involving extended floating-point types must be lossless, i.e. the LHS must have a conversion rank higher than or equal to that of the RHS. To maintain backward compatibility, implicit conversions amongst standard floating-point types do not have this restriction. Hence std::bfloat16_t value = 1.0; will be ill-formed, but float f = 1.0; is legal.

Explicit conversions do not have any restrictions. If the source value can be exactly represented in the destination type, the result is that value; if it lies between two adjacent destination values, the result is an implementation-defined choice of either of them; otherwise the behavior is undefined.

Proposed code changes:

  • Add checks for conversion rank restriction in ExprResult Sema::ImpCastExprToType in /llvm-project/clang/lib/Sema/Sema.cpp

  • Disable standard conversion for the case where both types are floating point numbers but LHS conversion rank is not greater or equal compared to RHS in IsStandardConversion routine in /llvm-project/clang/lib/Sema/SemaOverload.cpp.

Rules for type promotions are unchanged for extended floating point types and will remain the same for backward compatibility, i.e float to double or double to long double implicit conversion is considered as a promotion which is used in function overload resolution to rank candidates for best match.
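The implicit-conversion restriction described in this section can be sketched as a small predicate over the rank comparison result. This is only an illustration: isImplicitFPConversionAllowed is a hypothetical name, and the caller is assumed to have already compared the destination type against the source type.

```cpp
#include <cassert>

enum FloatingRankCompareResult {
  FRCR_Unordered, FRCR_Smaller, FRCR_Larger,
  FRCR_Equal, FRCR_Equal_Smaller_Subrank, FRCR_Equal_Larger_Subrank,
};

// DstVsSrc: rank of the destination type relative to the source type.
// BothStandard: true if both types are float, double or long double.
constexpr bool isImplicitFPConversionAllowed(FloatingRankCompareResult DstVsSrc,
                                             bool BothStandard) {
  // Conversions among standard types keep their pre-C++23 behavior.
  if (BothStandard)
    return true;
  switch (DstVsSrc) {
  case FRCR_Larger:
  case FRCR_Equal:
  case FRCR_Equal_Smaller_Subrank:
  case FRCR_Equal_Larger_Subrank:
    return true;  // destination can represent every source value
  case FRCR_Smaller:
  case FRCR_Unordered:
    return false; // potentially lossy: requires an explicit cast
  }
  return false;
}
```

For example, initializing a std::bfloat16_t from the double literal 1.0 corresponds to (FRCR_Smaller, false) and is rejected, while float f = 1.0; corresponds to (FRCR_Smaller, true) and is accepted.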

5. Usual arithmetic conversions

These are described precisely here, but the crux is: if the conversion ranks of the operands are unordered, the expression is ill-formed. Otherwise, between floating-point types, the operand whose type has the smaller conversion rank is converted to the type with the greater conversion rank. If both types have equal conversion ranks, the tie is broken in favor of the type with the higher sub-rank, as described earlier.

Proposed code changes:

  • Handle unordered case in handleFloatConversion routine in /llvm-project/clang/lib/Sema/SemaExpr.cpp and expand greater rank case to also handle case where ranks are equal but sub-rank is higher, similarly expand smaller rank case to also handle the case where ranks are equal but sub-rank is smaller.

  • Modify bool Type::isArithmeticType() const to be bool Type::isArithmeticType(const ASTContext &Ctx) const in /llvm-project/clang/lib/AST/Type.cpp so that we can return true for Bfloat16 as being arithmetic type if CPlusPlus2b is set to true in language options.
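The case analysis for the usual arithmetic conversions between two floating-point operands can be sketched as below. The names CommonOperand and pickCommonFPType are hypothetical and this is not the actual handleFloatConversion code; it only shows how the six comparison results collapse into three outcomes.

```cpp
#include <cassert>

enum FloatingRankCompareResult {
  FRCR_Unordered, FRCR_Smaller, FRCR_Larger,
  FRCR_Equal, FRCR_Equal_Smaller_Subrank, FRCR_Equal_Larger_Subrank,
};

// Which operand's type becomes the common (and result) type.
enum class CommonOperand { LHS, RHS, IllFormed };

// LHSVsRHS: rank of the left operand's type relative to the right operand's.
constexpr CommonOperand pickCommonFPType(FloatingRankCompareResult LHSVsRHS) {
  switch (LHSVsRHS) {
  case FRCR_Unordered:
    return CommonOperand::IllFormed; // e.g. float16 + bfloat16
  case FRCR_Larger:
  case FRCR_Equal_Larger_Subrank:
  case FRCR_Equal:                   // same type: either side works
    return CommonOperand::LHS;
  case FRCR_Smaller:
  case FRCR_Equal_Smaller_Subrank:
    return CommonOperand::RHS;
  }
  return CommonOperand::IllFormed;
}
```

With float on the left and std::float32_t on the right, the comparison yields FRCR_Equal_Smaller_Subrank, so the result type is std::float32_t.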

6. Narrowing conversion

A narrowing conversion is a conversion from type T1 to a type T2 whose conversion rank is smaller than that of T1. Although the definition has been changed to be in terms of conversion ranks (instead of long double to float/double, or double to float), this has no practical implication when extended floating-point types are mixed with standard floating-point types, because the implicit conversion rules described above apply in such scenarios.

7. Overload resolution

Prefer resolutions that are value-preserving and, where there are multiple candidates, prefer conversions between types of equal rank. A conversion sequence between floating-point types of equal conversion rank is preferred over one that involves a non-floating-point type or a floating-point type of unequal rank. If two candidate conversion sequences both have equal ranks, the tie is broken using sub-ranks, i.e. the higher sub-rank is preferred. If there are candidates that allow lossless conversions but none of them is an exact match or of equal rank, the resolution is ambiguous. For example:

void f(std::float32_t);
void f(std::float64_t);

f(std::float16_t(1.0)); // ambiguous

Proposed code changes:

  • The tie-breaker for equal conversion ranks and, more generally, the ranking of conversion sequence candidates as per the rules described here can be implemented in the static ImplicitConversionSequence::CompareKind CompareStandardConversionSequences routine in /llvm-project/clang/lib/Sema/SemaOverload.cpp, where we insert this rule right after [over.ics.rank]p4b2 as described here. ImplicitConversionSequence is used to represent a conversion sequence; we inspect its Second conversion to see whether it is a floating-point conversion and, if so, compare the conversion ranks of the two types.
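A deliberately simplified model of this ranking, for a single floating-point argument, is sketched below. All names (FPConv, Candidate, resolveFPOverload) are hypothetical, and the model ignores promotions and every non-floating-point rule of real overload resolution; it only captures "exact match beats equal rank, equal rank beats other lossless conversions, sub-rank breaks ties among equal ranks".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// How a candidate's parameter type relates to the argument type.
enum class FPConv { Identity, EqualRank, Widening, NotViable };

struct Candidate {
  FPConv Conv;
  int Subrank; // only meaningful for EqualRank; higher is preferred
};

constexpr int kAmbiguous = -1;
constexpr int kNoViable = -2;

inline int tier(FPConv C) {
  switch (C) {
  case FPConv::Identity:  return 2;
  case FPConv::EqualRank: return 1;
  default:                return 0;
  }
}

// >0 if A is better than B, <0 if worse, 0 if indistinguishable.
inline int compareCandidates(const Candidate &A, const Candidate &B) {
  if (tier(A.Conv) != tier(B.Conv))
    return tier(A.Conv) > tier(B.Conv) ? 1 : -1;
  if (A.Conv == FPConv::EqualRank && A.Subrank != B.Subrank)
    return A.Subrank > B.Subrank ? 1 : -1;
  return 0;
}

// Returns the index of the best candidate, kAmbiguous, or kNoViable.
inline int resolveFPOverload(const std::vector<Candidate> &Cands) {
  int Best = kNoViable;
  bool Tie = false;
  for (std::size_t I = 0; I < Cands.size(); ++I) {
    if (Cands[I].Conv == FPConv::NotViable)
      continue;
    if (Best == kNoViable) {
      Best = static_cast<int>(I);
      continue;
    }
    const int Cmp = compareCandidates(Cands[I], Cands[Best]);
    if (Cmp > 0) {
      Best = static_cast<int>(I);
      Tie = false;
    } else if (Cmp == 0) {
      Tie = true;
    }
  }
  return (Best != kNoViable && Tie) ? kAmbiguous : Best;
}
```

In this model the f(std::float16_t(1.0)) call above sees two Widening candidates, which are indistinguishable, so the result is kAmbiguous.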

8. Predefined macros

These macros indicate that a specific extended floating-point type is implemented by the compiler as per this standard. They are used by C++ standard libraries to define the typedefs for extended floating-point types such as std::float16_t, std::bfloat16_t, etc.

Proposed code changes:

  • In static void InitializeStandardPredefinedMacros routine in /llvm-project/clang/lib/Frontend/InitPreprocessor.cpp expand on the following:
    if (LangOpts.CPlusPlus2b) {
        ....
        Builder.defineMacro("__STDCPP_FLOAT16_T__", "1");
        Builder.defineMacro("__STDCPP_BFLOAT16_T__", "1");
    }
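On the consuming side, user code or a library can test these macros before naming the corresponding type. The sketch below is a hedged example: the alias half_or_fallback, the constant kHasStdFloat16 and the helper macro SKETCH_HAS_STD_FLOAT16 are made up for illustration, and falling back to float is just one possible policy.

```cpp
#include <cassert>

// Use std::float16_t only if the compiler advertises it and the library
// ships <stdfloat>; otherwise fall back to float.
#if defined(__STDCPP_FLOAT16_T__) && defined(__has_include)
#  if __has_include(<stdfloat>)
#    include <stdfloat>
#    define SKETCH_HAS_STD_FLOAT16 1
#  endif
#endif

#ifdef SKETCH_HAS_STD_FLOAT16
using half_or_fallback = std::float16_t;
constexpr bool kHasStdFloat16 = true;
#else
using half_or_fallback = float; // hypothetical fallback policy
constexpr bool kHasStdFloat16 = false;
#endif
```

The same pattern applies to __STDCPP_BFLOAT16_T__ and std::bfloat16_t.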
    

Open questions:

In my local change I was able to treat the storage-only __bf16 type as an arithmetic type when the CPlusPlus2b language option is set, and didn’t see a need to create a dedicated arithmetic type as was done for _Float16 (even though half already existed at the time). However, I am not sure whether this is the right approach or whether a dedicated arithmetic type should be created for bfloat16 as well.

It appears that most of the proposed feature list is consistent with the standard, so I have no problems with those. It isn’t clear from your proposal what the implementation looks like; the ‘proposed code changes’ sections aren’t particularly helpful without context that wasn’t provided.

The biggest question for me, and hardest one to answer: how is this going to be represented in LLVM-IR? IR doesn’t support these types with the correct semantics, so we might find we need ‘runtime library’ support, which these would be broken without.

The second biggest question for me: Have you defined the ABI/mangling/etc at microsoft for these types? Have you worked with ItaniumABI group to get the ABI/mangling/etc for these settled?

Thank you for working on this!

The outlined approach looks generally good to me, but as detailed below, will need to be modified to avoid reuse of __bf16 and __float128.

Coordination with the Itanium ABI maintainers will be needed to identify the mangling to be used for std::bfloat16_t. The documentation does not currently specify a mangling, but there is a related pull request with important discussion. That discussion claims that the existing __bf16 and __float128 types are not suitable as the types for std::bfloat16_t and std::float128_t. There are several reasons that I won’t mention here; please go read that discussion, but one important one is that __float128 already has a mangling that differs from what should be used for std::float128_t (g as opposed to DF128_). Likewise, __bf16 is already mangled as a vendor extended type (u6__bf16). It sounds like gcc has recently changed its __bf16 implementation to be a full arithmetic type depending on target. That doesn’t sound like a good approach to me and I would be hesitant to do likewise.

Per [basic.extended.fp]p7, the recommended guidance is that the std::floatN_t types correspond to the _FloatN types from C23. Clang does not yet have an implementation of all of those types, but I fully expect an implementation to materialize. I have a strong preference that the C++ types be mapped directly to the C23 types by the same name. It sounds like that is what libstdcxx will be doing.

namespace std {
  using float16_t = _Float16;
  using float32_t = _Float32;
  using float64_t = _Float64;
  using float128_t = _Float128;
}

Please add me as a reviewer when you start posting changes and I’ll do my best to provide timely feedback.

LLVM has a bf16 type, but the support varies by target and features. AArch64 has an f16 type, but the support also depends on available features. Presumably, all the new types for C++ and C23 need support from compiler-rt.

I think float32_t and float64_t could just be forwarded to the float and double versions. Their representations should be equivalent on most platforms. That would also save on binary size.

For RISC-V, float(F), double(D), and f128(Q) are extensions. If you find a core without FDQ, you will need support from compiler-rt.

Depending on what you mean by “forwarded”, yes. If they have the same semantics, then the former can be implemented in terms of the latter, but distinct types are still required.

@tahonermann I mean that libc++ could just bit_cast to them, and for any magic the compiler needs to do, it can just generate a call to the float/double versions instead of calling a completely different function.

@tschuett The RISC-V target has to support float and double, right? That means that these compiler-rt functions must already exist, so we can use them for float32_t and float64_t.

Thank you, @tahonermann. I’ve followed GCC’s approach by using the __bf16 storage type for std::bfloat16_t and allowing arithmetic semantics as described in the proposal, but I was not sure if that was a good idea. Referring to the mangling discussion, it appears we can implement a distinct bfloat16 type, similar to _Float16 for __fp16/half, in this patch. This would enable defining the Itanium mangling as DF16b. If this approach is acceptable, I’ll submit a revised patch. Please let me know your thoughts.

@erichkeane, it seems that bf16 can be largely represented in LLVM-IR. Would you mind providing examples to explain the possible requirement for ‘runtime library’ support owing to improper semantics in IR? Although we’re not focused on Microsoft or Itanium mangling, we’re eager to collaborate with the community and contribute where possible. If addressing mangling is essential, please help us understand the reasoning, and we’d be open to discussing it further.

__bf16/__fp16 are defined as storage-only formats, meaning you cannot do math on them directly. I presume that for any type we define, we’d like to avoid the ‘upcast and truncate’ semantics that would come with them. I have similar thoughts on the 128-bit float support. The way this then needs to be implemented is in the support library. We had a similar issue with division for _BitInt wider than 128 bits.
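For readers unfamiliar with the ‘upcast and truncate’ semantics mentioned above, here is a minimal software sketch for bfloat16: the 16-bit value is widened to float, the operation is done in float, and the result is rounded back to 16 bits. The function names are hypothetical, float is assumed to be IEEE binary32, and this is not how Clang or compiler-rt implement it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit IEEE float");

// bfloat16 is the upper 16 bits of a binary32 value, so widening is exact.
inline float bfloat16ToFloat(std::uint16_t B) {
  const std::uint32_t Bits = static_cast<std::uint32_t>(B) << 16;
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Narrow with round-to-nearest, ties-to-even; NaNs are kept quiet.
inline std::uint16_t floatToBFloat16(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  if ((Bits & 0x7FFFFFFFu) > 0x7F800000u) // NaN: keep sign, force quiet bit
    return static_cast<std::uint16_t>((Bits >> 16) | 0x0040u);
  const std::uint32_t Lsb = (Bits >> 16) & 1u;
  Bits += 0x7FFFu + Lsb; // rounding bias; carries into the kept bits
  return static_cast<std::uint16_t>(Bits >> 16);
}

// 'Upcast and truncate': compute in float, then round back to bfloat16.
inline std::uint16_t addBFloat16(std::uint16_t A, std::uint16_t B) {
  return floatToBFloat16(bfloat16ToFloat(A) + bfloat16ToFloat(B));
}
```

With only 8 bits of precision, adding 2^-8 (0x3B80) to 1.0 (0x3F80) rounds back to 1.0 in this scheme.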

Thank you for the explanation, it makes sense. Just to confirm, if the target natively supports bfloat16 operations, including LLVM-IR operations such as fptrunc and fpext, then we shouldn’t face any issues, right? If so, would it be reasonable to address runtime support in a separate patch?

I understand the challenge of converting between Float16 and BFloat16, as neither LLVM-IR’s fptrunc nor fpext applies in this situation. In my current implementation, I raise an unsupported error if the user attempts to explicitly convert between these unordered types in C++. I plan to address this issue in a subsequent patch, which will likely require its own discussion. Please let me know your thoughts on this approach.

I appreciate your valuable input and look forward to working collaboratively with the community.

It would be my preference to not enable the new types on any platform where we cannot fully support them. So if we can natively support a type, I have no problem enabling it, but we’d have to disable it in the frontend on any platform where we couldn’t fully support it.

P1467R9 seems to say implementations may support the extended floating-point types. It seems to be completely up to Clang to decide which types and when it supports them. It may even change over time. It just needs to be documented.

I don’t see how libc++ can do that, since the types named by std::floatN_t must be extended floating-point types known to the compiler; libc++ can’t implement them in a class that calls std::bit_cast. If you mean that the compiler can use the same intrinsic functions for distinct types that have the same semantics, yes (though different names for the intrinsics might be desirable).

(tangent: I’m really annoyed that WG14 and WG21 didn’t agree on names for these extended floating-point types. I find it ridiculous that C headers that use these types and that are intended to also be used by C++ code will have to do something like the following.

#if defined(__cplusplus)
typedef std::float32_t my_float32_t;
#else
typedef _Float32 my_float32_t;
#endif
my_float32_t f();

For this reason, I strongly believe Clang should expose the C names of the types when compiling for C++ and that libc++ should do something like:

using float32_t = _Float32;

rather than:

using float32_t = decltype(1.0f32);

)

That sounds good to me assuming that the proposed DF16b mangling gets finalized (that pull request hasn’t been merged yet).

That’s right:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/std/stdfloat;h=c39dbb64904d6d653e86aa4247dd2776a35d9d9c;hb=HEAD

bfloat16_t is defined in terms of decltype(0.0bf16):
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/include/bits/c%2B%2Bconfig;h=13892787e095c9810a616399658d0d98de6338e9;hb=HEAD#l823

It’s done that way because there is no C23 type for that type, and because we don’t want to assume that targets which support bfloat16_t as an arithmetic type will always define the __bf16 name for that type (e.g. they might use __bf16 for a storage-only type that is not the same as bfloat16_t, or they might support the arithmetic floating-point type but not define the __bf16 name at all).

libc++ has to provide overloads for e.g. sin. That could be done this way:

inline _LIBCPP_HIDE_FROM_ABI float32_t sin(float32_t __x) noexcept {
  return std::bit_cast<float32_t>(__builtin_sinf(std::bit_cast<float>(__x)));
}

That avoids having to add another ~140 builtins.

IMO libc++ should do using float16_t = decltype(0.f16) or something similar. That is more portable and can’t be wrong. For the implementation to be compliant decltype(0.f16) has to name the same type as std::float16_t. If that happens to be _Float16 that’s just fine by me, but that’s for the compiler to decide, not the standard library.

Ah, yes, that makes sense for library functions (when __STDCPP_FLOAT32_T__ and __STDC_IEC_60559_BFP__ are appropriately defined).

Dear LLVM community,

I have recently submitted a patch implementing the C++23 feature P1467R9 in Clang and would appreciate your feedback. As this is my first Clang contribution and C++ language proposal implementation, I humbly request your guidance in ensuring that the patch meets the community’s standards.

I am eager to collaborate and iterate on any changes or improvements you may suggest. Please find the patch in the Phabricator review link below:

https://reviews.llvm.org/D149573

Thank you for your time and consideration.

@tahonermann and @erichkeane,

In light of the ongoing discussion regarding Itanium mangling, I have implemented it in my recent patch for P1467R9 and guarded it with an experimental flag. This approach ensures users are aware that the mangling may change in the future. However, I am open to removing it entirely if you believe it’s best. Please note that doing so would leave us without a codegen test.

I would appreciate your thoughts on this matter. Thank you very much for your time and expertise.

Sorry for the late call (hadn’t noticed the discussion), but I’d like to come back to the question of adding another type to support arithmetic bfloat16.

That discussion claims that the existing __bf16 and __float128 types are not suitable as the types for std::bfloat16_t and std::float128_t. There are several reasons that I won’t mention here; please go read that discussion,

Reading that discussion back again, the conclusion I took home is that for __bf16 the only reason it wouldn’t be suitable is that it’s storage-only (link to relevant comment). What I concluded from the discussion on D136919 [X86][RFC] Change mangle name of __bf16 from u6__bf16 to DF16b is that representatives from the different architectures were comfortable with letting go of storage-only for __bf16, so that would then not be an issue.

At Arm we opted for making __bf16 a storage-only type because we didn’t have a direct need for an arithmetic type. But our current thinking is that we would rather change the semantics to make it an arithmetic type than have to support two types for historic reasons. It would also be nice not to deviate from GCC.

Do you mean Arm GCC is using __bf16 as an arithmetic type now? How about the mangling? From the last comments in D136919, it looks to me like you want to keep the old name mangling.
For X86 GCC, __bf16 is an arithmetic type with the new mangled name DF16b. We still expect Clang to behave the same as GCC for X86.
So if we can reach an agreement on the new mangling, I don’t see much value in introducing a new type.