I have implemented support for the extended floating-point types from the C++23 standard in Clang and would like to discuss the changes for upstreaming. As this is my first time contributing to Clang and working with language proposals, please bear with me if there are any mistakes. I will promptly address your feedback and iterate on the changes.
Motivation
Clang currently supports three standard floating-point types, i.e. `float`, `double` and `long double`. Every other floating-point type that exists, such as `half`, `_Float16`, `__bf16`, `__float128` and `__ibm128`, is a compiler/language extension and/or a storage-only type such as `__bf16`. The real motivation for this change comes from extending the `__bf16` type in Clang to also be an arithmetic type that conforms to the C++23 language standard, following the accepted proposal P1467R9 - Extended floating-point types and standard names.
Core language changes
This change is specific to C++23, but backward compatibility will be maintained. The initial patch will implement the core language changes of the proposal, and a subsequent patch will implement the C++ standard library changes, which mostly involve defining overloads for the new extended floating-point types and creating `std` typedefs for the extended floating-point implementations in the compiler (i.e. `std::float16_t`, `std::bfloat16_t`).
Below is a distilled version of the proposal with the proposed Clang code changes. All of the core language changes only apply to interactions between floating-point types as defined by this proposal (i.e. `__ibm128` doesn't count, and neither does `__float128` initially) AND are guarded by the `CPlusPlus2b` language option in the compiler, i.e. they strictly apply to C++23 and beyond.
1. Literal suffixes
Add suffixes for extended floating-point literals. By default, floating-point constants are of type `double`, and this proposal does not allow implicit conversions involving extended floating-point types that would be lossy. Hence this change is needed to allow something like: `std::bfloat16_t value = 1.0bf16;`
- The suffix for `__bf16` will be added as `bf16` and `BF16`.
- The suffix for `_Float16` already exists as `f16` and `F16`.
- The suffix for `__float128` needs to be added as `f128` and `F128` (the spelling P1467R9 specifies); the current support uses the `q`/`Q` suffix, which will be retained for backward compatibility.
Proposed code changes:
- Add `bool isBFloat16 : 1;` to the `NumericLiteralParser` class in `/llvm-project/clang/include/clang/Lex/LiteralSupport.h`.
- Add the relevant code to parse the suffix in the `NumericLiteralParser::NumericLiteralParser` function in `/llvm-project/clang/lib/Lex/LiteralSupport.cpp`.
- Handle `Literal.isBFloat16` in `ExprResult Sema::ActOnNumericConstant` in `/llvm-project/clang/lib/Sema/SemaExpr.cpp` and set the type with `Ty = Context.BFloat16Ty`.
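The suffix handling can be illustrated with a small stand-alone model. This is only a sketch: `FloatSuffix`, `equals` and `classifyFloatSuffix` are hypothetical names that do not exist in Clang, and the spellings follow the P1467R9 grammar (`f16`, `bf16`, `f128` and their all-uppercase forms) plus the existing `q`/`Q` extension. The `f32` and `f64` suffixes are left out because this post does not discuss them.

```cpp
// Hypothetical stand-alone model of floating-literal suffix classification.
// Clang's real NumericLiteralParser walks the token buffer and sets bit-field
// flags (such as the proposed isBFloat16); none of that is reproduced here.
enum class FloatSuffix {
  None, Float, LongDouble, Float16, BFloat16, Float128, Invalid
};

// Exact, case-sensitive comparison of two NUL-terminated strings.
constexpr bool equals(const char *A, const char *B) {
  while (*A != '\0' && *A == *B) {
    ++A;
    ++B;
  }
  return *A == *B;
}

// Maps a suffix spelling to the kind of literal it denotes. Mixed-case
// spellings such as "Bf16" are rejected, matching the P1467R9 grammar.
constexpr FloatSuffix classifyFloatSuffix(const char *S) {
  if (equals(S, ""))
    return FloatSuffix::None; // unsuffixed literal: double
  if (equals(S, "f") || equals(S, "F"))
    return FloatSuffix::Float;
  if (equals(S, "l") || equals(S, "L"))
    return FloatSuffix::LongDouble;
  if (equals(S, "f16") || equals(S, "F16"))
    return FloatSuffix::Float16;
  if (equals(S, "bf16") || equals(S, "BF16"))
    return FloatSuffix::BFloat16;
  if (equals(S, "f128") || equals(S, "F128") ||
      equals(S, "q") || equals(S, "Q")) // q/Q kept for backward compatibility
    return FloatSuffix::Float128;
  return FloatSuffix::Invalid;
}
```

In the real parser the result would be recorded in a flag and later consumed by `Sema::ActOnNumericConstant` to pick the literal's type.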
2. Extended floating-point types
The types below are collectively known as floating-point types in the context of this proposal.
- Standard floating-point types: `float`, `double` and `long double`.
- Extended floating-point types: These are floating-point types, in addition to the above standard floating-point types, that conform to the `ISO/IEC/IEEE 60559` standard. Currently in the Clang codebase we have `_Float16`, `__bf16` and `__float128` that conform to this standard. The initial patch will mark `_Float16` and `__bf16` as "extended floating-point" types and incrementally add support for `__float128` in subsequent patches.
The proposal applies to interactions amongst these types, and the restrictions from the proposal don't apply when "floating-point types" interact with other types that may be compiler/language extensions, i.e. "Any additional implementation-specific types representing floating-point values that are not defined by the implementation to be extended floating-point types are not considered to be floating-point types, and this document imposes no requirements on them or their interactions with floating-point types."
Proposed code changes:
- Add a utility function in `/llvm-project/clang/lib/AST/Type.cpp` to determine whether a given Type is a CXX23 floating-point type, which can be used to decide when to apply the rules of the proposal. Return true for `float`, `double`, `long double`, `_Float16` and `__bf16`; a later patch can include `__float128`.
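A minimal sketch of such a predicate follows. `FPKind` is a toy stand-in for `clang::BuiltinType::Kind`, and `isCXX23FloatingPointKind` is a hypothetical name; treating every type as outside the rules when `CPlusPlus2b` is off is an assumption of this sketch, not something the patch necessarily does in this function.

```cpp
// Toy stand-in for clang::BuiltinType::Kind, limited to the kinds discussed
// in this post. Not the real Clang enumeration.
enum class FPKind {
  Float, Double, LongDouble, Float16, BFloat16, Float128, Ibm128
};

// Hypothetical predicate: does this type take part in the C++23 rules in the
// initial patch? Float128 is deferred to a later patch and Ibm128 is never a
// floating-point type in the sense of the proposal.
constexpr bool isCXX23FloatingPointKind(FPKind K, bool CPlusPlus2b) {
  if (!CPlusPlus2b)
    return false; // assumption: the rules are fully gated on the language mode
  switch (K) {
  case FPKind::Float:
  case FPKind::Double:
  case FPKind::LongDouble:
  case FPKind::Float16:
  case FPKind::BFloat16:
    return true;
  case FPKind::Float128:
  case FPKind::Ibm128:
    return false;
  }
  return false;
}
```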
3. Conversion rank
This is used to decide whether a given type can represent all of the values of another type without any loss of precision; in other words, it is used to order types. If a type `T1` can exactly represent all of the values of another type `T2`, then its conversion rank is at least equal to that of `T2`, and if the reverse is not true, its conversion rank is greater than that of `T2`.
This proposal introduces two new concepts in conversion ranks:
- Unordered types: "The conversion ranks of floating-point types `T1` and `T2` are unordered if the set of values of `T1` is neither a subset nor a superset of the set of values of `T2`", e.g. `std::float16_t` and `std::bfloat16_t` are unordered: `float16` has better precision but a smaller range than `bf16`, so neither type can exactly represent every value of the other. This is done to prevent lossy conversions.
- Subranks: Two types `T1` and `T2` may have the same rank, i.e. they can both exactly represent the same set of values, e.g. `float` and `std::float32_t`. In this case both types have equal conversion rank, but they are further ordered by their subranks, where the extended floating-point type has a higher subrank than the standard type. Subranks are used in overload resolution to break ties between conversion sequences with equal conversion ranks, and in the usual arithmetic conversions, where a binary operation has operands of different floating-point types and one operand must be converted to a common type, which is also the type of the result.
Currently in the Clang codebase all floating-point types are ordered and represented by the enum `FloatingRank`; hence when comparing the conversion ranks of any two floating-point types there are only 3 possibilities. With this proposal there are 6 possibilities: Unordered, Smaller, Larger, Equal, Equal but smaller subrank, and Equal but larger subrank.
Proposed code changes:
- Define the `enum FloatingRankCompareResult` below to represent the new conversion rank results in `/llvm-project/clang/include/clang/AST/ASTContext.h`:
    enum FloatingRankCompareResult {
      FRCR_Unordered,
      FRCR_Smaller,
      FRCR_Larger,
      FRCR_Equal,
      FRCR_Equal_Smaller_Subrank,
      FRCR_Equal_Larger_Subrank,
    };
- Routines used for comparing conversion ranks, such as `getFloatingTypeOrder` and `getFloatingTypeSemanticOrder`, will now return `FloatingRankCompareResult` instead of an `int`. The new type can also be used for backward compatibility, i.e. `-1` will map to `FRCR_Smaller`, `0` will map to `FRCR_Equal` and `1` will map to `FRCR_Larger`.
- The comparison of floating-point conversion ranks can be efficiently represented using a statically defined map for quick look-up, as follows:
    using RankMap = std::unordered_map<clang::BuiltinType::Kind, FloatingRankCompareResult>;
    std::unordered_map<clang::BuiltinType::Kind, RankMap> CXX23FloatingPointConversionRankMap = {
      ... ...
      {clang::BuiltinType::BFloat16, {
        {clang::BuiltinType::Float16, FloatingRankCompareResult::FRCR_Unordered},
        {clang::BuiltinType::BFloat16, FloatingRankCompareResult::FRCR_Equal},
        {clang::BuiltinType::Float, FloatingRankCompareResult::FRCR_Smaller},
        {clang::BuiltinType::Double, FloatingRankCompareResult::FRCR_Smaller},
        {clang::BuiltinType::LongDouble, FloatingRankCompareResult::FRCR_Smaller},
        {clang::BuiltinType::Float128, FloatingRankCompareResult::FRCR_Smaller},
      }},
      ... ...
    };
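To show where the six outcomes come from, here is a sketch that derives them from the value sets rather than from a hand-written table. It is an illustration only: `FloatFormat`, `valuesSubsetOf` and `compareRanks` are hypothetical names, the formats are described by their binary precision and exponent range, and `long double` is left out because its format is target-dependent and the standard ranks it above `double` regardless of its value set.

```cpp
enum FloatingRankCompareResult {
  FRCR_Unordered,
  FRCR_Smaller,
  FRCR_Larger,
  FRCR_Equal,
  FRCR_Equal_Smaller_Subrank,
  FRCR_Equal_Larger_Subrank,
};

// Hypothetical description of a binary floating-point format.
struct FloatFormat {
  int Precision;   // significand bits, including the implicit leading bit
  int MaxExp;      // largest unbiased exponent
  int MinExp;      // smallest unbiased exponent of a normal number
  bool IsExtended; // extended (true) or standard (false) type
};

// Exponent of the least significant bit of the smallest subnormal.
constexpr int minSubnormalExp(FloatFormat F) {
  return F.MinExp - (F.Precision - 1);
}

// True if every finite value of A is exactly representable in B.
constexpr bool valuesSubsetOf(FloatFormat A, FloatFormat B) {
  return A.Precision <= B.Precision && A.MaxExp <= B.MaxExp &&
         minSubnormalExp(A) >= minSubnormalExp(B);
}

// Rank of LHS relative to RHS. For equal value sets, the extended type gets
// the larger subrank; two distinct extended types of equal rank would have an
// implementation-defined order, which this sketch reports as FRCR_Equal.
constexpr FloatingRankCompareResult compareRanks(FloatFormat LHS,
                                                 FloatFormat RHS) {
  const bool LInR = valuesSubsetOf(LHS, RHS);
  const bool RInL = valuesSubsetOf(RHS, LHS);
  if (!LInR && !RInL)
    return FRCR_Unordered;
  if (LInR && !RInL)
    return FRCR_Smaller;
  if (!LInR && RInL)
    return FRCR_Larger;
  if (LHS.IsExtended == RHS.IsExtended)
    return FRCR_Equal;
  return LHS.IsExtended ? FRCR_Equal_Larger_Subrank
                        : FRCR_Equal_Smaller_Subrank;
}

constexpr FloatFormat Binary16{11, 15, -14, true};        // _Float16
constexpr FloatFormat BFloat16{8, 127, -126, true};       // __bf16
constexpr FloatFormat Binary32Std{24, 127, -126, false};  // float
constexpr FloatFormat Binary32Ext{24, 127, -126, true};   // std::float32_t
constexpr FloatFormat Binary64Std{53, 1023, -1022, false}; // double
```

A static table like the one proposed above is the more direct choice inside the compiler; a derivation like this one could be used to cross-check the table entries in a unit test.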
4. Implicit conversions
Any implicit conversion involving extended floating-point types must be lossless, i.e. the LHS must have a conversion rank higher than or equal to that of the RHS. However, to maintain backward compatibility, implicit conversions amongst standard floating-point types do not have this restriction. Hence `std::bfloat16_t value = 1.0;` will be ill-formed but `float f = 1.0;` is legal.
Explicit conversions do not have any restrictions. If the source value can be represented exactly in the destination type, the result is that value; if it lies between two adjacent values of the destination type, the result is implementation-defined, i.e. the implementation can pick either of the two values; otherwise the behavior is undefined.
Proposed code changes:
- Add checks for the conversion rank restriction in `ExprResult Sema::ImpCastExprToType` in `/llvm-project/clang/lib/Sema/Sema.cpp`.
- Disable the standard conversion in the `IsStandardConversion` routine in `/llvm-project/clang/lib/Sema/SemaOverload.cpp` for the case where both types are floating-point types but the conversion rank of the `LHS` is not greater than or equal to that of the `RHS`.
Rules for type promotions are unchanged for extended floating-point types and remain the same for backward compatibility, i.e. the implicit `float` to `double` conversion remains the only floating-point promotion (`double` to `long double` is a conversion, not a promotion), and overload resolution uses promotions to rank candidates for the best match.
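The implicit conversion rule above reduces to a small decision function. The sketch below is not the proposed `Sema` code: `isImplicitFPConversionAllowed` is a hypothetical helper, and it assumes the caller has already compared the rank of the destination type against the source type.

```cpp
enum FloatingRankCompareResult {
  FRCR_Unordered,
  FRCR_Smaller,
  FRCR_Larger,
  FRCR_Equal,
  FRCR_Equal_Smaller_Subrank,
  FRCR_Equal_Larger_Subrank,
};

// DestVsSource is the rank of the destination (LHS) relative to the source
// (RHS). BothStandard is true when both types are float, double or
// long double, in which case the pre-C++23 behavior is preserved.
constexpr bool isImplicitFPConversionAllowed(
    FloatingRankCompareResult DestVsSource, bool BothStandard) {
  if (BothStandard)
    return true; // e.g. float f = 1.0; stays well-formed
  switch (DestVsSource) {
  case FRCR_Unordered: // e.g. std::float16_t <- std::bfloat16_t
  case FRCR_Smaller:   // e.g. std::bfloat16_t <- double
    return false;
  case FRCR_Larger:
  case FRCR_Equal:
  case FRCR_Equal_Smaller_Subrank: // subranks do not matter for legality
  case FRCR_Equal_Larger_Subrank:
    return true;
  }
  return false;
}
```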
5. Usual arithmetic conversions
These are described precisely here, but the crux is that if the conversion ranks of the operands are unordered, the expression is ill-formed. Otherwise, between floating-point types, the operand with the smaller conversion rank is converted to the type of the operand with the greater conversion rank. If both types have equal conversion ranks, the tie is broken in favor of the type with the higher conversion subrank, as described earlier.
Proposed code changes:
- Handle the unordered case in the `handleFloatConversion` routine in `/llvm-project/clang/lib/Sema/SemaExpr.cpp`, expand the greater-rank case to also handle the case where the ranks are equal but the subrank is higher, and similarly expand the smaller-rank case to also handle the case where the ranks are equal but the subrank is smaller.
- Modify `bool Type::isArithmeticType() const` to be `bool Type::isArithmeticType(const ASTContext &Ctx) const` in `/llvm-project/clang/lib/AST/Type.cpp` so that we can return true for `BFloat16` as an arithmetic type if `CPlusPlus2b` is set to true in the language options.
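The three-way outcome for a binary operation can be sketched as follows. `ArithConvResult` and `pickCommonFPType` are hypothetical names for illustration; the actual `handleFloatConversion` routine also performs the casts and handles complex types, which is omitted here.

```cpp
enum FloatingRankCompareResult {
  FRCR_Unordered,
  FRCR_Smaller,
  FRCR_Larger,
  FRCR_Equal,
  FRCR_Equal_Smaller_Subrank,
  FRCR_Equal_Larger_Subrank,
};

// Which operand's type becomes the common type of a binary operation.
enum class ArithConvResult { UseLHSType, UseRHSType, IllFormed };

// LHSvsRHS is the conversion rank of the left operand's type relative to the
// right operand's type.
constexpr ArithConvResult pickCommonFPType(FloatingRankCompareResult LHSvsRHS) {
  switch (LHSvsRHS) {
  case FRCR_Unordered:
    return ArithConvResult::IllFormed; // e.g. std::float16_t + std::bfloat16_t
  case FRCR_Larger:
  case FRCR_Equal_Larger_Subrank: // tie broken by the higher subrank
  case FRCR_Equal:                // same type, either side works
    return ArithConvResult::UseLHSType;
  case FRCR_Smaller:
  case FRCR_Equal_Smaller_Subrank:
    return ArithConvResult::UseRHSType;
  }
  return ArithConvResult::IllFormed;
}
```

For example, `float + std::float32_t` compares as equal rank with a smaller subrank on the left, so the result type is `std::float32_t`.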
6. Narrowing conversion
This means converting a type `T1` to a type `T2` whose conversion rank is smaller than that of `T1`. Although the definition has been changed to be in terms of conversion ranks (instead of `long double` to `float`/`double`, or `double` to `float`), it has no practical implication when extended floating-point types are mixed with standard floating-point types, since the implicit conversion rules described above apply in such scenarios.
7. Overload resolution
Prefer resolutions that are value-preserving and, where there are multiple candidates, prefer conversions between types of equal rank. If two candidate conversion sequences both have equal ranks, the tie is broken using subranks, i.e. the higher subrank is preferred. A conversion sequence with equal conversion rank is preferred over a conversion sequence that involves a non-floating-point type or a floating-point type of unequal rank. If there are candidates that allow lossless conversions but none of them is an exact match or of equal rank, the resolution is ambiguous, for example:
void f(std::float32_t);
void f(std::float64_t);
f(std::float16_t(1.0)); // ambiguous
Proposed code changes:
- This tie breaker for equal conversion ranks, and more generally the ranking of conversion sequence candidates as per the rules described here, can be implemented in the `static ImplicitConversionSequence::CompareKind CompareStandardConversionSequences` routine in `/llvm-project/clang/lib/Sema/SemaOverload.cpp`, where we insert this rule right after `[over.ics.rank]p4b2` as described here. `ImplicitConversionSequence` is used to represent a conversion sequence; we look at the `Second` conversion within it to see whether it involves a floating-point type, and if so we compare the conversion ranks of the two types.
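The tie-breaking rule can be sketched as a predicate over two candidate target types. This follows the rule as summarized above and is not the proposed `CompareStandardConversionSequences` code: `ConvTarget` and `isBetterFPConversion` are hypothetical, and subranks are modelled as plain integers where a larger number means a higher subrank.

```cpp
// Hypothetical summary of a candidate's target type relative to the
// floating-point source type FP1 of the argument.
struct ConvTarget {
  bool IsFloatingPoint;   // is the target a floating-point type at all?
  bool EqualRankToSource; // does it have the same conversion rank as FP1?
  int Subrank;            // only meaningful when EqualRankToSource is true
};

// True if converting FP1 to FP2 is better than converting FP1 to T3.
constexpr bool isBetterFPConversion(ConvTarget FP2, ConvTarget T3) {
  if (!FP2.IsFloatingPoint || !FP2.EqualRankToSource)
    return false; // the rule only favors equal-rank targets
  if (!T3.IsFloatingPoint)
    return true; // equal rank beats a non-floating-point target
  if (!T3.EqualRankToSource)
    return true; // equal rank beats an unequal rank
  return FP2.Subrank > T3.Subrank; // both equal rank: higher subrank wins
}

// The ambiguous example: a std::float16_t argument with candidates taking
// std::float32_t and std::float64_t. Neither target has equal rank.
constexpr ConvTarget ToFloat32{true, false, 0};
constexpr ConvTarget ToFloat64{true, false, 0};
constexpr bool ExampleIsAmbiguous =
    !isBetterFPConversion(ToFloat32, ToFloat64) &&
    !isBetterFPConversion(ToFloat64, ToFloat32);
```

When neither direction is better, the comparison would report the sequences as indistinguishable, which is what makes the call in the example ambiguous.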
8. Predefined macros
This is used to define macros that indicate that a specific extended floating-point type is implemented by the compiler as per this standard. These macros are used by C++ `std` libraries to define typedefs for extended floating-point types such as `std::float16_t`, `std::bfloat16_t`, etc.
Proposed code changes:
- In the `static void InitializeStandardPredefinedMacros` routine in `/llvm-project/clang/lib/Frontend/InitPreprocessor.cpp`, expand on the following:
    if (LangOpts.CPlusPlus2b) {
      ....
      Builder.defineMacro("__STDCPP_FLOAT16_T__", "1");
      Builder.defineMacro("__STDCPP_BFLOAT16_T__", "1");
    }
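From the user's side, these macros allow code to pick an extended type only when the implementation provides it. The sketch below assumes a standard library that ships the C++23 `<stdfloat>` header; `HAVE_STD_BFLOAT16`, `Storage` and `UsesStdBFloat16` are hypothetical names, and the `float` fallback is just one possible choice.

```cpp
// Use std::bfloat16_t only when the compiler announces support through the
// predefined macro and the library actually provides <stdfloat>.
#if defined(__STDCPP_BFLOAT16_T__) && defined(__has_include)
#  if __has_include(<stdfloat>)
#    include <stdfloat>
#    define HAVE_STD_BFLOAT16 1
#  endif
#endif

#ifndef HAVE_STD_BFLOAT16
#  define HAVE_STD_BFLOAT16 0
#endif

#if HAVE_STD_BFLOAT16
using Storage = std::bfloat16_t; // extended type provided by the toolchain
#else
using Storage = float;           // fallback chosen for this sketch
#endif

constexpr bool UsesStdBFloat16 = HAVE_STD_BFLOAT16 != 0;
```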
Open questions:
In my local change I was able to leverage the storage-only `__bf16` type and treat it as an arithmetic type when the `CPlusPlus2b` language option is `true`, and didn't see a need to create a dedicated arithmetic type as was done for `_Float16`, even though `half` existed back then. However, I am not sure whether this is the right approach, or whether there is a need to create a dedicated arithmetic type for `bfloat16` as well.