RFC: Support for Hexadecimal floating-point
This RFC concerns adding support for (IBM) hexadecimal floating-point, aka HexFloat, or HFP.
Some of the areas touched on here are also considered by the recent “RFC: Decimal floating-point support (ISO/IEC TS 18661-2 and C23)”.
What is HexFloat?
HexFloat is a format for encoding floating-point numbers. It was first introduced with the IBM System/360, and therefore predates the more contemporary and ubiquitous IEEE encoding. For an overview, see the Wikipedia article “IBM hexadecimal floating-point”.
The primary platform that supports HexFloat is the IBM System/Z running z/OS, and what follows is motivated by this platform. (HexFloat is not supported by Linux on IBM Z.)
User-visible impact and scope of changes
It is important to note that HexFloat affects only how values are represented; the programmer writes code using the standard `float`, `double`, and `long double` types. Most changes, therefore, will be to the IR and the back end. Each compilation unit uses either HexFloat or IEEE. Users cannot mix object files compiled in different modes in the same program.
The only action the programmer needs to take to use HexFloat is to specify a compiler flag to indicate that the standard linguistic types should be represented by HexFloat and appropriate code emitted. We propose the following flag:
-mzos-float-kind=<ieee|hex>
Again, it should be noted that all compilation units in a program must be compiled with the same setting for this option, which should also be given at the link step.
This option will be recognized only when the appropriate `-target` / `-triple` is given.
If the `-mzos-float-kind` option is not given, the appropriate default for the target will be used.
There are relatively few changes to the front end; most changes are restricted to the SystemZ target-specific back end. Changes to the middle end are largely related to recognizing the new IR types (see below).
Pre-processor
The convention on z/OS is that when in IEEE mode, the `__BFP__` (Binary Floating Point) macro is predefined by the compiler. There is no macro that directly indicates that HexFloat mode is active. Thus, to check that HexFloat is active, the idiom is to test `defined(__MVS__) && !defined(__BFP__)`.
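As a minimal sketch of the idiom above, the check can be wrapped in a small helper. The function name `hexfloat_mode_active` is hypothetical and used only for illustration; the two macros are the ones named in the text.

```c
/* Hypothetical helper: reports whether this translation unit was compiled
 * in HexFloat mode, using the z/OS convention described above. */
int hexfloat_mode_active(void) {
#if defined(__MVS__) && !defined(__BFP__)
    return 1; /* z/OS target and __BFP__ not predefined: HexFloat mode */
#else
    return 0; /* IEEE mode on z/OS, or not a z/OS target at all */
#endif
}
```

On any non-z/OS target the helper returns 0, because `__MVS__` is not defined there.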
The values of the floating-point limits defined by the compiler (e.g., the `__DBL_MAX__` macro) are all updated to have the correct values for the floating-point mode. The values in `<float.h>` will also be updated.
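Because the limits are exposed through the standard `<float.h>` macros, code can query them without knowing which mode is active. The following is a small sketch; the accessor names are hypothetical. In an IEEE build `FLT_RADIX` is 2, and a HexFloat build would be expected to report a radix of 16, though that expectation is an assumption here rather than something this RFC states.

```c
#include <float.h>

/* Hypothetical accessors that forward the standard <float.h> limits,
 * so callers see whatever values the active floating-point mode defines. */
int float_radix(void) { return FLT_RADIX; }

/* Number of base-FLT_RADIX digits in the significand of a double. */
int double_mantissa_digits(void) { return DBL_MANT_DIG; }

/* Largest finite double in the active floating-point mode. */
double double_max(void) { return DBL_MAX; }
```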
The main uses of the macro are in system/library headers, which use `__BFP__` to control which declarations should be seen by the compiler. For the most part, user code will rarely, if ever, need to use this macro.
The APFloat family
A new HexFloat subclass of `APFloatBase` will be introduced to `APFloat` to represent HexFloat values. All the methods necessary to support compile-time evaluation of expressions (e.g., during constant folding) will be implemented. As such, a HexFloat will be usable anywhere an existing `APFloat` is usable.
New IR types
New types for HexFloat will be added to the IR set of types:
IR type | Format | C type |
---|---|---|
`hex_fp32` | 32-bit HexFloat | `float` |
`hex_fp64` | 64-bit HexFloat | `double` |
`hex_fp128` | 128-bit HexFloat | `long double` |
IR Literals
Literals in HexFloat format in the IR will be specified by a new `0xS` prefix. The variant of HexFloat, i.e., whether the value is `hex_fp32`, `hex_fp64`, etc., will be determined by the length of the literal. The literal is the hexadecimal representation of the value in HexFloat format, and therefore encodes the sign, (biased) exponent and significand in their respective positions.
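To make the encoding concrete, here is a minimal sketch that decodes the 32-bit pattern. It assumes the standard IBM short HexFloat layout: 1 sign bit, a 7-bit excess-64 exponent in base 16, and a 24-bit fraction. The function name `hfp32_to_double` is hypothetical and is not part of the proposal.

```c
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit HexFloat bit pattern into a double.
 * value = (-1)^sign * 0.fraction (base 16) * 16^(exponent - 64) */
double hfp32_to_double(uint32_t bits) {
    int negative = (int)((bits >> 31) & 1u);         /* sign bit */
    int exponent = (int)((bits >> 24) & 0x7Fu) - 64; /* remove the excess-64 bias */
    uint32_t fraction = bits & 0x00FFFFFFu;          /* six hex digits */

    /* Interpret the fraction as 0.hhhhhh in base 16, i.e. divide by 2^24. */
    double value = (double)fraction / 16777216.0;

    /* Scale by 16^exponent; every step is an exact power of two. */
    for (int i = 0; i < exponent; ++i) value *= 16.0;
    for (int i = 0; i > exponent; --i) value /= 16.0;

    return negative ? -value : value;
}
```

For the bit pattern `0x41200000`, the exponent field `0x41` gives 16^1 and the fraction `0x200000` gives 2/16, so the decoded value is 2.0.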
Example C → IR
This example shows the main elements discussed so far. The C code is conventional. The IR shows the new IR types, and the encoding of the float literal `2.0f`. Note also that the standard `fadd` operation is used. The back end will lower this to the correct instruction for HexFloat.
float plus2(float arg) {
  return arg + 2.0f;
}
define hidden hex_fp32 @plus2(hex_fp32 noundef %arg) {
entry:
  %arg.addr = alloca hex_fp32, align 4
  store hex_fp32 %arg, ptr %arg.addr, align 4
  %0 = load hex_fp32, ptr %arg.addr, align 4
  %add = fadd hex_fp32 %0, 0xS41200000
  ret hex_fp32 %add
}
Compiler runtime
There are various routines in `compiler-rt` which manipulate floating-point values, and which are dependent on the format of the floating-point value. Examples include conversion to/from integer values. New variants to work with HexFloat will be provided. In the vast majority of cases the new variants are just wrappers that compile the existing code, but under a new name.
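To illustrate why such routines depend on the format, the sketch below converts a 32-bit HexFloat bit pattern to a signed 32-bit integer using only integer operations. It is a standalone illustration, not the wrapper approach described above: the name `hfp32_to_i32_sketch` is hypothetical, and truncation toward zero with saturation on overflow is an assumption made for this example.

```c
#include <stdint.h>

/* Hypothetical name for illustration; not an existing compiler-rt entry point.
 * Converts a 32-bit HexFloat bit pattern to int32_t, truncating toward zero
 * and saturating when the value does not fit (an assumption of this sketch). */
int32_t hfp32_to_i32_sketch(uint32_t bits) {
    int negative = (int)((bits >> 31) & 1u);
    int exponent = (int)((bits >> 24) & 0x7Fu) - 64; /* hex digits before the radix point */
    uint64_t fraction = bits & 0x00FFFFFFu;          /* six hex digits */

    if (fraction == 0 || exponent <= 0)
        return 0; /* magnitude is below 1 */
    if (exponent > 8)
        return negative ? INT32_MIN : INT32_MAX; /* at least 16^8, cannot fit */

    /* value = fraction * 16^(exponent - 6): shift by whole hex digits. */
    uint64_t magnitude = (exponent <= 6)
        ? fraction >> (4 * (6 - exponent))
        : fraction << (4 * (exponent - 6));

    if (negative)
        return magnitude >= UINT64_C(2147483648)
            ? INT32_MIN
            : (int32_t)(-(int64_t)magnitude);
    return magnitude > (uint64_t)INT32_MAX ? INT32_MAX : (int32_t)magnitude;
}
```

Because the exponent is base 16, the shifts move by four bits at a time, which is the main difference from the equivalent IEEE routine.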
C++
Mangling is unaffected as the C++ types are unchanged. As far as the user code is concerned, the types are the standard types (`float`, `double`, etc.); the representational choice is independent of the language type encoded in the mangling. As a program must be entirely either HexFloat or IEEE, there is no need to encode the representational format.
Both HexFloat and IEEE will use the same `typeinfo` objects, which, again, is not problematic because programs are entirely HexFloat or IEEE. RTTI and exception handling, therefore, will work seamlessly.
libcxx and libcxxabi
A separate instance of the C++ library (libcxx) will need to be built for HexFloat. Although not identical, this is not altogether unlike supporting multiple instances for 32-bit and 64-bit modes.
`libcxxabi` will be shared between IEEE and HexFloat.
Some code in the library is sensitive to the floating format encoding. Where necessary, alternative implementations for HexFloat will be provided. Sensitive parts of the code may need to be guarded with the pre-processor macros described above so that the correct parts are included for the compilation units.
Examples of the types of changes that will be necessary include updating `numeric_limits<>` to have the correct values for HexFloat. Similarly, `std::format` will need to be modified to handle HexFloat. Again, it should be emphasized that in any one instance of the C++ library only one of the IEEE/HexFloat variants will be active.
DWARF
HexFloat values will be tagged with the vendor-specific type tags used in existing compilers.
The tags are:
0xde IBM_complex_float_hex
0xdf IBM_float_hex
As noted above, these tags are already in use by existing tools to describe HexFloat. Note also that, unlike with mangling, a debugger does need to know the representational format of the data.