TLDR: We propose modifying LLT by introducing separate integer and floating point kinds to enable non-IEEE floating point types for instruction selection and register bank selection.
Background:
GlobalISel is currently unable to represent the multitude of different floating point types available on modern hardware like BF16, TF32, and FP8 (E5M2 / E4M3).
GlobalISel uses LLTs to represent type information of virtual registers. When the IRTranslator lowers LLVM IR to gMIR, an LLT only captures the kind (scalar, pointer, or vector) and the size or shape of the value.
Information about the concrete floating point type is inferred from the operation used and the size of the operands.
For example, a G_FMUL on two S32 values implies that the operands must be 32-bit IEEE floating point numbers. This way of deducing the actual type breaks down with newer floating point types: a 16-bit operand, for example, could be either an IEEE half or a BF16 value.
Information about the concrete floating point type is relevant for both register bank selection and instruction selection. Restoring this information is costly and often lossy. On AArch64, for example, it is partially restored by walking uses/defs in order to guess which virtual registers may hold a floating point value.
There is currently no way to restore the actual floating point type. Whenever the IRTranslator reaches a BF16 value in LLVM IR, we just bail out (see "[GlobalISel] Fall back for bf16 conversions." by aemerson, Pull Request #71470, llvm/llvm-project).
Proposed changes:
This proposal is loosely based on @bogner’s original RFC with some slight modifications, mainly to ease adoption:
- Replace IsPointer, IsVector and IsScalar with a new kind enumeration that allows for 4 new kinds: FLOAT, INTEGER, VECTOR_FLOAT, VECTOR_INTEGER.
- Keep SCALAR and VECTOR_SCALAR kinds for incremental adoption.
- Add a pass that drops integer and float kinds back to scalar kinds for incremental adoption.
- Type conversion between scalar and float / integer is not legal. Passes should either only use scalar kinds or only use integer / float kinds.
- Conversion between integer and float kinds requires G_BITCAST.
We propose reusing the 3 bits currently used to encode the kind of LLT (IsScalar, IsPointer, IsVector) more efficiently to encode a total of 8 different LLT kinds: POINTER, INTEGER, FLOAT, SCALAR, VECTOR_POINTER, VECTOR_INTEGER, VECTOR_FLOAT, VECTOR_SCALAR. We can additionally use the fact that RawData will never be zero for the above kinds as an extra 4th bit to encode some additional kinds like INVALID, TOMBSTONE, EMPTY, and TOKEN.
enum class Kind : uint64_t {
  POINTER        = 0b000,
  INTEGER        = 0b001,
  FLOAT          = 0b010,
  SCALAR         = 0b011,
  VECTOR_POINTER = 0b100,
  VECTOR_INTEGER = 0b101,
  VECTOR_FLOAT   = 0b110,
  VECTOR_SCALAR  = 0b111,
};
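One consequence of this encoding is that the top bit of the kind marks vectors and the low two bits give the element kind, so common queries reduce to bit masks. The following is only a sketch of that idea: the helper names (rawKind, isVectorKind, getElementKind, getVectorKind, isFloatKind) are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Same encoding as the Kind enumeration proposed above.
enum class Kind : uint64_t {
  POINTER        = 0b000,
  INTEGER        = 0b001,
  FLOAT          = 0b010,
  SCALAR         = 0b011,
  VECTOR_POINTER = 0b100,
  VECTOR_INTEGER = 0b101,
  VECTOR_FLOAT   = 0b110,
  VECTOR_SCALAR  = 0b111,
};

// Hypothetical constants and helpers, for illustration only.
constexpr uint64_t VectorBit   = 0b100; // set for every VECTOR_* kind
constexpr uint64_t ElementMask = 0b011; // selects the element kind

constexpr uint64_t rawKind(Kind K) { return static_cast<uint64_t>(K); }

constexpr bool isVectorKind(Kind K) { return (rawKind(K) & VectorBit) != 0; }

// Element kind of a vector kind; the identity for non-vector kinds.
constexpr Kind getElementKind(Kind K) {
  return static_cast<Kind>(rawKind(K) & ElementMask);
}

constexpr bool isFloatKind(Kind K) {
  return getElementKind(K) == Kind::FLOAT;
}

// Vector kind whose elements have the given non-vector kind.
inline Kind getVectorKind(Kind Elt) {
  assert(!isVectorKind(Elt) && "expected a non-vector kind");
  return static_cast<Kind>(rawKind(Elt) | VectorBit);
}
```

With this layout, isFloatKind answers true for both FLOAT and VECTOR_FLOAT without a separate case for vectors.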
We add a 2-bit field to LLTs with the VECTOR_FLOAT or FLOAT kinds which will be used to indicate the type of floating-point number. We do not aim to exactly represent floating-point semantics, which is why we decided to just use 2 bits to represent IEEE floats and 3 other floating-point variants. Each backend may choose how to map scalar sizes together with the floating-point info to actual floating-point types. This design could be simplified by making this mapping global at the expense of some flexibility / number of total FP types we can represent.
enum class FPInfo {
  IEEE_FLOAT      = 0x0,
  VARIANT_FLOAT_1 = 0x1,
  VARIANT_FLOAT_2 = 0x2,
  VARIANT_FLOAT_3 = 0x3,
};
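To illustrate how a backend might map a scalar size together with FPInfo to a concrete floating-point format, here is a minimal sketch. Everything beyond the FPInfo enumeration is an assumption made for illustration: the FPFormat enumeration, the getFPFormat function, and in particular the choice of which variant slot stands for which format are hypothetical and would be up to each target.

```cpp
#include <cassert>

// Same values as the FPInfo enumeration proposed above.
enum class FPInfo {
  IEEE_FLOAT      = 0x0,
  VARIANT_FLOAT_1 = 0x1,
  VARIANT_FLOAT_2 = 0x2,
  VARIANT_FLOAT_3 = 0x3,
};

// Hypothetical list of formats a target might distinguish.
enum class FPFormat {
  Invalid, FP8E5M2, FP8E4M3, Half, BFloat16, Float, TF32, Double
};

// Hypothetical per-target mapping from (size, FPInfo) to a format.
// The variant assignments below are assumptions, not part of the RFC.
inline FPFormat getFPFormat(unsigned SizeInBits, FPInfo Info) {
  assert(static_cast<unsigned>(Info) < 4 && "FPInfo must fit in 2 bits");
  const bool IsIEEE = Info == FPInfo::IEEE_FLOAT;
  switch (SizeInBits) {
  case 8: // no IEEE format of this size in this sketch, only variants
    if (Info == FPInfo::VARIANT_FLOAT_1) return FPFormat::FP8E5M2;
    if (Info == FPInfo::VARIANT_FLOAT_2) return FPFormat::FP8E4M3;
    break;
  case 16:
    if (IsIEEE) return FPFormat::Half;
    if (Info == FPInfo::VARIANT_FLOAT_1) return FPFormat::BFloat16;
    break;
  case 32:
    if (IsIEEE) return FPFormat::Float;
    if (Info == FPInfo::VARIANT_FLOAT_1) return FPFormat::TF32;
    break;
  case 64:
    if (IsIEEE) return FPFormat::Double;
    break;
  default:
    break;
  }
  return FPFormat::Invalid; // combination not used by this target
}
```

Because the size takes part in the lookup, the same variant slot can be reused for a different format at each width, which is how 2 bits can cover more than 4 floating-point types in total.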
We aim for incremental adoption of these new LLT kinds, which can be toggled by a command-line option at runtime. The SCALAR and VECTOR_SCALAR kinds remain for compatibility and are only to be used by backends/passes that have not yet enabled floating-point information. All other backends should use INTEGER or FLOAT instead of SCALAR and VECTOR_INTEGER or VECTOR_FLOAT instead of VECTOR_SCALAR.
To ease incremental adoption, we would like to first convert single passes to use FPInfo inside of tests only. Later on, we aim to enable FPInfo for a range of passes, beginning with the IRTranslator. To make this possible, we introduce a pass that drops types from integer / float back to just scalar, so that passes which have not been converted yet can run after the converted ones.
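The type rewrite performed by such a dropping pass could look roughly like the sketch below. SimpleLLT is a hypothetical, heavily simplified stand-in for LLT (it omits element counts and address spaces), and dropToScalarKind and dropFPInfo are illustrative names, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

enum class Kind : uint64_t {
  POINTER        = 0b000,
  INTEGER        = 0b001,
  FLOAT          = 0b010,
  SCALAR         = 0b011,
  VECTOR_POINTER = 0b100,
  VECTOR_INTEGER = 0b101,
  VECTOR_FLOAT   = 0b110,
  VECTOR_SCALAR  = 0b111,
};

enum class FPInfo {
  IEEE_FLOAT      = 0x0,
  VARIANT_FLOAT_1 = 0x1,
  VARIANT_FLOAT_2 = 0x2,
  VARIANT_FLOAT_3 = 0x3,
};

// Hypothetical, simplified stand-in for LLT.
struct SimpleLLT {
  Kind K;
  FPInfo FP;
  unsigned SizeInBits;
};

// Map integer / float kinds to the legacy scalar kinds; pointer and
// already-scalar kinds are left unchanged.
inline Kind dropToScalarKind(Kind K) {
  switch (K) {
  case Kind::INTEGER:
  case Kind::FLOAT:
    return Kind::SCALAR;
  case Kind::VECTOR_INTEGER:
  case Kind::VECTOR_FLOAT:
    return Kind::VECTOR_SCALAR;
  default:
    return K;
  }
}

// Rewrite one type the way the dropping pass would.
inline SimpleLLT dropFPInfo(SimpleLLT Ty) {
  Ty.K = dropToScalarKind(Ty.K);
  Ty.FP = FPInfo::IEEE_FLOAT; // clear the 2-bit field; scalars carry no FP info
  assert(Ty.K != Kind::FLOAT && Ty.K != Kind::VECTOR_FLOAT &&
         "float kinds must be gone after dropping");
  return Ty;
}
```

Note that the rewrite is lossy by design: after it runs, an s16 that used to be BF16 is indistinguishable from an IEEE half, which is exactly the state unconverted passes expect today.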
To ensure consistency in the IR and to allow register bank selection to easily determine where to insert COPY instructions between different register banks, we require a G_BITCAST instruction whenever an integer LLT is used in a floating-point instruction or vice versa.
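The G_BITCAST requirement, together with the rule that scalar kinds must not be mixed with integer / float kinds, could be checked along the lines of the sketch below. The Conversion enumeration and classifyConversion are hypothetical names; pointers are deliberately left out of the sketch, and so are differences in FPInfo between two float types.

```cpp
#include <cassert>
#include <cstdint>

enum class Kind : uint64_t {
  POINTER        = 0b000,
  INTEGER        = 0b001,
  FLOAT          = 0b010,
  SCALAR         = 0b011,
  VECTOR_POINTER = 0b100,
  VECTOR_INTEGER = 0b101,
  VECTOR_FLOAT   = 0b110,
  VECTOR_SCALAR  = 0b111,
};

// Hypothetical result of comparing the kind a value has with the kind
// an instruction expects.
enum class Conversion { None, Bitcast, Illegal };

// Low two bits of the encoding give the element kind.
inline Kind elementKind(Kind K) {
  return static_cast<Kind>(static_cast<uint64_t>(K) & 0b011);
}

inline Conversion classifyConversion(Kind Have, Kind Want) {
  const Kind HaveElt = elementKind(Have);
  const Kind WantElt = elementKind(Want);
  assert(HaveElt != Kind::POINTER && WantElt != Kind::POINTER &&
         "pointers are outside the scope of this sketch");
  if (HaveElt == WantElt)
    return Conversion::None;
  // Legacy scalar kinds must not be mixed with integer / float kinds.
  if (HaveElt == Kind::SCALAR || WantElt == Kind::SCALAR)
    return Conversion::Illegal;
  // What remains is integer <-> float, which needs a G_BITCAST.
  return Conversion::Bitcast;
}
```

A verifier-style check could reject the Illegal case outright, while the Bitcast case tells a pass where an explicit G_BITCAST has to be materialized.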
Patch: gist:9902f652792ea26ce15aa46c6692fce7 · GitHub
The above patch details changes to LLT. If we can agree on a path forward in this RFC we will follow up with a PR for AMDGPU using the new LLT kinds.
Previous RFCs: