RFC: [GlobalISel] Representing fp types in LLT

For a while now there have been discussions here and there about
representing floating point types explicitly in LLT. Having this type
information explicit makes RegBankSelect easier to implement, avoids
inefficiencies in disambiguating which operation is needed when
lowering, and is necessary for correctness in light of variant
floating point types like bfloat.

I believe there is general consensus that we need this, so this RFC is
about how we should go about modeling these types.

At the dev meeting, we discussed two approaches IIRC:

1. Explicitly encode the number of bits in the exponent and mantissa for
   flexibility
2. Encode only the types we need (i.e., model the same fp types as
   llvm::Type does)

The idea with (1) was that this way we could easily handle IEEE-754
binary formats and variants like bfloat16 and tf32 without needing to
burden ISel with knowledge of specifics. The major drawback of this is
that it would struggle to represent types that don't fit the general
mold, like ppc-fp128 (which is two doubles) or the IEEE 754-2008 decimal
formats.

I think (2) is the safer way forward, and I propose that we can
implement it using only 2 extra bits in each of the scalar and vector
variants of LLT. This approach largely hinges on the fact that there are
very few interesting (that is, actively used) variants of floating point
for a given scalar size.

The two bits would be used like so:

enum FPInfo { NotFP = 0x0, FP = 0x1, Reserved = 0x2, VariantFP = 0x3 };

This uses one bit to say a scalar is floating point, which, when set,
implies that it's the "usual" format for the size, and a second bit to
say it's a variant for that size. There's space for a second variant
per size in the `Reserved` encoding if we need it later.
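
As a sketch of how the two bits could map to concrete formats (this is
illustrative only, not the attached patch's actual code; all names are
hypothetical):

```cpp
#include <cstdint>
#include <string>

// Hypothetical encoding sketch: two extra bits per scalar/vector record.
enum FPInfo : uint8_t {
  NotFP     = 0x0, // plain integer scalar
  FP        = 0x1, // the "usual" IEEE format for this size
  Reserved  = 0x2, // spare encoding for a second variant, if ever needed
  VariantFP = 0x3, // the known variant format for this size
};

// Map (size in bits, FPInfo) to a format name, following the
// "one interesting variant per size" observation in the RFC.
std::string formatName(unsigned SizeInBits, FPInfo Info) {
  if (Info == NotFP)
    return "s" + std::to_string(SizeInBits);
  bool Variant = (Info == VariantFP);
  switch (SizeInBits) {
  case 16:  return Variant ? "bfloat" : "half";
  case 32:  return "float";    // tf32 could occupy VariantFP here later
  case 64:  return "double";
  case 80:  return "x86_fp80";
  case 128: return Variant ? "ppc_fp128" : "fp128";
  default:  return "unknown";
  }
}
```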

With this, we can model every floating point type that llvm IR does
today (half, bfloat16, single, double, x86-fp80, fp128, ppc-fp128), and
the scheme is easily extensible for things like fp256, tensorfloat, or
IEEE-754 decimal floats, should llvm need to support those explicitly
in the future.

Backends will need to be updated to handle these extra types. Notably,
the legalizer will need to be made more precise for operations like
fadd, and isel will need to be aware of when any scalar will do and
when the distinction is important.
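
To illustrate that isel concern (a hypothetical sketch, not real LLVM
API: names like `andPatternMatches16` are invented here): bitwise ops
only care about size, while fadd must distinguish half from bfloat on
a target that has FP16 but no BF16 arithmetic.

```cpp
#include <cassert>

// Illustrative-only formats for a 16-bit scalar.
enum class Format { Int, Half, BFloat };

// A 16-bit G_AND can use the same pattern for any 16-bit scalar:
// the bits-in/bits-out behavior is identical regardless of format.
bool andPatternMatches16(Format) { return true; }

// A 16-bit G_FADD must check the format: on a target with native FP16
// but no BF16 arithmetic, bf16 needs promotion instead of direct selection.
bool faddPatternMatches16(Format F) { return F == Format::Half; }
```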

I've attached a patch for demonstration that adds fpscalar, bfloat, and
ppcf128 and implements printing, parsing, and conversion from llvm
IR. If folks are happy with this direction I'll start working on the
backend updates to get this fully working.

WDYT?

llt-float.patch (16.6 KB)

I strongly agree!

Jay.


+1, but I don’t think the public API should reveal that (e.g. we shouldn’t have an isVariantFloat()). I think the API should still lean towards isFloat(), isBFloat(), getMantissaSizeInBits(), etc., so that we can avoid changing it if we grow more interesting types in the future.

+Tim

Thanks for pushing this forward Justin.
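
The API shape suggested above could look something like this sketch,
where the two-bit encoding stays private and queries are format-based.
All names and the class itself are illustrative, not the actual LLT
interface:

```cpp
#include <cassert>

class ScalarTy {
  unsigned SizeInBits;
  unsigned FPBits; // 0 = int, 1 = usual FP, 3 = variant FP (2 reserved)
public:
  ScalarTy(unsigned Size, unsigned Bits) : SizeInBits(Size), FPBits(Bits) {}
  bool isFloat() const { return FPBits != 0; }
  bool isBFloat() const { return SizeInBits == 16 && FPBits == 3; }
  // Significand width including the implicit bit, as in APFloat's
  // fltSemantics precision; only the common cases are sketched here.
  unsigned getMantissaSizeInBits() const {
    if (FPBits == 0) return 0;           // not a float
    if (isBFloat()) return 8;            // bfloat16
    if (SizeInBits == 16) return 11;     // IEEE half
    if (SizeInBits == 32) return 24;     // IEEE single
    if (SizeInBits == 64) return 53;     // IEEE double
    return 0;                            // not modeled in this sketch
  }
};
```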


I think (2) is the safer way forward, and I propose that we can
implement it using only 2 extra bits in each of the scalar and vector
variants of LLT. This approach largely hinges on the fact that there are
very few interesting (that is, actively used) variants of floating point
for a given scalar size.

This argument seems reasonable to me.


Backends will need to be updated to handle these extra types. Notably,
the legalizer will need to be made more precise for operations like fadd
and isel will need to be aware of when any scalar will do or if the
distinction is important.

I’d like to elaborate further here on what the impacts would be on the GISel pipeline as a whole.

E.g. RegBankSelect. On targets like AArch64 with GPR and FPR banks, we currently use RBS to piece together int/fp information using the instruction’s surrounding context. With these types, this would be much simpler. Having said that, does this effectively render the entire notion of RegBankSelect redundant?

Yes, I think encoding the FP mode would be overkill. I do think we shouldn’t go too far into treating these as FP types mirroring the IR though. I would still like to be able to directly operate on these with integer operations without intermediate bitcasts. Only lowerings which really care about the FP semantics would need to treat these differently.

I do think threading these through all the existing legalizer code could be painful, since quite a few places create scalar types out of context, whereas now they would need to take care to pass the FP bits through.

-Matt
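
The hazard described above can be sketched as follows (hypothetical
names, not the real LLT API): rebuilding a scalar from its raw size, a
common pattern in today's legalizer code, would silently drop the FP
bits, while a size-changing helper that carries the format through
would not.

```cpp
#include <cassert>

struct Ty {
  unsigned SizeInBits;
  unsigned FPBits; // 0 = int, 1 = FP, 3 = variant FP
};

// Today's pattern: only the size survives, the FP bits are lost.
Ty scalarFromSize(unsigned Size) { return {Size, 0}; }

// What FP-aware legalizer code would need instead: carry the format
// bits through when only the size changes.
Ty changeSizeKeepFormat(Ty T, unsigned NewSize) {
  return {NewSize, T.FPBits};
}
```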


Yes, I think encoding the FP mode would be overkill. I do think we shouldn’t go too far into treating these as FP types mirroring the IR though. I would still like to be able to directly operate on these with integer operations without intermediate bitcasts. Only lowerings which really care about the FP semantics would need to treat these differently.

Same here. Our target doesn't have any FPRs, so all floats are handled exactly the same as integers. So if possible I'd like to keep using the existing handling in places where I really don't care about floats.

Cheers,

Dominik