Similar to _Float16 support, I’m curious whether there is interest in full bfloat support that would do arithmetic natively in bfloat16 when the target supports it. I work with Henry and was planning to implement it as a local patch, but I would also be interested in exploring pushing it upstream if that makes sense.

I think legalization should work for this (though I do think you need to legalize to f32, not f16).

Thanks, but I’m not sure I follow. Are you suggesting we do the arithmetic in fp32?

Yes, f32 has the same number of exponent bits as bf16 (8), whereas f16 has only 5.
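To illustrate why f32 is the natural type to promote to: a bf16 value has the same bit layout as the upper 16 bits of an f32, so widening is an exact shift and only narrowing has to round. The following is a minimal Python sketch of that relationship, not LLVM code. The helper names (`bf16_to_f32`, `f32_to_bf16`) are made up for illustration, round-to-nearest-even is assumed for narrowing, and inputs are assumed to fit in f32.

```python
import struct

def f32_bits(x: float) -> int:
    """Return the IEEE-754 binary32 bit pattern of x (x must fit in f32)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b: int) -> float:
    """Interpret a 32-bit pattern as an IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def bf16_to_f32(h: int) -> float:
    """Widen a bf16 bit pattern: it becomes the top 16 bits of an f32. Exact."""
    return bits_f32((h & 0xFFFF) << 16)

def f32_to_bf16(x: float) -> int:
    """Narrow to a bf16 bit pattern, assuming round-to-nearest-even."""
    bits = f32_bits(x)
    if (bits & 0x7FFFFFFF) > 0x7F800000:
        # NaN: keep it a NaN by forcing a mantissa bit that survives truncation.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    lsb = (bits >> 16) & 1       # lowest bit that will be kept
    bits += 0x7FFF + lsb         # bias so that exact ties round to even
    return (bits >> 16) & 0xFFFF

# 1.0 is 0x3F800000 in f32, so its bf16 pattern is the upper half.
print(hex(f32_to_bf16(1.0)))     # 0x3f80
```

The same trick does not work for f16, whose 5-bit exponent field gives it a much narrower range than bf16.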

That is true, but an fp32 register holds only half as many values as a bfloat16 register of the same width. The hardware we are working on has native bfloat16 arithmetic, so it doesn’t make sense to upcast to fp32 and do the arithmetic in full precision.

Of course if it’s a legal operation, you can directly select it. In the case where you need to legalize, I think this needs to be done in f32.

IMHO bf16 arithmetic support is a good idea. Native bf16 would round differently from f32 for some operations (fma, probably more), so just doing the arithmetic in f32 and then converting isn’t sufficient.
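To make the fma rounding concern concrete, here is a small Python sketch that models the two strategies with exact rationals: rounding the exact result of a*b + c once to bf16 precision (8 significant bits), versus rounding it to f32 precision (24 bits) first and then to bf16. This is only a model under simplifying assumptions: round-to-nearest-even, unbounded exponent range, no NaN/infinity/subnormal handling. The function names and operands are chosen purely for illustration.

```python
from fractions import Fraction

BF16_BITS = 8    # significand precision of bf16, including the implicit bit
F32_BITS = 24    # significand precision of f32, including the implicit bit

def round_to_precision(x: Fraction, p: int) -> Fraction:
    """Round x to p significant bits, ties to even (exponent range ignored)."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    mag = abs(x)
    # Find e such that 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    ulp = Fraction(2) ** (e - (p - 1))
    # round() on a Fraction rounds exact ties to the even integer.
    return sign * round(mag / ulp) * ulp

def fma_native_bf16(a: Fraction, b: Fraction, c: Fraction) -> Fraction:
    """Single rounding: exact a*b + c rounded straight to bf16 precision."""
    return round_to_precision(a * b + c, BF16_BITS)

def fma_via_f32(a: Fraction, b: Fraction, c: Fraction) -> Fraction:
    """Double rounding: f32 fma result, then converted to bf16 precision."""
    return round_to_precision(round_to_precision(a * b + c, F32_BITS), BF16_BITS)

# All three operands are exactly representable in bf16.
a = Fraction(3, 2)        # 1.5
b = Fraction(131, 128)    # 1.0234375
c = Fraction(1, 2**30)    # 2**-30

native = fma_native_bf16(a, b, c)
via_f32 = fma_via_f32(a, b, c)
print(float(native), float(via_f32))   # 1.5390625 1.53125
```

Here a*b lands exactly halfway between two bf16 values and c nudges the exact sum just above that midpoint. The f32 rounding discards the nudge, so the second rounding sees an exact tie and goes to the even neighbor, while the single-rounding path goes up.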