new bfloat IR type for C bfloat type mapping

Hi all,

At Arm we have started upstreaming support for various Armv8.6-a features [1]. As part of this effort we are upstreaming support for the Brain floating point format (bfloat16) C type [2].

As the name suggests, it's a 16-bit floating point format, but with the same number of exponent bits as an IEEE 754 float32 (8). Compared with IEEE 754 half precision, the extra exponent bits are taken from the mantissa, which is now only 7 bits. It also behaves much like an IEEE 754 type. For more info, see [3] and [4].
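To make the layout concrete, here is a minimal sketch in C, assuming a 32-bit IEEE 754 float. The type name bf16_bits and both function names are made up for illustration and are not part of any patch. Because bfloat16 has the same sign and exponent layout as float32, its bit pattern is simply the upper 16 bits of the float32 pattern. The sketch truncates when narrowing; real conversions usually round and treat NaNs specially, which is left out here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The bit copies below assume float is a 32-bit IEEE 754 type. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical name, for illustration only: the raw 16 bits of a bfloat16. */
typedef uint16_t bf16_bits;

/* float32 -> bfloat16 by truncation: keep the sign bit, all 8 exponent
 * bits and the top 7 mantissa bits; drop the low 16 mantissa bits.
 * Simplification: no rounding and no special NaN handling. */
static inline bf16_bits bf16_from_float_truncate(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* well-defined way to read the bits */
    return (bf16_bits)(u >> 16);
}

/* bfloat16 -> float32 is exact: the missing mantissa bits become zero. */
static inline float bf16_to_float(bf16_bits b)
{
    uint32_t u = (uint32_t)b << 16;
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

With these helpers, 1.0f narrows to the bit pattern 0x3F80 and widens back to exactly 1.0f, while a value such as 3.14159f loses precision on the round trip because only 7 mantissa bits survive.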

In our original patch [2], we mapped the C bfloat type to either float or int32 LLVM type, just like we do for _Float16 and __fp16. However, Craig Topper was quite surprised we would pass it as half, and John McCall and JF Bastien suggested mapping it to a new bfloat IR type, since bfloat is distinct from float16 and doing so should be straightforward. Sjoerd Meijer also concurred, but suggested we poll the mailing list before forging ahead.

Our initial thought was that a separate IR type isn't needed. There's no support in the architecture for naked bfloat operations, and bfloat would only be used in an ML context through intrinsics. But it is a separate type, and it does make sense to treat it as such. Also, several architectures have, or have announced, support for bf16, and there are proposals in flight to add it to the C++ standard.

Thoughts?

Best regards,
/Ties Stuij

links:
[1] https://reviews.llvm.org/D76062
[2] https://reviews.llvm.org/D76077
[3] https://community.arm.com/developer/ip-products/processors/b/ml-ip-blog/posts/bfloat16-processing-for-neural-networks-on-armv8_2d00_a
[4] https://static.docs.arm.com/ddi0487/fa/DDI0487F_a_armv8_arm.pdf

If this is a storage-only type, why do you need a new IR type at all? Why
can't you simply use i16 everywhere?

John McCall had a succinct answer to that on the review:
'Calling something a "storage-only" type does not get you out of worrying about calling conventions at the ABI level. You can forbid passing your type as an argument or result directly, but structs containing your type as a field can still be passed around, and the behavior for that needs to be defined.'
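As a rough illustration of that point, here is a sketch in C that stands in for a storage-only bfloat16 with a plain 16-bit integer. The names bf16_storage, bf16_pair and bf16_pair_swap are hypothetical. Even if passing the scalar type directly were forbidden, a struct like this can still be passed and returned by value, so the calling convention has to say how it is classified and which registers or stack slots it occupies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a storage-only bfloat16: 16 raw bits. */
typedef uint16_t bf16_storage;

/* An aggregate whose only members are bfloat16 values. */
struct bf16_pair {
    bf16_storage lo;
    bf16_storage hi;
};

/* Two 16-bit fields are expected to pack without padding. */
static_assert(sizeof(struct bf16_pair) == 2 * sizeof(bf16_storage),
              "unexpected padding in bf16_pair");

/* The struct crosses a function boundary by value in both directions,
 * so the ABI must define where its bytes live, whatever rules apply
 * to the scalar type on its own. */
struct bf16_pair bf16_pair_swap(struct bf16_pair p)
{
    struct bf16_pair r = { p.hi, p.lo };
    return r;
}
```

With an integer stand-in like this, the fields look like integers to the calling convention; whether a real bfloat16 field should instead be treated like a floating-point member is the kind of question the quote is raising.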

To make that a bit more concrete: in some cases we're using half to make sure the bfloat goes to an FPR instead of a GPR.

Cheers,
/Ties

If I remember correctly, __fp16 is a “storage only” type and we don’t allow it as an argument unless some native half setting is enabled. Could we just not allow __bf16 as an argument and treat it as an i16 everywhere else? I think that matches what we do for __fp16 when half isn’t native.

I'm not sure the solution addresses John McCall's general ABI argument, and if adding an IR type is indeed 'totally mechanical', I think it'd be nicer to not restrict bfloat and just have an explicit type.

Cheers,
/Ties

On some platforms designed for machine learning tasks, bfloat16 is a full-fledged type, supported to the same extent as float.