Changing semantics of __fp16

At Arm we are discussing changing the semantics of the storage-only type
__fp16, and we are looking for feedback on this. The motivation is that in
A-profile, the FP16 architecture extension natively supports half-precision
arithmetic. SVE supports it too, and in M-profile MVE optionally supports it.

The problem is that float16_t is defined in the Arm C-Language Extensions
(ACLE) specification [1] as an alias for __fp16. Because the storage-only
float16_t / __fp16 type performs its arithmetic in single precision, code that
uses it does not take advantage of the native half-precision FP16 instructions.
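
To make the difference between the two evaluation models concrete, here is a
rough sketch in portable C++. It does not use __fp16 or any native
half-precision type. Instead it emulates half-precision rounding with a
hypothetical helper, round_to_half_precision, which assumes finite inputs in the
normal half range (no overflow or subnormal handling). All function names are
made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Round a float to the nearest value with an 11-bit significand (the precision
// of IEEE half), ties to even. Simplifying assumptions: the input is finite and
// within the normal half range, so overflow and subnormals are not modelled.
float round_to_half_precision(float x) {
    assert(std::isfinite(x));
    int exponent = 0;
    float mantissa = std::frexp(x, &exponent);  // x == mantissa * 2^exponent
    float scaled = std::ldexp(mantissa, 11);    // keep 11 significant bits
    float rounded = std::nearbyint(scaled);     // default mode: ties to even
    return std::ldexp(rounded, exponent - 11);
}

// Storage-only model (current __fp16): operands are promoted to float, the
// expression is evaluated in single precision, and the result is rounded to
// half once, when it is stored.
float sum_storage_only(float a, float b, float c) {
    return round_to_half_precision(a + b + c);
}

// Native model: every operation rounds its result to half precision.
float sum_native_half(float a, float b, float c) {
    float partial = round_to_half_precision(a + b);
    return round_to_half_precision(partial + c);
}
```

For a = 2048, b = 1, c = 1 the storage-only model yields 2050, while the native
model yields 2048: 2049 is not representable in half precision, so each
intermediate sum rounds back to 2048. The native form is faster on FP16
hardware, but the results are not bit-identical.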

One obvious solution is to change the float16_t typedef in the ACLE from this:

typedef __fp16 float16_t;

to use _Float16 instead of __fp16, where _Float16 is the type with
half-precision arithmetic semantics. An alternative is to change the semantics
of __fp16. Both approaches have their pros and cons:

Changing the semantics of __fp16 (approach A):

  • Pros:
    – There is no ABI break.
    – Code that uses __fp16 also benefits from the more optimal implementation.
  • Cons:
    – No type would retain the old __fp16 semantics.
    – We’d need to change the compiler frontends (both Clang and GCC).
    – Existing code could rely on current __fp16 behaviour.

Keeping the semantics of __fp16 (approach B):

  • Pros:
    – People who want the old behaviour can use __fp16 directly.
    – We only need to change a typedef in a header file.
  • Cons:
    – Changing float16_t requires an ABI break.
    – Code that directly uses __fp16 would not benefit from the new float16_t optimisation.

Deciding between these approaches is difficult, as either one may leave some
people unhappy and it is hard to quantify this, which is why we welcome any
feedback, e.g. from users of __fp16. If, for example, the opinion is that
breaking the ABI is a last resort, then that would point in the direction of
Approach A and changing the semantics of __fp16.


Hi Sjoerd,

our downstream target also supports the __fp16 type as a pure storage type. The hardware offers instructions to convert between half-precision and single-precision, but no arithmetic.

Do you propose to eliminate the storage-only type altogether, or just to rename it (in your approach A)?


Hi —

Silently changing the semantics of __fp16 (Approach A) so that existing programs suddenly get different (and in many cases, worse) results seems quite problematic to me.

Approach B is _also_ deeply flawed, however, as it violates the ACLE documentation for float16_t:

If the __fp16 type is defined, float16_t is defined as an alias for it.

I have a slight preference for approach B, because it only affects people using float16_t, and not users of __fp16, and allows them the escape valve of switching to using __fp16 explicitly if they need the old behavior.

– Steve

Hello Steve and Konstantin,

Many thanks for your replies. This is exactly the kind of feedback I was hoping for.

I personally agree fully with all your comments, Steve. You’re also absolutely right, of course, that this violates the ACLE, so choosing Approach B implies changing the specification. With this support, I will start progressing the ACLE spec change, which I think is the first thing to do.



Thanks for the feedback. I think there are really two independent aspects to the proposal:

(1) Which 16-bit float types should change behaviour?
(2) When should they change behaviour?

For (1) the choices are:

(1a) Change float16_t and __fp16 [Approach A in Sjoerd’s email]
(1b) Change float16_t only [Approach B]

For (2) the choices can be divided up as:

(2a) Make the new behaviour opt-in
(2b) Make the new behaviour opt-out
(2c) Make the new behaviour unconditional

That gives 6 combinations in total. I think the feedback so far is that (1b) + (2c) is preferable to (1a) + (2c), which is just the kind of thing we were hoping for from this thread, thanks. In addition to that, do you have any thoughts about the other 4 combinations?

Personally I was wary of (1b) because, as you say, it changes a typedef provided by the ACLE. This is an ABI break, in the sense that it changes the mangling of a C++ function (F) that takes float16_t arguments. Definitions of F compiled before the change won’t link against uses of F compiled after the change, and definitions of F compiled after the change won’t link against uses of F compiled before the change.
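
The mangling point can be sketched in portable C++. The two structs below are hypothetical stand-ins for __fp16 and _Float16, used so that the example compiles on targets without those types, and old_float16_t / new_float16_t are made-up names for the typedef before and after the change. With the real types, as far as I know the Itanium C++ ABI mangles __fp16 as Dh and _Float16 as DF16_, so the symbol for F would change in the same way.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the two underlying types.
struct storage_only_half { unsigned short bits; };  // plays the role of __fp16
struct native_half { unsigned short bits; };        // plays the role of _Float16

typedef storage_only_half old_float16_t;  // the typedef before the change
typedef native_half new_float16_t;        // the typedef after the change

// A typedef is transparent to name mangling, so these are two different
// functions with two different symbols. They can coexist as overloads
// precisely because the linker treats them as unrelated.
int F(old_float16_t) { return 1; }  // what an object built before the change defines
int F(new_float16_t) { return 2; }  // what an object built after the change defines

static_assert(!std::is_same<void(old_float16_t), void(new_float16_t)>::value,
              "changing the typedef changes the function type and its mangled name");

// Each caller binds to the symbol matching the typedef it was compiled against.
void check_distinct_overloads() {
    assert(F(old_float16_t{0}) == 1);
    assert(F(new_float16_t{0}) == 2);
}
```

A caller compiled against the old typedef references the first symbol only, so it fails to link if the library it uses now provides only the second.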

From that point of view, (2a) seems like the most conservative option: existing binaries remain link-compatible with new binaries, and existing float16_t and __fp16 code keeps its current behaviour. Other targets that have adopted __fp16 remain unaffected.

(1a) + (2b) is less conservative: existing binaries remain link-compatible with new binaries, but projects that rely on the existing float16_t and __fp16 behaviour would need to be adjusted for newer compilers. Perhaps there’s a weak analogy here with bumping the default -std= option.

This is only a high-level classification. The opt-ins or opt-outs could be controlled in various ways. Possibilities include:

  • a new command-line option
  • a new pragma (for (1a))
  • a special macro that needs to be defined before including the ACLE header files (for (1b))

The defaults could depend on the target, if that seemed preferable.
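
As a rough illustration of the macro variant for (1b) + (2a), the sketch below shows how a header could select the typedef. Everything here is hypothetical: the macro name EXAMPLE_ACLE_NATIVE_FLOAT16, the typedef name example_float16_t and the two stand-in structs (which replace __fp16 and _Float16 so that the code compiles anywhere) are not part of the ACLE or of any compiler.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical opt-in macro: a translation unit would define it before
// including the ACLE headers. The name is made up for this sketch.
#define EXAMPLE_ACLE_NATIVE_FLOAT16 1

// ---- what the header could contain ----
struct storage_only_half { unsigned short bits; };  // stand-in for __fp16
struct native_half { unsigned short bits; };        // stand-in for _Float16

#if defined(EXAMPLE_ACLE_NATIVE_FLOAT16)
typedef native_half example_float16_t;        // new behaviour, requested explicitly
#else
typedef storage_only_half example_float16_t;  // existing behaviour stays the default
#endif
// ---- end of header sketch ----

// Holds only because the macro is defined above. Without the #define,
// example_float16_t falls back to the storage-only stand-in.
void check_opt_in() {
    assert((std::is_same<example_float16_t, native_half>::value));
}
```

An opt-out variant, (2b), would invert the test: the new typedef becomes the default and a macro restores the old one.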

When compiling a function F that uses float16_t (perhaps internally), the float16_t semantics effectively become a property of F’s definition. Mixing a definition of F with the old float16_t semantics and a definition of F with the new float16_t semantics would break the One Definition Rule. (2a) and (2b) make it easier to avoid ODR violations, whereas (2c) would effectively force the user to recompile the affected .o files.

Thanks, and sorry for the long email.