
Setting excess_precision_type to FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16, so

that the result is rounded after each operation, would keep the

semantics right.

And I'll document the difference in exception behavior between

soft-fp and the AVX512FP16 instructions.

I got some feedback from a colleague who's working on _Float16

support for LLVM. The LLVM side wants to set

FLT_EVAL_METHOD_PROMOTE_TO_FLOAT for soft-fp so that the generated

code can be more efficient.

i.e.

_Float16 a, b, c, d;
d = a + b + c;

would be transformed to

float tmp, tmp1, a1, b1, c1;
a1 = (float) a;
b1 = (float) b;
c1 = (float) c;
tmp = a1 + b1;
tmp1 = tmp + c1;
d = (_Float16) tmp1;

so there's only one truncation, at the end.

If users want rounding back to _Float16 after every operation, the

code should be written explicitly as

_Float16 a, b, c, d, e;
e = a + b;
d = e + c;

That's what Clang does; quoting from [1]:

_Float16 arithmetic will be performed using native half-precision

support when available on the target (e.g. on ARMv8.2a); otherwise it

will be performed at a higher precision (currently always float) and

then truncated down to _Float16. Note that C and C++ allow

intermediate floating-point operands of an expression to be computed

with greater precision than is expressible in their type, so Clang may

avoid intermediate truncations in certain cases; this may lead to

results that are inconsistent with native arithmetic.

and so does arm gcc; quoting from arm.c:

/* We can calculate either in 16-bit range and precision or
   32-bit range and precision.  Make that decision based on whether
   we have native support for the ARMv8.2-A 16-bit floating-point
   instructions or not.  */
return (TARGET_VFP_FP16INST
        ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
        : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);