__fp16 support in the LLVM back-end

Hi, all:

I am trying to test half-precision floating-point support in LLVM. I found that clang can generate bitcode for __fp16, but llc cannot generate code for it. The error message is:

LLVM ERROR: Cannot select: 0x26a68e0: i16 = fp32_to_fp16 0x26a67d8 [ORD=2] [ID=4]
  0x26a67d8: f32,ch = CopyFromReg 0x2693060, 0x26a66d0 [ORD=2] [ID=3]
    0x26a66d0: f32 = Register %vreg1 [ID=1]
In function: test

Does anyone know what the problem is? I suspect the intrinsic "llvm.convert.to.fp16" is not fully implemented.

Attaching my test:

#test.c clang -cc1 -O0 test.c -emit-llvm -o test.bc
typedef __fp16 half;
void test()
{
    half x = 0.1f;
    x += 2.0f;
    half y = x + x;
}

The generated bitcode, which I then pass to llc with "llc -O0 test.bc":

; ModuleID = 'test.c'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind
define void @test() #0 {
  %x = alloca i16, align 2
  %y = alloca i16, align 2
  %1 = call i16 @llvm.convert.to.fp16(float 0x3FB99999A0000000)
  store i16 %1, i16* %x, align 2
  %2 = load i16* %x, align 2
  %3 = call float @llvm.convert.from.fp16(i16 %2)
  %4 = fadd float %3, 2.000000e+00
  %5 = call i16 @llvm.convert.to.fp16(float %4)
  store i16 %5, i16* %x, align 2
  %6 = load i16* %x, align 2
  %7 = call float @llvm.convert.from.fp16(i16 %6)
  %8 = load i16* %x, align 2
  %9 = call float @llvm.convert.from.fp16(i16 %8)
  %10 = fadd float %7, %9
  %11 = call i16 @llvm.convert.to.fp16(float %10)
  store i16 %11, i16* %y, align 2
  ret void
}

; Function Attrs: nounwind readnone
declare i16 @llvm.convert.to.fp16(float) #1

; Function Attrs: nounwind readnone
declare float @llvm.convert.from.fp16(i16) #1

attributes #0 = { nounwind "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-realign-stack" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind readnone }

!llvm.ident = !{!0}

!0 = metadata !{metadata !"clang version 3.5.0 (trunk 211249)"}

Thanks
Wan Xiaofei

Hi Wan,

Unfortunately, on x86 we don't have ISel tablegen patterns to select
the 'fp32_to_fp16' and 'fp16_to_fp32' DAG nodes.
I already have a patch that I plan to send to the mailing list
(hopefully by the end of today). It implements the missing support
for those DAG nodes and allows selecting the vcvtph2ps/vcvtps2ph
instructions if the target supports F16C (I hope this is your case :-) ).

If your target doesn't support F16C then things are more complicated.
We definitely shouldn't crash with a fatal error in the backend; we
should instead emit calls to the compiler runtime.
The compiler knows about libcalls like '__gnu_f2h_ieee' and
'__gnu_h2f_ieee'. However, I am not sure whether those are really
provided by our compiler runtime...

-Andrea

Thanks, Andrea.

I notice that ARM currently uses the libcalls '__gnu_f2h_ieee' and '__gnu_h2f_ieee' for 'fp32_to_fp16' and 'fp16_to_fp32'. Are these two libcalls in glibc? I can't find them in the C/C++ runtime libraries.
I just want to test fp16 support on common x86 platforms, so I am not sure whether my target supports F16C; as far as I know, F16C is not supported on common x86 platforms.

Is it possible to avoid calling compiler intrinsics for 'fp32_to_fp16' and 'fp16_to_fp32' when generating bitcode, and call external functions instead? Any runtime that wants to support fp16 would then implement these two conversion functions.

Thanks
Wan Xiaofei
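
For anyone unsure whether a given x86 machine has F16C, a minimal sketch of a runtime check follows. It reads CPUID leaf 1 and tests ECX bit 29, which is the documented F16C feature bit. The helper name sketch_has_f16c is hypothetical (it is not from this thread or from LLVM), and the sketch assumes a GCC- or clang-style <cpuid.h>. It only reports the CPU feature bit; it does not check that the OS has enabled AVX state, which real code using VEX-encoded instructions would also want to verify.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/clang helper header for the CPUID instruction */
#endif

/* Hypothetical helper: returns true if the CPU advertises F16C
 * (vcvtph2ps / vcvtps2ph). Always returns false on non-x86 builds. */
bool sketch_has_f16c(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid returns 0 if the requested leaf is not available. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;

    /* CPUID.01H:ECX bit 29 is the F16C feature flag. */
    return ((ecx >> 29) & 1u) != 0;
#else
    return false;
#endif
}
```

If the check returns true, code generation for F16C still has to be requested explicitly, for example through llc's -mattr option with +f16c, once the back-end has patterns for the conversion nodes.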

> I notice that currently for ARM, it use libcall '__gnu_f2h_ieee' and '__gnu_h2f_ieee' for 'fp32_to_fp16' and 'fp16_to_fp32'. Are these two libcalls in glibc? I don't find them in c/c++ runtime libraries.
> I just want to test fp16 support on common x86 platforms so I am not sure whether it supports F16C, based on my knowledge F16C is not supported on common x86 platform.

As far as I know, '__gnu_f2h_ieee' and '__gnu_h2f_ieee' are not
available (at least on x86). Also, F16C is not supported on generic
x86 platforms.

> It is possible not to call compiler intrinsics to do 'fp32_to_fp16' and 'fp16_to_fp32' when generating BC? Call external functions instead, any runtime which wants to support fp16 should implement these two conversion functions.

It would be nice to have something like '__gnu_f2h_ieee' and
'__gnu_h2f_ieee' available on x86 as well. That would make it possible
to lower the intrinsic calls for half-float to float conversion (and
vice versa) into library calls. I think it is ok to have compiler
intrinsics for half-float conversions. The problem, in my opinion, is
that the backend should emit calls to the runtime library when there
is no hardware support for half/float conversions.

-Andrea
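
To make the runtime-library idea concrete, here is a minimal sketch of what software float/half conversion routines could look like in portable C. The names sketch_f2h and sketch_h2f are hypothetical stand-ins for the libcalls discussed above; this is not the libgcc or compiler-rt implementation. It assumes IEEE 754 binary32 floats, the IEEE binary16 layout (1 sign, 5 exponent, 10 mantissa bits), and round-to-nearest-even on narrowing; NaN payloads are not preserved.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical float -> half conversion (round to nearest, ties to even). */
uint16_t sketch_f2h(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);                     /* reinterpret the float bits */
    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t mag = x & 0x7FFFFFFFu;               /* magnitude without sign */

    if (mag > 0x7F800000u)  return (uint16_t)(sign | 0x7E00u); /* NaN -> quiet NaN */
    if (mag >= 0x477FF000u) return (uint16_t)(sign | 0x7C00u); /* >= 65520 or Inf -> Inf */
    if (mag < 0x33000000u)  return sign;          /* below 2^-25 -> signed zero */

    uint32_t exp = mag >> 23;                     /* biased float exponent */
    uint32_t man = mag & 0x007FFFFFu;
    uint32_t half, rem, halfway;

    if (exp >= 113) {                             /* result is a normal half */
        half = ((exp - 112u) << 10) | (man >> 13);/* rebias 127 -> 15 */
        rem = man & 0x1FFFu;                      /* the 13 discarded bits */
        halfway = 0x1000u;
    } else {                                      /* result is a subnormal half */
        uint32_t shift = 126u - exp;              /* 14..24 here */
        man |= 0x00800000u;                       /* make the implicit 1 explicit */
        half = man >> shift;
        rem = man & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1u);
    }
    if (rem > halfway || (rem == halfway && (half & 1u)))
        half++;                                   /* a carry into the exponent is fine */
    return (uint16_t)(sign | half);
}

/* Hypothetical half -> float conversion (always exact). */
float sketch_h2f(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t bits;

    if (exp == 0x1Fu) {                           /* Inf or NaN */
        bits = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {                        /* normal number */
        bits = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man != 0) {                        /* subnormal: normalize it */
        uint32_t e = 113;
        do { man <<= 1; e--; } while ((man & 0x400u) == 0);
        bits = sign | (e << 23) | ((man & 0x3FFu) << 13);
    } else {                                      /* signed zero */
        bits = sign;
    }
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With this sketch, the 0.1f constant from the test case converts to the half bit pattern 0x2E66, and sketch_h2f(0x3C00) gives 1.0f. A back-end lowering to library calls would only need routines with this shape to be present in whatever runtime the program links against.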