Help: "LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16]"

Hi all,
I am new to LLVM and need help. Thank you, everyone!

I want to implement the NEON instruction vcvtt.f16.f32 in LLVM IR. This instruction converts a single-precision value to f16 and writes the result to the top 16 bits of the destination register. I used the intrinsic llvm.convert.to.fp16, but llc fails with the following error:

LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16]
0x9fc0750: f32,ch = load 0x3aafd68, 0x9fc2a20, 0x9feaab0<LD4[%sunkaddr85033]> [ORD=125117] [ID=15]
0x9fc2a20: i32 = add 0x9fed880, 0x9fd9ea0 [ORD=125115] [ID=13]
0x9fed880: i32,ch = CopyFromReg 0x3aafd68, 0x9fbea90 [ORD=125114] [ID=9]
0x9fbea90: i32 = Register %vreg13999 [ORD=125114] [ID=1]
0x9fd9ea0: i32 = Constant<80> [ORD=125115] [ID=2]
0x9feaab0: i32 = undef [ORD=125117] [ID=4]
In function: internal_function_69
Command exited with non-zero status 1

If I change the method and use "%1 = fptrunc float %0 to half" followed by "%2 = bitcast half %1 to i16", I hit a similar problem. The log follows:

LLVM ERROR: Cannot select: 0x9f554b0: ch = store 0x9d0f28c, 0x9f5d900, 0x9f54ba8, 0x9f54b20<ST2FixedStack0, trunc to f16> [ID=52]
0x9f5d900: f32,ch = load 0x9f5e290, 0x9f5dd40, 0x9f54b20<LD4[%sunkaddr69]> [ORD=1810] [ID=51]
0x9f5dd40: i32 = add 0x9f55318, 0x9f5e0f8 [ORD=1808] [ID=31]
0x9f55318: i32,ch = CopyFromReg 0x9d0f28c, 0x9f6a3a0 [ORD=1796] [ID=26]
0x9f6a3a0: i32 = Register %vreg32 [ORD=1796] [ID=1]
0x9f5e0f8: i32 = Constant<64> [ORD=1808] [ID=17]
0x9f54b20: i32 = undef [ORD=1797] [ID=6]
0x9f54ba8: i32 = FrameIndex<0> [ID=24]
0x9f54b20: i32 = undef [ORD=1797] [ID=6]
In function: testVCVTT32TO16Function

Can anyone help me? Thank you again.

Hi,

Can you show us the command line you are using? At the least, can you tell us which backend you tried? If you can upload the test case as well, it will be very useful for tracking down the problem.

Regards,
Kevin

Thank you, Kevin!
I am using fptrunc and bitcast to implement NEON vcvtt. (I am sure that "fptrunc double %tmp to float" works, but "fptrunc float %tmp to half" fails.) My target platform is MIPS. The code is as follows:

NEON:
vcvtt.f16.f32 s2, s0

llvm Code:

%Vt_2 = load float* %VFP_s0, align 4
%Vt3_1 = fptrunc float %Vt_2 to half
%Vt4_1 = bitcast half %Vt3_1 to i16
%Vt2_2 = bitcast float* %VFP_s2 to <2 x i16>*
%Vrti_1 = load <2 x i16>* %Vt2_2, align 4
%Vrti_2 = insertelement <2 x i16> %Vrti_1, i16 %Vt4_1, i32 1
%Vt2_3 = bitcast float* %VFP_s2 to <2 x i16>*
store <2 x i16> %Vrti_2, <2 x i16>* %Vt2_3, align 4

Error Log:

LLVM ERROR: Cannot select: 0x9f554b0: ch = store 0x9d0f28c, 0x9f5d900, 0x9f54ba8, 0x9f54b20<ST2FixedStack0, trunc to f16> [ID=52]
0x9f5d900: f32,ch = load 0x9f5e290, 0x9f5dd40, 0x9f54b20<LD4[%sunkaddr69]> [ORD=1810] [ID=51]
0x9f5dd40: i32 = add 0x9f55318, 0x9f5e0f8 [ORD=1808] [ID=31]
0x9f55318: i32,ch = CopyFromReg 0x9d0f28c, 0x9f6a3a0 [ORD=1796] [ID=26]
0x9f6a3a0: i32 = Register %vreg32 [ORD=1796] [ID=1]
0x9f5e0f8: i32 = Constant<64> [ORD=1808] [ID=17]
0x9f54b20: i32 = undef [ORD=1797] [ID=6]
0x9f54ba8: i32 = FrameIndex<0> [ID=24]
0x9f54b20: i32 = undef [ORD=1797] [ID=6]
In function: testVCVTT32TO16Function
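
For reference, here is a minimal C sketch of the register update that the IR above appears to model: the converted half-precision bit pattern replaces the top 16 bits of the 32-bit destination while the bottom 16 bits are kept. The helper name vcvtt_insert_top is hypothetical and the float-to-half conversion itself is left out. Note that element 1 of a <2 x i16> in memory coincides with the top half of the 32-bit word only on a little-endian target; on a big-endian MIPS target the element index would presumably have to be 0.

```c
#include <stdint.h>

/* Hypothetical helper (not part of LLVM or any runtime): models only the
 * "write to the top half" part of vcvtt.f16.f32.
 *   sd : current 32-bit contents of the destination register
 *   h  : half-precision bit pattern produced by the conversion
 * Returns the new 32-bit contents of the destination register. */
static uint32_t vcvtt_insert_top(uint32_t sd, uint16_t h) {
    /* Keep bits [15:0], replace bits [31:16]. */
    return (sd & 0x0000FFFFu) | ((uint32_t)h << 16);
}
```

For example, inserting 0x3C00 (1.0 in half precision) into a register holding 0x12345678 leaves 0x3C005678.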

AFAIK, MIPS doesn't support NEON. But that doesn't explain your
problem, which is probably due to hand-crafted IR.

Can you provide the full IR, with header and everything? Also the
command line that you're using with the LLVM tool you're using to
compile (llc? lli?).

It'd also be good to know how you generated that IR in the first
place. Was it a tool? A front-end? Hand-crafted?

cheers,
--renato

Hi,

NEON is an ARM feature and is therefore not supported by MIPS so I assume you are trying to achieve the same effect. As far as I know, the MIPS backend doesn’t support half-precision floating point at the moment.

There is limited support for the <8 x f16> type when MSA (MIPS SIMD Architecture) is enabled but even then scalar half-precision is not currently supported.

I think that support for the half type is only implemented for ARM. Last I tried to use it, I found that none of it works even on x86, and the current handling of the half conversion SDNodes seems to rely on ARM-specific assumptions.

Have you tried using the @llvm.convert.to/from.fp16 intrinsics instead?

Not sure if this can help, but
if you really really want to have minimal half float support on Mips,
then one thing you could try to do is to hack MipsISelLowering.cpp
adding rules to expand float-half conversion SDNodes into library
calls.

+ setOperationAction(ISD::FP16_TO_FP32, MVT::f32, Expand);
+ setOperationAction(ISD::FP32_TO_FP16, MVT::i32, Expand);

(The MVT::i32 on the second rule is required because type i16 is
promoted to i32).

If you then convert every occurrence of 'fptrunc' from float to half
with calls to @llvm.convert.to.fp16, then you should be able to
compile (hopefully) with no errors.
That means, in your original example you would convert the following
IR statement:
  %Vt3_1 = fptrunc float %Vt_2 to half
into
  %Vt3_1 = call i16 @llvm.convert.to.fp16(float %Vt_2)

The downside is that you will have to add definitions for
'__gnu_f2h_ieee' and '__gnu_h2f_ieee' in the compiler runtime. That is
because the backend will expand all the float-half conversions into
library calls...
This workaround should work assuming that a) you can hack the backend,
and b) it is acceptable (i.e. a reasonable solution in your case) to
have a library call for every float-half conversion in your code.
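
A minimal sketch of what the narrowing helper could look like in C, assuming the runtime is expected to provide `uint16_t __gnu_f2h_ieee(float)` (this signature is an assumption and should be checked against the calling convention the backend actually emits). The core routine f32_to_f16_bits is a hypothetical name; it rounds to nearest, ties to even, and handles zeros, subnormals, infinities and NaNs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical core routine: IEEE binary32 -> binary16 bit pattern,
 * round to nearest, ties to even. */
static uint16_t f32_to_f16_bits(float f) {
    uint32_t x;
    memcpy(&x, &f, sizeof x);                 /* reinterpret the float bits */

    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t exp  = (x >> 23) & 0xFFu;
    uint32_t man  = x & 0x007FFFFFu;

    if (exp == 0xFFu) {                       /* infinity or NaN */
        if (man == 0)
            return (uint16_t)(sign | 0x7C00u);
        /* Quiet NaN, keeping the top payload bits. */
        return (uint16_t)(sign | 0x7E00u | (man >> 13));
    }

    int32_t e = (int32_t)exp - 127 + 15;      /* rebias the exponent */

    if (e >= 31)                              /* too large: infinity */
        return (uint16_t)(sign | 0x7C00u);

    if (e <= 0) {                             /* half subnormal or zero */
        if (e < -10)
            return (uint16_t)sign;            /* rounds to signed zero */
        man |= 0x00800000u;                   /* make the implicit 1 explicit */
        uint32_t shift   = (uint32_t)(14 - e);      /* 14..24 */
        uint32_t hman    = man >> shift;
        uint32_t rem     = man & ((1u << shift) - 1u);
        uint32_t halfway = 1u << (shift - 1);
        if (rem > halfway || (rem == halfway && (hman & 1u)))
            hman++;                           /* may carry into the smallest normal */
        return (uint16_t)(sign | hman);
    }

    uint32_t h   = ((uint32_t)e << 10) | (man >> 13);
    uint32_t rem = man & 0x1FFFu;
    if (rem > 0x1000u || (rem == 0x1000u && (h & 1u)))
        h++;                                  /* a carry into the exponent is intended */
    return (uint16_t)(sign | h);
}

/* Assumed runtime entry point for the Expand action on FP32_TO_FP16. */
uint16_t __gnu_f2h_ieee(float a) {
    return f32_to_f16_bits(a);
}
```

With this sketch, 1.0f maps to 0x3C00 and 65504.0f (the largest finite half value) maps to 0x7BFF.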

I think that support for the half type is only implemented for ARM. Last I
tried to use it, I found that none of it works even on x86, and the current
handling of the half conversion SDNodes seem to rely on ARM specific
assumptions

Just for the record,
since revision 212293 (committed only five days ago), the x86 backend
supports float half conversions.
On x86, if the target has F16C, there are ISel patterns to map
float-half conversions to specific instructions. If there is no F16C
support, then the backend expands float-half conversions into runtime
library calls.

Cheers,
Andrea

Hi Renato,
Thank you for your reply.
Yes, we are building a tool that generates LLVM IR for MIPS from ARM assembly. The IR is hand-crafted. We want to implement the NEON instruction "vcvtt.f16.f32" in LLVM.
Yes, MIPS does not support NEON, but I do not think the problem is tied to the MIPS platform. I think it lies in llc itself, in the conversion between half and float. What I don't understand is this: if LLVM supports the half type, how do I convert a float to a half?

Robin Lau

The documentation for MSA can be found at http://www.imgtec.com/mips/architectures/simd.asp. MSA was added to the architecture fairly recently and the P5600 (http://www.imgtec.com/mips/warrior/pclass.asp) is the first core to support it.

For the implementation, search for ‘addMSAFloatType(MVT::v8f16, &Mips::MSA128HRegClass);’ in lib/Target/Mips/MipsSEISelLowering.cpp. Most operations are expanded but it supports ISD::LOAD, ISD::STORE, ISD::BITCAST, ISD::EXTRACT_VECTOR_ELT, ISD::INSERT_VECTOR_ELT, ISD::BUILD_VECTOR, and a couple of intrinsics. In MipsMSAInstrInfo.td, you can also search for FEXDO_H, FEXUPL_W, and FEXUPR_W, which are the only operations that use v8f16. There is no reference to the ‘f16’ type in the Mips backend, so scalars are not implemented.

Hi Matt,
Thank you for your reply. I tried the @llvm.convert.to/from.fp16 intrinsics before and hit the same "LLVM ERROR: Cannot select: fp32_to_fp16" problem; perhaps these intrinsics call fptrunc/fpext internally. The detailed log follows:

LLVM ERROR: Cannot select: 0x9fc9680: i32 = fp32_to_fp16 0x9fc0750 [ID=16]
0x9fc0750: f32,ch = load 0x3aafd68, 0x9fc2a20, 0x9feaab0<LD4[%sunkaddr85033]> [ORD=125117] [ID=15]
0x9fc2a20: i32 = add 0x9fed880, 0x9fd9ea0 [ORD=125115] [ID=13]
0x9fed880: i32,ch = CopyFromReg 0x3aafd68, 0x9fbea90 [ORD=125114] [ID=9]
0x9fbea90: i32 = Register %vreg13999 [ORD=125114] [ID=1]
0x9fd9ea0: i32 = Constant<80> [ORD=125115] [ID=2]
0x9feaab0: i32 = undef [ORD=125117] [ID=4]
In function: internal_function_69
Command exited with non-zero status 1

This was fixed by Andrea’s patch about one week ago.

Hi Andrea,
Thank you for your reply.
I did as your message suggested and added the following two lines to MipsISelLowering.cpp. As you said, @llvm.convert.to.fp16 now compiles successfully. However, the result at runtime is not correct.

+ setOperationAction(ISD::FP16_TO_FP32, MVT::f32, Expand);
+ setOperationAction(ISD::FP32_TO_FP16, MVT::i32, Expand);

Robin

You may need to implement these two functions in your runtime environment:

‘__gnu_f2h_ieee’ and ‘__gnu_h2f_ieee’
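
The widening direction is the simpler of the two because every half value is exactly representable as a float. A minimal C sketch follows, assuming the runtime is expected to provide `float __gnu_h2f_ieee(uint16_t)` (the signature is an assumption; the core routine f16_bits_to_f32 is a hypothetical name).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical core routine: IEEE binary16 bit pattern -> binary32.
 * The conversion is exact, so no rounding is needed. */
static float f16_bits_to_f32(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x03FFu;
    uint32_t x;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, payload moved up. */
        x = sign | 0x7F800000u | (man << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127. */
        x = sign | ((exp + 112u) << 23) | (man << 13);
    } else if (man == 0) {
        /* Signed zero. */
        x = sign;
    } else {
        /* Half subnormal: normalise until the implicit 1 appears at bit 10. */
        uint32_t e = 113u;
        while ((man & 0x0400u) == 0) {
            man <<= 1;
            e--;
        }
        x = sign | (e << 23) | ((man & 0x03FFu) << 13);
    }

    float f;
    memcpy(&f, &x, sizeof f);                 /* reinterpret the bits as a float */
    return f;
}

/* Assumed runtime entry point for the Expand action on FP16_TO_FP32. */
float __gnu_h2f_ieee(uint16_t a) {
    return f16_bits_to_f32(a);
}
```

With this sketch, 0x3C00 converts to 1.0f and 0x7BFF converts to 65504.0f.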
