After years of trying and waiting, I still cannot get Clang to generate efficient code for float->int conversion on ARMv7, with either explicit round-to-nearest or the current ('ambient') rounding mode:
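std::int32_t round( float const floatingPointValue ) {
#if defined( A ) // A, B and C are placeholder macro names selecting the variant
    return __builtin_lrintf( floatingPointValue );
#elif defined( B )
    std::int32_t integerValue;
    __asm__( "vcvtr.s32.f32 %0, %1" : "=w"( integerValue ) : "w"( floatingPointValue ) );
    return integerValue;
#elif defined( C )
    return __builtin_arm_vcvtr_f( floatingPointValue, 0 );
#else // fallback
    return floatingPointValue + __builtin_copysignf( 0.5f, floatingPointValue );
#endif
}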
A) is no good because https://llvm.org/bugs/show_bug.cgi?id=11544 ("Trivial math builtins not inlined") is still open
B) crashes with an assertion (why on earth is Clang distributed with assertions turned on?):
"error: couldn't allocate output register for constraint 'w'"
C) crashes with:
"fatal error: error in backend: Cannot select: intrinsic %llvm.arm.vcvtr"
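For reference, the fallback and variant A do not compute the same thing, which is part of why the fallback is not a real substitute. The portable sketch below (the function names round_half_away and round_ambient are made up for illustration, and it uses the standard <cmath> functions rather than the builtins) shows the difference: the fallback rounds halfway cases away from zero, while lrint follows the current rounding mode, which is round-half-to-even by default.

```cpp
#include <cmath>
#include <cstdint>

// Fallback from the snippet above: add +/-0.5, then let the float -> int
// conversion truncate toward zero. Net effect: round half away from zero.
// (Hypothetical name, for illustration only.)
std::int32_t round_half_away( float const value )
{
    return static_cast<std::int32_t>( value + std::copysign( 0.5f, value ) );
}

// Counterpart of variant A: round according to the current floating-point
// rounding mode (round-half-to-even unless changed via std::fesetround).
// (Hypothetical name, for illustration only.)
std::int32_t round_ambient( float const value )
{
    return static_cast<std::int32_t>( std::lrint( value ) );
}
```

With the default rounding mode, the two agree everywhere except on exact halfway values such as 2.5f or -2.5f.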
Tested with Clang 3.6 from Android NDK r10e and Apple Clang from Xcode 7.2.1 (the latest release of each).
Is there anything I can do to make Clang emit the vcvtr instruction?
P.S. I stumbled on __builtin_arm_vcvtr_f by pure chance; it isn't documented anywhere, least of all its second parameter.