Hi all,
I’m trying to implement floating-point ‘min’ and ‘max’ operations using select. For ‘min’ I get the expected x86 minss instruction, but for ‘max’ I get a branch instead of maxss.
The equivalent C code looks like this:
float z = (x > y) ? x : y;
Any clues?
Your code is not safe for NaNs. This is the correct way to write maxss in C:
float max(float x, float y) {
  return !(x < y) ? x : y;
}
If you don’t care about NaNs, you can pass -ffast-math to llvm-gcc, or set “UnsafeFPMath=true” (declared in <llvm/Target/TargetOptions.h>).
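To make the NaN difference concrete, here is a minimal sketch, assuming the default floating-point mode (no -ffast-math). The names max_gt and max_not_lt are made up for this illustration and are not part of LLVM:

```c
#include <math.h>   /* NAN, isnan */

/* The form from the question: an ordered "greater than" test.
   If either operand is NaN the comparison is false, so y is returned. */
float max_gt(float x, float y) {
  return (x > y) ? x : y;
}

/* The form from the reply: a negated "less than" test.
   If either operand is NaN, (x < y) is false, so x is returned. */
float max_not_lt(float x, float y) {
  return !(x < y) ? x : y;
}
```

When one operand is NaN, max_gt falls through to y and max_not_lt falls through to x. For example, max_gt(NAN, 1.0f) yields 1.0f while max_not_lt(NAN, 1.0f) yields NaN.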
Could someone explain the basics of LLVM’s target-specific optimizations and code generation to me? I’d love to analyze things like this myself, but I don’t know where to start.
This particular case boils down to the semantics of maxss and of the LLVM IR instructions. For example, this code:
float not_max(float x, float y) {
  return (x > y) ? x : y;
}
float really_max(float x, float y) {
  return !(x < y) ? x : y;
}
compiles into this LLVM IR (llvm-gcc t.c -S -o - -O -emit-llvm):
define float @not_max(float %x, float %y) nounwind {
entry:
  %tmp3 = fcmp ogt float %x, %y ; [#uses=1]
  %iftmp.0.0 = select i1 %tmp3, float %x, float %y ; [#uses=1]
  ret float %iftmp.0.0
}
define float @really_max(float %x, float %y) nounwind {
entry:
  %tmp3 = fcmp uge float %x, %y ; [#uses=1]
  %iftmp.1.0 = select i1 %tmp3, float %x, float %y ; [#uses=1]
  ret float %iftmp.1.0
}
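The two functions differ only in the fcmp predicate: ogt is "ordered and greater than", uge is "unordered or greater than or equal". As a rough sketch of what those predicates mean, they can be modeled in C99 as follows. The helper names fcmp_ogt and fcmp_uge are hypothetical and simply mirror the IR spelling; isunordered comes from <math.h>:

```c
#include <math.h>     /* isunordered, NAN */
#include <stdbool.h>

/* Model of "fcmp ogt": true only if neither operand is NaN and x > y. */
bool fcmp_ogt(float x, float y) {
  return !isunordered(x, y) && x > y;
}

/* Model of "fcmp uge": true if either operand is NaN, or if x >= y. */
bool fcmp_uge(float x, float y) {
  return isunordered(x, y) || x >= y;
}
```

With a NaN operand, fcmp_ogt is false (so the select picks %y) and fcmp_uge is true (so the select picks %x), which is why the two selects are not interchangeable unless NaNs are ruled out.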
If you’re interested in x86-specific optimizations that still need to be done, take a look at lib/Target/X86/README*.txt.
-Chris