Hi Guys,
I was working on some floating-point-intensive benchmarks and realized that the floating point register allocation in LLVM assumes that there are only 7 floating point registers on X86, whereas the hardware has 8.
See line 266 of X86FloatingPoint.cpp:
assert(Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!");
Is there any reason for only counting from 0 to 6, when there are actually 8 in hardware?
Is there an assumption somewhere else that I am missing?
Thanks and Regards
Aparna Kotha
Graduate Student
University of Maryland, College Park
Is there any reason for only counting from 0 to 6, when there are actually 8 in hardware?
It has to do with the weird tricks that are needed to generate code for a stack machine.
Is there an assumption somewhere else that I am missing?
Yes, the default CPU on Linux is i386, which doesn't have SSE support.
Use SSE if you care about floating point performance. I think -mcpu=... is all you need.
/jakob
Right. But there are 8 registers on the floating point stack, ST0 to ST7, and I think LLVM is only using ST0 to ST6 in some code fragments. Could this be because of the assumption that the X86::FP registers run from X86::FP0 to X86::FP6?
–Aparna
Yes. My guess is that the code converting from FP to ST registers sometimes needs the extra stack slot.
/jakob
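
To make the guess above concrete, here is a toy sketch of why a stackifier might want one spare slot. It is not LLVM's code: X87StackModel, duplicateToTop and canDuplicateWithLiveRegs are hypothetical names made up for illustration, and the assumption that the conversion sometimes has to copy a live value onto the top of the stack is only that, an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical toy model of the x87 register stack (not LLVM's implementation).
// The hardware stack has 8 slots, addressed as ST0..ST7 relative to the top.
struct X87StackModel {
    static constexpr std::size_t kPhysicalSlots = 8;

    // slots[0] is the bottom of the stack; slots[depth - 1] is ST0.
    std::array<int, kPhysicalSlots> slots{};
    std::size_t depth = 0;

    // Push a pseudo register number; fails if the hardware stack is full.
    bool push(int pseudoReg) {
        if (depth == kPhysicalSlots)
            return false; // a real x87 would signal a stack overflow here
        slots[depth++] = pseudoReg;
        return true;
    }

    // Return i such that the pseudo register currently lives in ST(i),
    // or -1 if it is not on the stack.
    int stIndexOf(int pseudoReg) const {
        for (std::size_t i = 0; i < depth; ++i)
            if (slots[i] == pseudoReg)
                return static_cast<int>(depth - 1 - i);
        return -1;
    }

    // Assumed conversion step: copy a live value onto the top of the stack
    // under a new name. This needs one free slot.
    bool duplicateToTop(int srcReg, int dstReg) {
        if (stIndexOf(srcReg) < 0)
            return false;
        return push(dstReg);
    }
};

// With `live` pseudo registers on the stack, can we still duplicate one?
bool canDuplicateWithLiveRegs(std::size_t live) {
    X87StackModel st;
    if (live == 0)
        return false;
    for (std::size_t r = 0; r < live; ++r)
        if (!st.push(static_cast<int>(r)))
            return false;
    return st.duplicateToTop(0, static_cast<int>(live));
}
```

In this model, seven live pseudo registers (FP0 to FP6) still leave room for the copy, while eight do not. That is consistent with the guess, but it does not show what X86FloatingPoint.cpp actually does.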
If we wanted an unsafe register allocator that allocates FP7 in addition to FP0 through FP6, where in the code should we add this?
–Aparna