I think I've found a bug in the calling convention support for X86-64/Win64. It doesn't correctly save and restore the XMM registers in the function prolog/epilog. (The problem only exists on Win64, since Linux and Mac OS use a calling convention in which these registers are volatile and not callee-saved.)
X86RegisterInfo::getCalleeSavedRegs(), when called for a Win64 target, does return an array of registers which includes X86::XMM6 through X86::XMM15.
However, the prolog/epilog code does not seem to be able to handle saving these registers correctly.
Firstly, in PEI::CalculateCalleeSavedRegisters() in CodeGen/PrologEpilogInserter.cpp, the call to Fn.getRegInfo().isPhysRegUsed(Reg) always seems to return true for all of the XMM registers if the Function being emitted makes any function calls whatsoever, and so it tries to save all the callee-saved XMM registers even when none are actually used.
Further, the prolog/epilog emitter doesn't know how to correctly save and restore the XMM registers on the stack. If outputting assembly, it tries to emit "PUSH XMM6" and such; no such instruction exists, and this does not assemble. If JIT'ting, it incorrectly emits PUSH instructions for other registers which happen to share bit encodings with the XMM registers (XMM6 becomes ESI, etc.)
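To illustrate why the JIT output ends up pushing a general-purpose register, here is a minimal sketch of how a one-byte "PUSH r64" is encoded from a bare hardware register number. It is not LLVM's actual emitter; encodePushR64 and gprName are hypothetical names made up for this example. The point is only that the encoder sees a 4-bit number, so XMM6 (number 6) is indistinguishable from RSI/ESI (also number 6).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): encode "PUSH r64" given only the
// 4-bit hardware register number, which is all the encoder gets to see.
std::vector<std::uint8_t> encodePushR64(unsigned hwRegNum) {
    assert(hwRegNum < 16 && "x86-64 register numbers are 4 bits wide");
    std::vector<std::uint8_t> bytes;
    if (hwRegNum >= 8)
        bytes.push_back(0x41);  // REX.B prefix selects R8..R15
    // Opcode 50+rd: the low 3 bits of the register number go into the opcode.
    bytes.push_back(static_cast<std::uint8_t>(0x50 + (hwRegNum & 7)));
    return bytes;
}

// Hypothetical helper: the 64-bit general-purpose register that a given
// hardware register number denotes in a PUSH instruction.
std::string gprName(unsigned hwRegNum) {
    static const char *const names[16] = {
        "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8",  "r9",  "r10", "r11", "r12", "r13", "r14", "r15"};
    assert(hwRegNum < 16 && "x86-64 register numbers are 4 bits wide");
    return names[hwRegNum];
}
```

Passing 6 (the number of XMM6) yields the single byte 0x56, which the CPU decodes as a push of RSI; passing 8 (XMM8) yields 0x41 0x50, a push of R8.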
Since there is no PUSH XMM* instruction, what needs to be done is to adjust the stack pointer directly and then use MOVAPS to write/read directly to the stack.
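Roughly, the sequence I have in mind looks like the sketch below. It only builds the assembly text (Intel syntax) for a given list of XMM register numbers; makeXmmSpillCode and SpillCode are hypothetical names for this example, not anything in LLVM. It also assumes RSP is 16-byte aligned at the point where the SUB is emitted, since MOVAPS faults on an unaligned memory operand; without that guarantee the slots would need extra alignment padding or MOVUPS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the generated instruction text.
struct SpillCode {
    std::vector<std::string> prolog;
    std::vector<std::string> epilog;
};

// Hypothetical helper (not an LLVM API): build prolog/epilog text that saves
// and restores the given XMM registers in 16-byte stack slots.
// Assumption: RSP is 16-byte aligned where the SUB is emitted.
SpillCode makeXmmSpillCode(const std::vector<unsigned> &xmmRegs) {
    SpillCode code;
    if (xmmRegs.empty())
        return code;  // nothing to save, so no stack adjustment either

    const std::size_t frameBytes = 16 * xmmRegs.size();
    code.prolog.push_back("sub rsp, " + std::to_string(frameBytes));

    for (std::size_t i = 0; i < xmmRegs.size(); ++i) {
        assert(xmmRegs[i] < 16 && "x86-64 has XMM0..XMM15");
        const std::string slot = "[rsp + " + std::to_string(16 * i) + "]";
        const std::string reg = "xmm" + std::to_string(xmmRegs[i]);
        code.prolog.push_back("movaps " + slot + ", " + reg);  // save
        code.epilog.push_back("movaps " + reg + ", " + slot);  // restore
    }

    // Release the save area only after every register has been reloaded.
    code.epilog.push_back("add rsp, " + std::to_string(frameBytes));
    return code;
}
```

For a function that uses only XMM6 and XMM7, makeXmmSpillCode({6, 7}) reserves 32 bytes, stores the two registers at offsets 0 and 16, and the epilog reloads them before adding 32 back to RSP.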
Is this already a known issue? Other than adding a custom calling convention, all of my experience with LLVM has been as a client, so I'm not sure how to proceed in fixing this. If anyone could provide pointers, I'd appreciate it.