[x86 codegen] 3DNow! intrinsics not behaving as expected.

I finally got all of the 3DNow! instruction intrinsics and builtins
into LLVM and Clang. However, while testing them, I noticed that
they produce incorrect results.

For example:

#include <stdio.h>

typedef float V2f __attribute__((vector_size(8)));

int main() {
  V2f dest, a = {1.0, 3.0}, b = {10.0, 3.5};
  dest = __builtin_ia32_pfadd(a, b);
  printf("(%f, %f)\n", dest[0], dest[1]);
}

This should print (11.000000, 6.500000). However, the output varies
with the optimization level. Generally one of the two values is correct
and the other is -nan.

I stepped through the program in a debugger: the pfadd instruction
executes correctly and the MMX register holds the correct values.
The code that sets up the arguments for the printf call seems to be
what corrupts them.

Here's the assembly generated at -O3 for the above:

  .file "intrin.c"
  .text
  .globl main
  .align 16, 0x90
  .type main,@function
main: # @main
# BB#0: # %entry
  pushl %ebp
  movl %esp, %ebp
  subl $56, %esp
  movl $1077936128, -12(%ebp) # imm = 0x40400000
  movl $1065353216, -16(%ebp) # imm = 0x3F800000
  movl $1080033280, -4(%ebp) # imm = 0x40600000
  movl $1092616192, -8(%ebp) # imm = 0x41200000
  movq -16(%ebp), %mm0
  pfadd -8(%ebp), %mm0
  movq %mm0, -24(%ebp)
  flds -20(%ebp)
  fstpl 12(%esp)
  flds -24(%ebp)
  fstpl 4(%esp)
  movl $.L.str, (%esp)
  calll printf
  xorl %eax, %eax
  addl $56, %esp
  popl %ebp
  ret
.Ltmp0:
  .size main, .Ltmp0-main

  .type .L.str,@object # @.str
  .section .rodata.str1.1,"aMS",@progbits,1
.L.str:
  .asciz "%f, %f\n"
  .size .L.str, 8

  .section ".note.GNU-stack","",@progbits

Attached are my patches to enable support for this. I'd like to be
done with this, because 3DNow! isn't even supported anymore. I was
just adding these to learn tblgen and fill in some of the MSVC
intrinsic headers.

- Michael Spencer

3dnow-builtins.patch (23.2 KB)

clang-3dnow-builtins.patch (8.08 KB)

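I would call that "user error". MMX and 3DNow! instructions operate on
registers that alias the x87 FP stack and mark every stack slot as in
use, so a following x87 load can overflow the stack and yield NaN. We
assume the user is smart enough to make sure the two don't mix.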

-Eli

More specifically, if you use MMX/3DNow! intrinsics, you have to execute "emms" (or 3DNow!'s faster "femms") at ABI boundaries, before any x87 floating-point code runs.

-Chris
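
For illustration, the advice above might be applied to the test case roughly
as follows. This is only a sketch, not code from the patches: the helper names
add_v2f and print_sum are hypothetical, and the 3DNow! path is compiled only
when the compiler defines __3dNOW__ (for example with -m3dnow on a toolchain
that still supports it). Otherwise a plain scalar fallback is used so the file
still builds. The result is copied into ordinary floats before emms, so no MMX
register is live when printf's arguments are prepared.

```c
#include <stdio.h>
#include <string.h>

#if defined(__3dNOW__)
typedef float V2f __attribute__((vector_size(8)));
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for two packed floats.
 * The MMX state is cleared before returning, so callers are free to
 * use x87 floating point afterwards. */
static void add_v2f(const float a[2], const float b[2], float out[2]) {
#if defined(__3dNOW__)
  V2f va, vb, vd;
  memcpy(&va, a, sizeof va);
  memcpy(&vb, b, sizeof vb);
  vd = __builtin_ia32_pfadd(va, vb);
  memcpy(out, &vd, sizeof vd);   /* move the result out of MMX state first */
  __builtin_ia32_emms();         /* then mark the x87 stack as empty again */
#else
  /* Assumed fallback for targets without 3DNow!: scalar adds. */
  out[0] = a[0] + b[0];
  out[1] = a[1] + b[1];
#endif
}

/* Hypothetical driver: the printf call is the ABI boundary in question. */
static int print_sum(void) {
  float a[2] = {1.0f, 3.0f}, b[2] = {10.0f, 3.5f}, dest[2];
  add_v2f(a, b, dest);
  return printf("(%f, %f)\n", dest[0], dest[1]);
}
```

Keeping emms inside the helper means every caller sees a clean x87 state. In a
hot loop one would more likely issue it once after the last MMX/3DNow!
operation instead of after every add.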

OK, now it makes sense. Thanks. Now that I know this is right, are
these patches OK to commit? Or should I post them to the llvm-commits
and cfe-commits lists?

- Michael Spencer

If they seem obvious to you, feel free to commit them directly. Thanks, Michael.

-Chris