Bogus X86-64 Patterns

Tracking down a problem with one of our benchmark codes, we've discovered that
some of the patterns in X86InstrX86-64.td are wrong. Specifically:

def MOV64toPQIrm : RPDI<0x6E, MRMSrcMem, (outs VR128:$dst), (ins i64mem:$src),
                        "mov{d|q}\t{$src, $dst|$dst, $src}",
                        [(set VR128:$dst,
                          (v2i64 (scalar_to_vector (loadi64 addr:$src))))]>;

def MOVPQIto64mr : RPDI<0x7E, MRMDestMem, (outs), (ins i64mem:$dst, VR128:
$src),
                         "mov{d|q}\t{$src, $dst|$dst, $src}",
                         [(store (i64 (vector_extract (v2i64 VR128:$src),
                                       (iPTR 0))), addr:$dst)]>;

These say that for an AT&T-style assembler, output movd and for an Intel
assembler, output movq.

The problem is that such movs to and from memory must either use movq
or put a rex prefix before the movd. No such rex prefix appears in these
patterns, meaning the instructions will move 32 bits rather than 64 bits.

Bad things happen.

A little sleuthing uncovered this:

$ svn log -r32609
~/svn/test/official.llvm/llvm/lib/Target/X86/X86InstrX86-64.td

Tracking down a problem with one of our benchmark codes, we've discovered that
some of the patterns in X86InstrX86-64.td are wrong. Specifically:

def MOV64toPQIrm : RPDI<0x6E, MRMSrcMem, (outs VR128:$dst), (ins i64mem:$src),
                        "mov{d|q}\t{$src, $dst|$dst, $src}",
                        [(set VR128:$dst,
                          (v2i64 (scalar_to_vector (loadi64 addr:$src))))]>;

def MOVPQIto64mr : RPDI<0x7E, MRMDestMem, (outs), (ins i64mem:$dst, VR128:
$src),
                         "mov{d|q}\t{$src, $dst|$dst, $src}",
                         [(store (i64 (vector_extract (v2i64 VR128:$src),
                                       (iPTR 0))), addr:$dst)]>;

These say that for an AT&T-style assembler, output movd and for an Intel
assembler, output movq.

Right.

The problem is that such movs to and from memory must either use movq
or put a rex prefix before the movd. No such rex prefix appears in these
patterns, meaning the instructions will move 32 bits rather than 64 bits.

You mean there should be a "rex" in the assembly string? That is,

rex movd %rax, %xmm0

The Mac OS X assembly does not take rex prefix. So this is what's expected:
movd %rax, %xmm0

$ otool64 -t a.o
a.o:
(__TEXT,__text) section
00000000 6e0f4866 000000c0

For the rr variants it is ok because the assembler sees the r register use and
adds the necessary rex prefix. It cannot do so for memory moves.

These patterns are just wrong. So either the MaxOS assembler adds a rex
(which is wrong if you really wanted 32 bits) or these patterns are broken on
MaxOS as well and the MacOS assembler really needs to be fixed.

Hrm. Good point. Looks like the Mac OS X assembler is always adding rex.w:

movd %xmm0, (%eax)
$ otool64 -t a.o
a.o:
(__TEXT,__text) section
00000000 7e0f6667 00000000

I'll ping our assembler guy.

We are changing these to movq in our copy of llvm. Shall I send the changes
upstream?

Hold on, changing to movq will break it on Mac OS X. I'll need to find a proper solution.

Thanks,

Evan

The proper solution is to fix the assembler, but in the meantime,
the mr and rm variants at least should be changed to movq. They
are broken on MacOS X right now anyway. From what you've said,
one can't do a 32-bit memory mov to/from an xmm on MacOS X.
You still won't be able to do a 32-bit memory operation but at least
systems that use the latest binutils will work properly.

                                          -Dave

Hold on, changing to movq will break it on Mac OS X. I'll need to
find a proper solution.

Thanks,

The proper solution is to fix the assembler, but in the meantime,
the mr and rm variants at least should be changed to movq. They
are broken on MacOS X right now anyway. From what you've said,
one can't do a 32-bit memory mov to/from an xmm on MacOS X.
You still won't be able to do a 32-bit memory operation but at least
systems that use the latest binutils will work properly.

I should get an answer today. The right workaround now is to disable this for Darwin.

Thanks!

Evan

Actually, movq is recognized by the assembler on Leopard. I've fixed this:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20071210/056334.html

Please verify. Thanks.

Evan

The patch is similar to what I did. I'll have to create an llvm testcase to
verify it as this is a Fortran code. It would also be good to have a
regression test for this anyway.

Does the Leopard assembler support movq for rr instructions? The
rep64 movd sequence is ugly and esoteric. It's much more readable
to say movq.

                                           -Dave

Actually, movq is recognized by the assembler on Leopard. I've fixed
this:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-
Mon-20071210/056334.html

Please verify. Thanks.

The patch is similar to what I did. I'll have to create an llvm testcase to
verify it as this is a Fortran code. It would also be good to have a
regression test for this anyway.

Does the Leopard assembler support movq for rr instructions? The
rep64 movd sequence is ugly and esoteric. It's much more readable
to say movq.

I agree. But unfortunately the assembler still does not recognize movq for rr instructions. If this really bug you I'd suggest adding a comment to the emitted assembly string.

Evan