Wrong encoding of movd on x64

Hi all,

I believe I’ve found a bug in the encoding of the movd instruction on x64. Here’s some IR code to reproduce it:

external global i8*, align 1 ; <i8**>:0 [#uses=1]

external global i8*, align 16 ; <i8**>:1 [#uses=1]

declare void @abort()

define internal void @2() {

%1 = load i8** @0, align 1 ; <i8*> [#uses=1]

%2 = load i8** @1, align 16 ; <i8*> [#uses=1]

%3 = bitcast i8* %2 to <8 x i8>* ; <<8 x i8>*> [#uses=1]

%4 = load <8 x i8>* %3, align 1 ; <<8 x i8>> [#uses=1]

%5 = bitcast <8 x i8> %4 to i64 ; [#uses=1]

%6 = trunc i64 %5 to i32 ; [#uses=1]

%7 = bitcast i8* %1 to i32* ; <i32*> [#uses=1]

store i32 %6, i32* %7, align 1

ret void

}

It generates the following code:

mov rax,1E417A8h

mov rax,qword ptr [rax]

mov rcx,1E417B0h

mov rcx,qword ptr [rcx]

movq mm0,mmword ptr [rcx]

movq rax,mm1

mov dword ptr [rax],ecx

Note the last movq. What was probably intended to be generated was “movd ecx, mm0”. LLVM mistakenly sets the ‘wide’ bit of the REX prefix to 1, turning movd into movq. Also, reg and r/m encoding has been swapped. I have not found a fix for this yet, but the MMX_MOVD64* definitions and patterns looks suspicious.

Note that on x86-32 it produces correct code (though not optimal either; it doesn’t use movd). Also, notice that the last two instructions above should ideally just be a single movd to memory, instead of first writing to a GP register.

In the same breath, I believe the encoding of MMX_MOVDQ2Qrr is incorrect. I’ve been able to fix the first two definitions by using MRMSrcReg (and set hasNoSideEffects). I’m not sure about the third definition though, is this for 3DNow! And should it use MRMSrcReg as well?

Thanks,

Nicolas

I believe I’ve found a bug in the encoding of the movd instruction on x64.
Here’s some IR code to reproduce it:

[snip

Note the last movq. What was probably intended to be generated was “movd
ecx, mm0”. LLVM mistakenly sets the ‘wide’ bit of the REX prefix to 1,
turning movd into movq. Also, reg and r/m encoding has been swapped. I have
not found a fix for this yet, but the MMX_MOVD64* definitions and patterns
looks suspicious.

Thanks for the testcase; fixed in r75142.

Note that on x86-32 it produces correct code (though not optimal either; it
doesn’t use movd). Also, notice that the last two instructions above should
ideally just be a single movd to memory, instead of first writing to a GP
register.

The suboptimal code on x86-32 is because there aren't patterns for i32
extract from <2 x i32> yet (so it gets expanded through memory). The
suboptimal code on x86-64 is due to a missing dagcombine; it doesn't
know that extracting an i64 and truncating it to an i32 is equivalent
to extracting an i32.

In the same breath, I believe the encoding of MMX_MOVDQ2Qrr is incorrect.
I’ve been able to fix the first two definitions by using MRMSrcReg (and set
hasNoSideEffects). I’m not sure about the third definition though, is this
for 3DNow! And should it use MRMSrcReg as well?

Also fixed in r75142. Note that MMX_MOVQ2FR64rr is actually exactly
the same instruction as MMX_MOVQ2DQrr; they're separated because it's
a bit more straightforward to write that way (FR64 and VR128 are both
XMM registers, but TableGen isn't aware of the precise relationship).

-Eli

Intel syntax is extremely confusing. I often can't figure out when to use movd or movq.

Evan