Bug in X86 assembler?


I find that the X86 assembler can compile code like “mov r8, 0x12345678” without any issues.

$ echo "mov r8, 0x12345678"|./bin/llvm-mc -assemble -show-encoding -x86-asm-syntax=intel -print-imm-hex -triple=x86_64
movq $0x12345678, %r8 # encoding: [0x49,0xc7,0xc0,0x78,0x56,0x34,0x12]

However, it fails to compile “mov r8, 0x1234567800”:

$ echo "mov r8, 0x1234567800"|./bin/llvm-mc -assemble -show-encoding -x86-asm-syntax=intel -print-imm-hex -triple=x86_64
<stdin>:1:1: error: invalid operand for instruction
mov r8, 0x1234567800

Is this a bug?

Thank you,


You can’t use mov.

I am not an x86 expert, but a quick search turned up this:

Recall that immediates are normally restricted to 32 bits. To load a larger constant into a quad register, use movabsq, which takes a full 64-bit immediate as its source.
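To make the 32-bit restriction concrete, here is a minimal Python sketch of the range check involved. It assumes the usual rule that the 32-bit immediate form of a 64-bit mov is sign-extended to 64 bits; the helper name fits_in_simm32 is made up for illustration and is not taken from LLVM.

```python
def fits_in_simm32(value: int) -> bool:
    """Return True if value can be encoded as a sign-extended 32-bit immediate.

    Hypothetical helper for illustration only; not LLVM's implementation.
    """
    # Interpret the value as a 64-bit two's-complement quantity.
    value &= (1 << 64) - 1
    if value >= 1 << 63:
        value -= 1 << 64
    # A sign-extended imm32 covers [-2**31, 2**31 - 1].
    return -(1 << 31) <= value < (1 << 31)


print(fits_in_simm32(0x12345678))    # True
print(fits_in_simm32(0x1234567800))  # False
```

The second constant from the original report falls outside that range, which is why the short form cannot hold it.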


Even in your example, the assembler replaces mov with movq, which is a hint:

$ echo "mov r8, 0x12345678"|./bin/llvm-mc -assemble -show-encoding -x86-asm-syntax=intel -print-imm-hex -triple=x86_64
movq $0x12345678, %r8 # encoding: [0x49,0xc7,0xc0,0x78,0x56,0x34,0x12]

I would recommend studying the x86 instruction set.

Another recommendation is to look at the existing tests when you suspect a bug.



This is the difference between AT&T and Intel syntax (give llvm-mc the
-output-asm-variant=1 option to see something closer to what's been
input).

The x86 reference manual really does seem to list the 64-bit immediate
move with mnemonic MOV. I'm not sure why GAS chose movabs, but this
probably is a bug in our Intel syntax support.



Yes, this is confusing because different assemblers accept different input.
For example, nasm accepts "mov" but rejects "movabs", while LLVM (in Intel
syntax) accepts only "movabs" in this case.

So the command below works with LLVM. Should we fix it so that it also
works with "mov", as the Intel manual specifies?

    $ echo "movabs r8, 0x12345678"|./bin/llvm-mc -assemble -show-encoding -x86-asm-syntax=intel -print-imm-hex -triple=x86_64