Intel asm syntax and variable names

Hi all,

I’ve encountered an issue with x86 Intel asm syntax when using certain variable names.

In the following example, where I mov from a memory location named “flags2”, llvm-mc works fine:

cat test_good.s

mov eax, flags2

llvm-mc.exe -x86-asm-syntax=intel test_good.s -o -

.text

movl flags2, %eax

But if the memory location is named “flags”, llvm-mc fails:

cat test_bad.s

mov eax, flags

llvm-mc.exe -x86-asm-syntax=intel test_bad.s -o -

test_bad.s:1:1: error: invalid operand for instruction

mov eax, flags

^

.text

After investigation, I saw that the memory location named “flags” was matched to the EFLAGS register in the MatchRegisterName() function in the generated X86GenAsmMatcher.inc.

case 'f': // 1 string to match.

if (memcmp(Name.data()+1, "lags", 4))

break;

return 25; // "flags"

So basically, what I’m seeing with “flags” (which should be a legit variable name) is that the X86AsmParser creates a reference to an implicit register instead of a reference to memory.
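To make the effect concrete, here is a minimal C++ sketch of the lookup order as I understand it. It is not the actual X86AsmParser code: matchRegisterNameSketch, classifyOperand and OperandKind are hypothetical names, and only the value 25 for “flags” comes from the generated matcher quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the generated MatchRegisterName(): returns a
// non-zero register number for a known register name and 0 otherwise.
// Only the "flags" -> 25 entry is taken from the generated code quoted
// above; the number used for "eax" is made up for illustration.
unsigned matchRegisterNameSketch(const std::string &Name) {
  if (Name == "flags")
    return 25;
  if (Name == "eax")
    return 1; // illustrative value only
  return 0;
}

enum class OperandKind { Register, Memory };

// The register table is consulted first, so an identifier that shares its
// name with any register never gets a chance to become a memory reference.
OperandKind classifyOperand(const std::string &Name) {
  return matchRegisterNameSketch(Name) != 0 ? OperandKind::Register
                                            : OperandKind::Memory;
}
```

With this ordering, “flags2” is classified as memory while “flags” is classified as a register, which matches the difference between test_good.s and test_bad.s.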

There are additional issues here as well: what if we compile for SSE, but use a variable named “ZMM0”, which is a register only in AVX-512? Should this be allowed?

We probably need some way to mark the registers (using attributes or predicates?) so that we’d know which ones are part of the legal set of registers that can be referenced on the architecture we’re compiling for.
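As a rough illustration of what I mean by marking registers, here is a C++ sketch under my own assumptions. RegInfo, the feature constants and isReferenceableRegister are hypothetical and do not exist in LLVM; the table has just three entries to show the idea.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical feature bits describing the selected subtarget.
constexpr unsigned FeatureNone = 0;
constexpr unsigned FeatureAVX512 = 1u << 0;

// Hypothetical per-register marking.
struct RegInfo {
  bool Implicit;     // true if the register cannot be written as an operand
  unsigned Required; // feature bits that must be enabled for it to exist
};

// Returns true only if Name is a register that may be named explicitly on
// the selected subtarget. Any other identifier would fall through to the
// normal symbol handling and become a memory reference.
bool isReferenceableRegister(const std::string &Name, unsigned Available) {
  static const std::unordered_map<std::string, RegInfo> Table = {
      {"eax", {false, FeatureNone}},
      {"flags", {true, FeatureNone}},    // implicit: never named directly
      {"zmm0", {false, FeatureAVX512}},  // exists only with AVX-512
  };
  auto It = Table.find(Name);
  if (It == Table.end())
    return false;
  const RegInfo &R = It->second;
  return !R.Implicit && (R.Required & ~Available) == 0;
}
```

Under this scheme “flags” would always be treated as a symbol, and “zmm0” would be treated as a symbol unless AVX-512 is enabled.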

Do you think this is a good approach?

Thanks,

Marina

Suppose I have a global variable named ‘EAX’. How do Intel assemblers normally escape register names to access such a global variable?

The Microsoft assembler treats EAX in a mov as the register, even if there is also a global variable named EAX, meaning the register takes precedence.

But here I have a slightly different situation: I have a global variable whose name happens to match an implicit register, or a register that does not exist in the current architecture, only in future ones. The Microsoft assembler treats these cases as memory locations, while LLVM treats them as registers, causing compilation errors.

So, there is no prior art for escaping the name of a global symbol with the same name as a register? If there is, I’d rather we just implement it and leave it at that.

We can probably fix the ‘flags’ case easily in LLVM, but I’d rather not bend over backwards to make ZMM0 be a global name when AVX is disabled.

Some targets don’t have the problem because they prefix all names with an underscore. Apart from that, I am not aware of any solution to the problem of keywords clashing with variable names in Intel syntax.

- Matthias

Hi,

I’ve uploaded a workaround for the issue.

Please let me know if you think it’s ok.

http://reviews.llvm.org/D11512

http://reviews.llvm.org/D11513

Thanks,

Marina