Parsing assembler in the backend.


I'm making slow but steady progress with the MC6809 backend. I can assemble the full instruction set, including the HD6309 extensions, but I'm having trouble with a few quirky HD6309 addressing modes.

For example, consider this instruction:

def LDAiIncW2 : MC6809LoadIndexed_iIncW2_P1<
                "lda\t,w{ }++", <<=== AsmString
                0xA6, <<=== Opcode, goes into Inst bits.

, Requires<[IsHD6309]> {
  let Inst{15-8} = opcode;
  let Inst{7-0} = 0b11001111;
  let Defs = [AA,AW];
  let Uses = [AW];
}

Yes, that spurious-looking comma is correct!

I've had to add that optional space into the AsmString in order to get it to parse:

$ bin/llvm-mc -triple mc6809 -show-inst-operands -show-inst -show-encoding <<< "lda ,w++"
<stdin>:1:1: note: parsed instruction: ['lda', <register 8>, '++']
lda ,w++
  lda ,w ++ ; encoding: [0xa6,0xcf]
                                        ; <MCInst #1871 LDAiIncW2
                                        ; <MCOperand Imm:0>>

... and I *don't* want the space to come back between the "w" and the "++" on that "encoding" line.

If I remove the space from the AsmString, giving "lda\t,w++", then TableGen(?) tries to parse "w++" as a single symbol, and the parsing fails. Rather than adding yet another special case to my already nasty parser, is there any way to write that AsmString so that the space doesn't appear in the output, yet the "w" and the "++" are not combined? In other words, is there a better way of writing the { }?
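
To illustrate what I think is going on, here is a toy model in C++. It is not the LLVM code: the function names and the separator set (space, tab, comma) are my own assumptions, chosen only to show how splitting on separators would glue "w++" into one token unless a space is present.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: split a mnemonic string into tokens wherever a separator
// character occurs. The separator set (space, tab, comma) is an assumption
// made for illustration; it is not taken from the LLVM sources.
std::vector<std::string> splitOnSeparators(const std::string &asmString) {
  const std::string separators = " \t,";
  std::vector<std::string> tokens;
  std::string current;
  for (char c : asmString) {
    if (separators.find(c) != std::string::npos) {
      // A separator ends the token being built, if there is one.
      if (!current.empty()) {
        tokens.push_back(current);
        current.clear();
      }
    } else {
      current.push_back(c);
    }
  }
  if (!current.empty())
    tokens.push_back(current);
  return tokens;
}

// Hypothetical self-check covering the two strings from this post.
void checkExamples() {
  using Tokens = std::vector<std::string>;
  // Without the space, "w++" stays glued together as one token.
  assert((splitOnSeparators("lda\t,w++") == Tokens{"lda", "w++"}));
  // With the space, "w" and "++" come out as separate tokens.
  assert((splitOnSeparators("lda\t,w ++") == Tokens{"lda", "w", "++"}));
}
```

If the real tokenizer behaves anything like this, the space in the AsmString is the only thing separating the register name from the "++", which would explain why removing it breaks the match.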

Also, the "w" register gets turned into an Imm:0 operand; is there a way of suppressing this? As long as the line matches, I have enough information to generate the whole instruction without looking at the register any further.