I’m working on the assembler and disassembler for a target that has compact-form instructions with implicit register operands. Even though the operands are implicit, they appear in the assembly syntax. As a contrived example, consider an instruction that adds R0 to another register. The syntax is "ADD R0,<dst>" even though R0 is not actually encoded.
The dilemma is how to treat that implied operand in the .td. Option 1 is to make it an implicit operand (omit it from ins/outs and declare it as a Use) and hard-code the register name in the AsmString, as AsmString = "ADD\tR0,$dst". If I do this, the assembly parser does not know how to handle "R0" in the input: for this instruction it wants the MCParsedAsmOperand to be a token, but "R0" needs to be a register in other contexts.
Option 2 is to make it an explicit register operand with a regclass containing only R0, then declare it as a "normal" register operand in the ins list: InOperandList = (ins R0Class:$src); and AsmString = "ADD\t$src,$dst". This works for the assembly parser but breaks the disassembler. The disassembler only instantiates operands that appear in the instruction's encoding field (Inst), so in this case $src is completely missing from the MCInst built by the disassembler.
Before I explore some hacky way around all this, I thought I’d see if anyone has suggestions.
-Alan
When I was working on the Mips assembler, I generally found it best to handle any ambiguity in the operands in the MipsOperand class. We'd parse registers like $1 and set up MipsOperand such that isGPRAsmReg(), isFGRAsmReg(), and isCOP0AsmReg() would all return true. If we parsed a more specific name like $f1 then only isFGRAsmReg() would return true.
In a similar way, you may be able to have your R0 operand return true for isToken() and the appropriate is*AsmReg() and have the get*() functions and add*RegOperand() functions mutate the operand into the right kind for the context.
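To make that idea concrete, here is a rough, self-contained sketch of an operand that answers "yes" to both kinds of query and only commits to being a register when the matcher asks it to add itself to the instruction. SketchInst, SketchOperand, parseReg() and addGPRAsmRegOperands() are hypothetical stand-ins written for this example; they are not the LLVM MCInst/MCParsedAsmOperand API, and the "R<n>" register naming is assumed from the contrived ADD example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for MCInst: just a list of register numbers.
struct SketchInst {
  std::vector<unsigned> Regs;
};

// Hypothetical stand-in for a target's MCParsedAsmOperand subclass.
class SketchOperand {
  std::string Spelling; // text as written in the source, e.g. "R0"
  int RegNo;            // register number, or -1 if not a register name

  // Assumed naming scheme: 'R' followed by one or two decimal digits.
  static int parseReg(const std::string &S) {
    if (S.size() < 2 || S.size() > 3 || S[0] != 'R')
      return -1;
    int N = 0;
    for (std::size_t I = 1; I < S.size(); ++I) {
      if (S[I] < '0' || S[I] > '9')
        return -1;
      N = N * 10 + (S[I] - '0');
    }
    return N;
  }

public:
  explicit SketchOperand(std::string S)
      : Spelling(std::move(S)), RegNo(parseReg(Spelling)) {}

  // Every operand keeps its spelling, so it can always match a literal
  // token that was hard-coded in an AsmString (Option 1).
  bool isToken() const { return true; }
  const std::string &getToken() const { return Spelling; }

  // The same operand also matches a register slot if it names a register.
  bool isGPRAsmReg() const { return RegNo >= 0; }

  // Only here does the operand commit to being a register.
  void addGPRAsmRegOperands(SketchInst &Inst) const {
    assert(isGPRAsmReg() && "operand does not name a register");
    Inst.Regs.push_back(static_cast<unsigned>(RegNo));
  }
};
```

With something along these lines, "R0" parsed once can satisfy either a token match or a register match, depending on which instruction the matcher is trying.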
You can use DecoderMethod to provide a function that does the right thing. Mips32r6 needs to do this to deal with an awkward quirk of the encoding table where it has places where register indices are compared to determine the opcode. You can find the pieces of that by searching for AddiGroupBranch.
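As a rough illustration of what such a function does, the sketch below decodes the contrived compact ADD by hand and adds the unencoded R0 operand itself. Everything in it is an assumption made up for the example: SketchMCInst and SketchStatus stand in for MCInst and DecodeStatus, the 16-bit encoding with the destination register in the low 4 bits is invented, and the real DecoderMethod signature in LLVM takes different parameters. It also assumes $dst is an output operand and therefore comes before $src in the instruction.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for DecodeStatus and MCInst.
enum SketchStatus { Fail, Success };

struct SketchMCInst {
  unsigned Opcode = 0;
  std::vector<unsigned> Regs; // register operands, in operand order
};

constexpr unsigned SketchADD = 1; // made-up opcode number
constexpr unsigned SketchR0 = 0;  // made-up register number for R0

// Custom decoder for the compact ADD. The assumed encoding carries only
// the destination register, in bits [3:0].
SketchStatus decodeCompactAdd(SketchMCInst &MI, std::uint16_t Insn) {
  unsigned Dst = Insn & 0xF;

  MI.Opcode = SketchADD;
  MI.Regs.push_back(Dst);      // $dst: taken from the encoding
  MI.Regs.push_back(SketchR0); // $src: not encoded, supplied by the decoder
  return Success;
}
```

The point is only that the decoder, rather than the generated tables, is responsible for putting R0 into the instruction, so the printer sees the same operand list the parser would have produced.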