A simpler method to reject undefined encodings

Hello.

Sometimes the constraints imposed on an instruction’s encoding are too complex to be described in tablegen alone. In such cases a custom decoder method is implemented. This makes sense when the decoding itself is very complex, but it is wasteful to do it only when checking additional constraints. This is because:

  1. a custom decoder method has to decode operands, set opcodes, etc. - many times the tablegen generated decoder is perfectly capable of doing this job so there is no reason in doing it by hand
  2. when custom decoder methods are implemented for similar-looking instructions there is high chance of code duplication (i.e. reading the fields out of the encoding, setting opcodes using the same logic but with different enums, etc.).

Here is a practical example: ARM NEON vector instructions may work in either double word mode (taking d registers) or in quad word mode (taking q registers). The mode is selected by bit 6 (Q bit). When Q == 1, then quad word mode is used.

There is an additional constraint: whenever Q == 1, the registers encoded in the instruction need to be even. Otherwise the encoding is undefined.

This constraint is currently unimplemented and triggers incorrect behaviour in the MC disassembler. In order to correct this I would have to create custom decoder methods for a dozen-some instructions which is wasteful. I would much prefer to be able to define a constraint function like:

static DecodeStatus CheckNEONConstraint(const MCInst &Inst, unsigned Insn)

{
Vd = fieldFromInstruction(Insn, 12, 4);
Vm = fieldFromInstruction(Insn, 0, 4);
Vn = fieldFromInstruction(Insn, 16, 4);
Q = fieldFromInstruction(Insn, 6, 1);

if (Q == 1 && ((Vd & 1) || (Vm & 1) || (Vn & 1)))

return MCDisassembler::Fail;

return MCDIsassembler::Success;
}

Then I’d like to use this in the td file with something along the lines of

ConstraintCheckMethod = “CheckNEONConstraint”

in the superclass of NEON instructions.

I envision this as the very last step in deciding whether an encoding is valid or not. In my example the tablegen generated decoder would “decode” the invalid encoding and create a MCInst but, because this final test returns failure, the encoding is rejected.

What do you guys think about this? I feel it’ll generally simplify things and could be an opportunity to get rid of some of the custom decoder methods.

Mihai

Hi Mihail,

static DecodeStatus CheckNEONConstraint(const MCInst &Inst, unsigned Insn)

[...]

ConstraintCheckMethod = "CheckNEONConstraint"

In general I like the idea of an instruction-validation method. I
think it could also potentially solve the SoftFail/UNPREDICTABLE
issues that are looming (and partially resolved for decoding at
present).

However, I think that to cope with the other issues the scope of the
method should (at least potentially) be wider than just disassembly.
For example we will eventually want to reject some load instructions
based on complex operand constraints (e.g. if it's write-back indexed
and Rt == Rn). Currently the only way to handle cross-operand
conditions in the AsmParser (I believe) is with a custom
ConvertMethod, which is even more heavyweight than the disassembler
situation.

I think making this checker work for assembly would be rather simple
if the method acts on an MCInst, but probably not possible if it also
gets to use the bit-pattern as suggested.

In addition, in my view working on the bit-pattern adds redundancy and
potential bugs where the decoder and checker interpret the bits
differently.

I've been wrong about MC layer design before though. Jim's been very
helpful at explaining the error of my ways in the past; I've added him
on CC.

Cheers.

Tim.

Tim, Jim,

I don’t think it would be very helpful if such a method will act on the MCInst alone. The reason is that certain conditions are far easier and less error prone to check for directly on bitpatterns.
Take for example the following two NEON instructions: vhadd.i32 d0, d1, d2 and vhadd.i32 q0, q1,q2.

If all we have is the MCInst, then it gets difficult to check the restriction I mentioned:

  1. it will be hard to tell the operands apart. While it is true that the operands of the first instruction will be part of the set { ARM::D0, … , ARM::D31 } while the operands of the second are part of the set { ARM::Q0, … , ARM::Q15 }, they are only enums and not type safe. They only way to check that a register operand belongs to one set or another is to test for boundaries of the enum sets. However this now becomes dependent on tablegen’s behaviour - to my knowledge there is no guarantee that enums representing registers from the same register file will be dumped in sequence. It is natural behaviour, but unless it’s clearly documented this happens we should not rely on this.

  2. Remember the nature of the restriction: every time bit Q == 1, the encoded register fields need to contain even numbers. I do not have access to the Q bit, so I would need to take into account the value of the opcode and remember which are the opcodes represent encodings with Q == 1.

  3. Because of 2, I would potentially have to apply this restriction checker in multiple places, only for the instruction definitions which should have Q == 1. If I have access to the Q bit I can set this restriction checker in a upper class and leave the behaviour to be inherited by all definitions.

The truth of the matter is that certain conditions are more easily checked on bitpatterns while others are more easily checked on MCInsts - this is why I believe it’s useful to have both.

In order to use the same mechanism for assembly, we can define the prototype for such restriction checking methods as:

static DecodeStatus CheckConstraint(const MCInst &Inst, unsigned Insn = 0);

Tablegen will be able to pass a single parameter when assembling and the implementation of the checker will simply have to ignore the zero’ed out bit pattern.

Regards,
Mihai

Hi Mihail,

Here is a practical example: ARM NEON vector instructions may work in either double word mode (taking d registers) or in quad word mode (taking q registers). The mode is selected by bit 6 (Q bit). When Q == 1, then quad word mode is used.

There is an additional constraint: whenever Q == 1, the registers encoded in the instruction need to be even. Otherwise the encoding is undefined.

This constraint is currently unimplemented and triggers incorrect behaviour in the MC disassembler. In order to correct this I would have to create custom decoder methods for a dozen-some instructions which is wasteful. I would much prefer to be able to define a constraint function like:

Actually, in this case don’t you just have to add another check to DecodeQPRRegisterClass? It currently completely ignores the least-significant bit of its field, but should probably check that it’s 0.

Or are there instructions that permit that bit to vary without actually affecting anything? I’m not aware of any, but if so the best solution is probably to create a second RegisterOperand class to handle this disparity and migrate some instructions over to it.

Cheers.

Tim.