Alternate instruction encoding for subtargets

Hello,

I have a compiler in LLVM 2.9 for the KCPSM3 processor. I'd like to
create a subtarget for the new CPU version called KCPSM6. Besides a
couple of new instructions which are not important at the moment, the
KCPSM6 CPU has different instruction opcodes. Semantically the
instructions are the same, hence I'd like to keep all the lowering and
pattern matching stuff unmodified.

For example, the ADD sX, sY instruction in KCPSM3 is:
Inst{17-12} = 0b011000;
Inst{11-8} = sx;
Inst{7-4} = sy;
Inst{3-0} = 0;

While in KCPSM6 the same instruction is encoded:
Inst{17-12} = 0b010000;
Inst{11-8} = sx;
Inst{7-4} = sy;
Inst{3-0} = 0;

They even mostly kept the instruction formats!
Can I tell tablegen to have two encodings and switch between them
using a predicate?
I do not want to create new instructions (e.g. ADD_KCPSM3 and ADD_KCPSM6).
If that is not possible I will just dump the tablegen's
*GenCodeEmitter.inc file with the getBinaryCodeForInstr() and write it
by hand. I guess this is the only place where opcodes are used? (I do
not use LLVM's MC disassembler.)

Cheers,
Jara

We have a similar problem with the R600 backend. The backend supports
four different subtargets whose instructions are semantically identical
and share the same encoding format (one subtarget's encoding differs
slightly; this is explained below), but whose opcodes sometimes differ.
We solve this by defining a base class that represents the semantic
definition of the instruction and then subclassing it for each
sub-generation. For example:

class RECIP_UINT_Common <bits<32> inst> : R600_1OP <
  inst, "RECIP_INT $dst, $src",
  [(set R600_Reg32:$dst, (AMDGPUurecip R600_Reg32:$src))]
>;

def RECIP_UINT_r600 : RECIP_UINT_Common <0x78>;
def RECIP_UINT_eg : RECIP_UINT_Common<0x94>;

We also have one subtarget that has a slightly different encoding from
the other three. We handle this by defining all the instructions in
tablegen using the most common encoding and then making an adjustment in
the CodeEmitter for the subtarget that requires a different encoding.

If you would like to see the R600 code, you can find it in branches/R600 of the
llvm SVN repo. There is also a git mirror
(http://cgit.freedesktop.org/~tstellar/llvm/tree/lib/Target/AMDGPU)
that may be easier to browse.

-Tom

If it's something that can be done programmatically (i.e., transforming one encoding to the other), consider using a PostEncoderMethod. ARM uses this for VFP and NEON instruction encodings for Thumb2 vs. ARM mode encodings.

-Jim
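
For illustration, here is a minimal standalone C++ sketch of the "encode with one encoding, then fix up for the other subtarget" idea suggested in the two replies above. It deliberately does not use LLVM's API, since the exact hook signature depends on the LLVM version. The function names (encodeRegReg, remapOpcodeToKCPSM6, postEncodeKCPSM6) are hypothetical, and the only opcode pair in the table is the ADD sX, sY one quoted in the original question; a real remap would need an entry for every opcode that differs.

```cpp
#include <cstdint>

// Bit layout as given in the thread: opcode in Inst{17-12},
// sx in Inst{11-8}, sy in Inst{7-4}, Inst{3-0} = 0.
constexpr uint32_t OpcodeShift = 12;
constexpr uint32_t OpcodeMask = 0x3Fu << OpcodeShift;

// Hypothetical helper: build a register-register instruction word.
constexpr uint32_t encodeRegReg(uint32_t opcode, uint32_t sx, uint32_t sy) {
  return ((opcode & 0x3Fu) << OpcodeShift) | ((sx & 0xFu) << 8) |
         ((sy & 0xFu) << 4);
}

// Hypothetical KCPSM3 -> KCPSM6 opcode remap. Only the ADD sX, sY entry
// comes from the thread; unknown opcodes are passed through unchanged
// as a placeholder, which is an assumption, not the real mapping.
inline uint32_t remapOpcodeToKCPSM6(uint32_t opcode) {
  switch (opcode) {
  case 0x18: // 0b011000, ADD sX, sY on KCPSM3
    return 0x10; // 0b010000, ADD sX, sY on KCPSM6
  default:
    return opcode;
  }
}

// Post-encoding fix-up: take a word produced with the KCPSM3 encoding
// and rewrite only the opcode field, leaving the operand bits untouched.
inline uint32_t postEncodeKCPSM6(uint32_t kcpsm3Word) {
  const uint32_t opcode = (kcpsm3Word & OpcodeMask) >> OpcodeShift;
  return (kcpsm3Word & ~OpcodeMask) |
         (remapOpcodeToKCPSM6(opcode) << OpcodeShift);
}
```

In a backend, a function like postEncodeKCPSM6 would be called from the code emitter only when the selected subtarget is KCPSM6, so the instruction definitions and patterns stay shared between the two CPUs.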