Overlapping register groups in old 8-bit MC6809 processor.

Hi

I'm returning to my MC6809 back-end from a health-related hiatus. The assembler is tantalisingly close, but I've got some parsing and matching problems.

The registers overlap in annoying ways. For instance, the two instructions TFR and EXG each have a single opcode, and the post-byte specifies which registers are involved; but the registers can be 8- or 16-bit, and two of the 8-bit registers make up one of the 16-bit registers (A + B make D).

See https://en.wikipedia.org/wiki/Motorola_6809

It is legal to write (e.g.) "TFR A,B" or "EXG D,X", but I'm having a problem setting these registers up in my lib/Target/MC6809/MC6809RegisterInfo.td:

//===-- MC6809RegisterInfo.td - MC6809 Register defs -------*- tablegen -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

//===----------------------------------------------------------------------===//
// Declarations that describe the MC6809 register file
//===----------------------------------------------------------------------===//

class MC6809Reg<bits<4> num, string n, list<string> alt = []>
  : Register<n> {
  field bits<4> Num = num;
  let Namespace = "MC6809";
  let HWEncoding{3-0} = num;
  let AltNames = alt;
}

class MC6809RegWithSubregs<bits<4> num, string n, list<Register> subregs, list<string> alt = []>
  : RegisterWithSubRegs<n, subregs> {
  field bits<4> Num = num;
  let Namespace = "MC6809";
  let HWEncoding{3-0} = num;
  let AltNames = alt;
}

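// Bit offsets count from the least-significant bit of the super-register,
// so the low half sits at offset 0 and the high half above it.
def sub_lo_byte : SubRegIndex<8>;
def sub_hi_byte : SubRegIndex<8, 8>;

def sub_lo_word : SubRegIndex<16>;
def sub_hi_word : SubRegIndex<16, 16>;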

//===----------------------------------------------------------------------===//
// Registers
//===----------------------------------------------------------------------===//

def IX : MC6809Reg<1, "x">;
def IY : MC6809Reg<2, "y">;
def SU : MC6809Reg<3, "u">;
def SS : MC6809Reg<4, "s">;
def PC : MC6809Reg<5, "pc">;
def AV : MC6809Reg<7, "v">;
def AA : MC6809Reg<8, "a">;
def AB : MC6809Reg<9, "b">;
def CC : MC6809Reg<10, "cc">;
def DP : MC6809Reg<11, "dp">;
def A0 : MC6809Reg<12, "0">;
def AE : MC6809Reg<14, "e">;
def AF : MC6809Reg<15, "f">;

let SubRegIndices = [sub_hi_byte, sub_lo_byte], CoveredBySubRegs = 1 in {
  def AD : MC6809RegWithSubregs<0, "d", [AA,AB], ["a","b"]>;
  def AW : MC6809RegWithSubregs<6, "w", [AE,AF], ["e","f"]>;
}

let SubRegIndices = [sub_hi_word, sub_lo_word], CoveredBySubRegs = 1 in {
  def AQ : MC6809RegWithSubregs<0, "q", [AD,AW], ["d","w"]>;
}

def GR8 : RegisterClass<"MC6809", [i8], 8, (add AA, AB, CC, DP, AE, AF)>;

def GR16 : RegisterClass<"MC6809", [i16], 8, (add AD, IX, IY, SU, SS, PC, AW, AV, A0)>;

def GR32 : RegisterClass<"MC6809", [i32], 8, (add AQ)>;

def IX16 : RegisterClass<"MC6809", [i16], 8, (add IX, IY, SU, SS)>;

def WREG : RegisterClass<"MC6809", [i16], 8, (add AW)>;

def ALLREG : RegisterClass<"MC6809", [i8, i16], 8, (add AD, IX, IY, SU, SS, PC, AW, AV, AA, AB, CC, DP, A0, AE, AF)>;
//===----------------------------------------------------------------------===//

(The extra registers are for the Hitachi HD6309 extended version)

TableGen doesn't like that last line, the ALLREG group. I need it because the EXG and TFR instructions both take 2 register indices from either GR16 or GR8. (I'm happy to check that both are from the same group in C++ code if necessary, but right now I'm stuck with specifying either two operands from GR16 or two from GR8.)
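
For reference, the kind of C++ check mentioned above could look roughly like the sketch below. It is not taken from the back-end: the enum, `regWidth`, and `encodePostByte` are hypothetical names, the register numbers are the `num` values from the .td above, and the 8/16-bit split follows GR8 and GR16 as defined there. It assumes the usual post-byte layout with the source register in the high nibble and the destination in the low nibble.

```cpp
#include <cstdint>

// Register numbers as they appear in the post-byte nibbles (the same values
// as the `num` arguments in the .td above). 13 is unused there.
enum Reg : uint8_t {
  D = 0, X = 1, Y = 2, U = 3, S = 4, PC = 5, W = 6, V = 7,
  A = 8, B = 9, CC = 10, DP = 11, Zero = 12, E = 14, F = 15
};

// Width in bits, following the GR8 / GR16 split; 0 means "not a register".
inline unsigned regWidth(uint8_t Num) {
  switch (Num) {
  case A: case B: case CC: case DP: case E: case F:
    return 8;
  case D: case X: case Y: case U: case S: case PC:
  case W: case V: case Zero:
    return 16;
  default:
    return 0;
  }
}

// Hypothetical helper: build the TFR/EXG post-byte, or return -1 when the
// two registers are unknown or do not have the same width.
inline int encodePostByte(uint8_t Src, uint8_t Dst) {
  unsigned SrcWidth = regWidth(Src);
  if (Src > 15 || Dst > 15 || SrcWidth == 0 || SrcWidth != regWidth(Dst))
    return -1;
  return (Src << 4) | Dst;
}
```

With these definitions, `encodePostByte(A, B)` gives 0x89 and `encodePostByte(D, X)` gives 0x01, while a mixed pair such as `encodePostByte(A, D)` is rejected with -1.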

TableGen's objection is below. I can't make sense of the "too heavy" part. Could one of you kind folks clue me in please? Without the above ALLREG group, the code compiles and while not complete or altogether correct, is testable and somewhat functional.

Assertion failed: (RU.Weight < 256 && "RegUnit too heavy"), function EmitRegUnitPressure, file /usr/local/src/llvm/llvm/utils/TableGen/RegisterInfoEmitter.cpp, line 237.
0 llvm-tblgen 0x000000010bc2b66c llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 60
1 llvm-tblgen 0x000000010bc2bc09 PrintStackTraceSignalHandler(void*) + 25
2 llvm-tblgen 0x000000010bc28bbe llvm::sys::RunSignalHandlers() + 990
3 llvm-tblgen 0x000000010bc2f1f9 SignalHandler(int) + 505
4 libsystem_platform.dylib 0x00007fff5cd83b3d _sigtramp + 29
5 libsystem_platform.dylib 0x00007fd1ddafc080 _sigtramp + 2161608032
6 libsystem_c.dylib 0x00007fff5cc411c9 abort + 127
7 libsystem_c.dylib 0x00007fff5cc09868 basename_r + 0
8 llvm-tblgen 0x000000010ba53940 (anonymous namespace)::RegisterInfoEmitter::EmitRegUnitPressure(llvm::raw_ostream&, llvm::CodeGenRegBank const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 2064
9 llvm-tblgen 0x000000010ba2f942 (anonymous namespace)::RegisterInfoEmitter::runTargetDesc(llvm::raw_ostream&, llvm::CodeGenTarget&, llvm::CodeGenRegBank&) + 18402
10 llvm-tblgen 0x000000010ba25658 (anonymous namespace)::RegisterInfoEmitter::run(llvm::raw_ostream&) + 120
11 llvm-tblgen 0x000000010ba25596 llvm::EmitRegisterInfo(llvm::RecordKeeper&, llvm::raw_ostream&) + 54
12 llvm-tblgen 0x000000010bad6f28 (anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&) + 184
13 llvm-tblgen 0x000000010bc3b54f llvm::TableGenMain(char*, bool (*)(llvm::raw_ostream&, llvm::RecordKeeper&)) + 3023
14 llvm-tblgen 0x000000010bad6e1a main + 154
15 libdyld.dylib 0x00007fff5cb98ed9 start + 1
16 libdyld.dylib 0x000000000000000b start + 2739302707
Stack dump:
0. Program arguments: ../../../bin/llvm-tblgen -gen-register-info -I /usr/local/src/llvm/llvm/lib/Target/MC6809 -I /usr/local/src/llvm/llvm/include -I /usr/local/src/llvm/llvm/lib/Target /usr/local/src/llvm/llvm/lib/Target/MC6809/MC6809.td -o /usr/local/src/llvm/build/lib/Target/MC6809/MC6809GenRegisterInfo.inc
/bin/sh: line 1: 15280 Abort trap: 6 ../../../bin/llvm-tblgen -gen-register-info -I /usr/local/src/llvm/llvm/lib/Target/MC6809 -I /usr/local/src/llvm/llvm/include -I /usr/local/src/llvm/llvm/lib/Target /usr/local/src/llvm/llvm/lib/Target/MC6809/MC6809.td -o /usr/local/src/llvm/build/lib/Target/MC6809/MC6809GenRegisterInfo.inc
make[2]: *** [lib/Target/MC6809/MC6809GenRegisterInfo.inc] Error 134
make[1]: *** [lib/Target/MC6809/CMakeFiles/MC6809CommonTableGen.dir/all] Error 2

M

Hi Mark,

> TableGen doesn't like that last line, the ALLREG group. I need it because the EXG and TFR instructions both take 2 register indices from either GR16 or GR8. (I'm happy to check that both are from the same group in C++ code if necessary, but right now I'm stuck with specifying either two operands from GR16 or two from GR8.)

Have you considered modelling it as two separate instructions, TFR8
and TFR16 for example? That seems like it'd fit into LLVM's ideas
about register classes a lot more neatly, and as an added bonus it'd
automatically enforce the size constraint.

It's a reasonably common technique in LLVM, because what a datasheet
calls separate instructions is often pretty arbitrary.

> TableGen's objection is below. I can't make sense of the "too heavy" part.

It's referring to the "Weight" of a register class, which I'm a bit
fuzzy on but I think is related to how expensive copying would be. I
suspect it's going haywire because ALLREG attempts to unify a register
with its own subregisters (if AD is AA + AB it ought to be 2x as
heavy, but OTOH LLVM wants all registers in the same class to have a
consistent weight).

I don't have a solution though. I suspect ALLREG is fundamentally
unsound as far as LLVM is concerned and this is just the tip of the
iceberg.

Cheers.

Tim.

> Hi Mark,

>> TableGen doesn't like that last line, the ALLREG group. I need it because the EXG and TFR instructions both take 2 register indices from either GR16 or GR8. (I'm happy to check that both are from the same group in C++ code if necessary, but right now I'm stuck with specifying either two operands from GR16 or two from GR8.)

> Have you considered modelling it as two separate instructions, TFR8
> and TFR16 for example? That seems like it'd fit into LLVM's ideas
> about register classes a lot more neatly, and as an added bonus it'd
> automatically enforce the size constraint.

I did think of that, and in fact I tried it too. But both variants have the same opcode and mnemonic (the post-byte defines the registers used), so it's still possible to encode (e.g.) "EXG A,D" (undefined result), and I kept hitting "already defined" problems or a degenerate match table.

Right now I'm getting disassembly pretty much free of charge, but without full use of EXG and TFR. I may try to code TFR.W vs TFR.B variants (again), but I was hoping not to get my hands that dirty in the disassembler.
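
The hand-written disassembler step this would need is small. As a rough sketch only (not code from the back-end): `Variant`, `DecodedPair`, and `decodePostByte` are hypothetical names, the width table follows the GR8/GR16 membership in the .td earlier in the thread, and the source register is assumed to sit in the high nibble of the post-byte.

```cpp
#include <cstdint>

// Which of the two hypothetical variants (TFR.B / TFR.W) a post-byte
// selects, or Invalid for mixed widths and unused register numbers.
enum class Variant { Byte, Word, Invalid };

struct DecodedPair {
  Variant Kind;
  uint8_t Src; // high nibble of the post-byte
  uint8_t Dst; // low nibble of the post-byte
};

inline DecodedPair decodePostByte(uint8_t PostByte) {
  // Width in bits for register numbers 0..15; 0 marks the unused number 13.
  static const uint8_t Width[16] = {16, 16, 16, 16, 16, 16, 16, 16,
                                    8,  8,  8,  8,  16, 0,  8,  8};
  uint8_t Src = PostByte >> 4;
  uint8_t Dst = PostByte & 0x0F;
  uint8_t SrcWidth = Width[Src];
  if (SrcWidth == 0 || SrcWidth != Width[Dst])
    return {Variant::Invalid, Src, Dst};
  return {SrcWidth == 8 ? Variant::Byte : Variant::Word, Src, Dst};
}
```

A post-byte of 0x89 ("A,B") comes back as `Variant::Byte`, 0x01 ("D,X") as `Variant::Word`, and 0x80 ("A,D") as `Variant::Invalid`, so the undefined mixed-width encodings never reach either instruction definition.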

> I don't have a solution though. I suspect ALLREG is fundamentally
> unsound as far as LLVM is concerned and this is just the tip of the
> iceberg.

Darn :-).

Thanks!

M