Pattern matching questions

I was able to resolve my previous question about dealing with custom
loads/stores, and following Chris' suggestion, the IBM Cell SPU backend
can generate code for "int main(void) { return 0; }" without crashing
llc. There's a lot of work still to be done... like getting frame
offsets correctly computed and hauling in the raft of intrinsics that
the Cell SDK defines.

Three quick questions:

- How does one deal with multiple instruction sequences in a pattern?
  To load a constant is a two instruction sequence, but both
  instructions only take two operands (assume that r3 is a 32-bit
  register):

  ilhu $3, 45 # r3 = (45 << 16)
  iohl $3, 5 # r3 |= 5

  I tried:

  def : Pat<(i32 imm:$imm),
            (IOHL (ILHU (HI16 imm:$imm)), (LO16 imm:$imm))>;

- The return instruction for Cell SPU is "bi $lr". How do I jam that
  into the instruction info w/o tblgen bitching up a storm about the
  "$" or the extra "bi" operands?

- Immediates in a pattern: To move one register to another involves
  using the 3-operand OR instruction, but how do I encode an immediate
  w/o a type inference contradiction?

  def : Pat<(set R32C:$rDest, R32C:$rSrc),
            (ORIr32 R32C:$rSrc, 0)>;

Thanks for the clue!

- How does one deal with multiple instruction sequences in a pattern?
  To load a constant is a two instruction sequence, but both
  instructions only take two operands (assume that r3 is a 32-bit
  register):

  ilhu $3, 45 # r3 = (45 << 16)
  iohl $3, 5 # r3 |= 5

  I tried:

  def : Pat<(i32 imm:$imm),
            (IOHL (ILHU (HI16 imm:$imm)), (LO16 imm:$imm))>;

It is possible to write a multi-instruction pattern; see, e.g., X86InstrSSE.td line 1911. But how are you defining HI16 and LO16? It sounds like you want to define them as SDNodeXform nodes that return the upper and lower 16 bits, respectively. Take a look at PSxLDQ_imm in X86InstrSSE.td as an example.
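As a rough illustration of what such HI16/LO16 transforms would need to compute, here is a minimal standalone C++ sketch. The helper names hi16, lo16 and recombine are hypothetical and are not LLVM APIs; the sketch only checks the arithmetic of splitting a 32-bit immediate into two halfwords, using the 45/5 example from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not LLVM code): the values that HI16 and LO16
// transforms would hand to the two instructions.
constexpr uint16_t hi16(uint32_t imm) {
  return static_cast<uint16_t>(imm >> 16);      // upper halfword
}

constexpr uint16_t lo16(uint32_t imm) {
  return static_cast<uint16_t>(imm & 0xFFFFu);  // lower halfword
}

// Put the two halves back together: shift the high half up, OR in the low.
constexpr uint32_t recombine(uint16_t hi, uint16_t lo) {
  return (static_cast<uint32_t>(hi) << 16) | lo;
}

// The constant from the question: (45 << 16) | 5.
static_assert(hi16(0x002D0005u) == 45, "upper halfword");
static_assert(lo16(0x002D0005u) == 5, "lower halfword");
static_assert(recombine(hi16(0xDEADBEEFu), lo16(0xDEADBEEFu)) == 0xDEADBEEFu,
              "splitting and recombining must round-trip");
```

In a real backend the two bodies would live inside the transform definitions in the .td file rather than as free functions.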

- The return instruction for Cell SPU is "bi $lr". How do I jam that
  into the instruction info w/o tblgen bitching up a storm about the
  "$" or the extra "bi" operands?

I am not sure. Does "bi \$lr" work? Or "bi $$lr"? Or even something like
!strconcat("bi ", !strconcat("$", "lr")).

- Immediates in a pattern: To move one register to another involves
  using the 3-operand OR instruction, but how do I encode an immediate
  w/o a type inference contradiction?

  def : Pat<(set R32C:$rDest, R32C:$rSrc),
            (ORIr32 R32C:$rSrc, 0)>;

I am not sure what you mean. By 3-operand, do you mean two source operands and one destination? I don't think the error you are seeing has anything to do with the immediate. For a def : Pat pattern, you don't need to specify the "set R32C:$rDest" portion.

Evan

- The return instruction for Cell SPU is "bi $lr". How do I jam that
  into the instruction info w/o tblgen bitching up a storm about the
  "$" or the extra "bi" operands?

I am not sure. Does "bi \$lr" work? Or "bi $$lr"? Or even something
like
!strconcat("bi ", !strconcat("$", "lr")).

I'll give the strconcat a try. Hadn't thought of that...

- Immediates in a pattern: To move one register to another involves
  using the 3-operand OR instruction, but how do I encode an immediate
  w/o a type inference contradiction?

  def : Pat<(set R32C:$rDest, R32C:$rSrc),
            (ORIr32 R32C:$rSrc, 0)>;

I am not sure what you mean. By 3-operand, do you mean two source
operands and one destination? I don't think the error you are seeing
has anything to do with the immediate. For a def : Pat pattern, you
don't need to specify the "set R32C:$rDest" portion.

The PPC "OR" instruction has three operands (ok, in llvm-speak, it's probably two operands and one result). The SPU's ORI has two operands, one result. I was just thinking that I could get a reg-to-reg move encoded using this instruction, but couldn't because the immediate, 0, causes a type inference contradiction.

Q: Why don't I need the R32C:$rDest?

- How does one deal with multiple instruction sequences in a pattern?
  To load a constant is a two instruction sequence, but both
  instructions only take two operands (assume that r3 is a 32-bit
  register):

  ilhu $3, 45 # r3 = (45 << 16)
  iohl $3, 5 # r3 |= 5

  I tried:

  def : Pat<(i32 imm:$imm),
            (IOHL (ILHU (HI16 imm:$imm)), (LO16 imm:$imm))>;

It is possible to write a multi-instruction pattern; see, e.g.,
X86InstrSSE.td line 1911. But how are you defining HI16 and LO16?
It sounds like you want to define them as SDNodeXform nodes that return
the upper and lower 16 bits, respectively. Take a look at PSxLDQ_imm in
X86InstrSSE.td as an example.

Another good example is the PPC backend, which has the exact same issue for integer constants.

- The return instruction for Cell SPU is "bi $lr". How do I jam that
  into the instruction info w/o tblgen bitching up a storm about the
  "$" or the extra "bi" operands?

I am not sure. Does "bi \$lr" work? Or "bi $$lr"? Or even something
like
!strconcat("bi ", !strconcat("$", "lr")).

Yep, $$ should work.
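To make the escaping rule concrete, here is a toy C++ model of assembly-string expansion. It assumes, as stated above, that "$$" stands for a literal dollar sign and that "$name" refers to a named operand. The function expandAsm is hypothetical and is not tblgen's implementation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Toy model only: expand "$name" from the operand map and turn "$$" into "$".
inline std::string expandAsm(const std::string &tmpl,
                             const std::map<std::string, std::string> &ops) {
  std::string out;
  for (std::size_t i = 0; i < tmpl.size(); ++i) {
    if (tmpl[i] != '$') {            // ordinary character: copy through
      out += tmpl[i];
      continue;
    }
    if (i + 1 < tmpl.size() && tmpl[i + 1] == '$') {
      out += '$';                    // "$$" is an escaped, literal '$'
      ++i;
      continue;
    }
    // Otherwise collect the operand name that follows the '$'.
    std::size_t j = i + 1;
    while (j < tmpl.size() &&
           (std::isalnum(static_cast<unsigned char>(tmpl[j])) || tmpl[j] == '_'))
      ++j;
    const std::string name = tmpl.substr(i + 1, j - i - 1);
    const auto it = ops.find(name);
    // Unknown names are left untouched in this toy model.
    out += (it != ops.end()) ? it->second : "$" + name;
    i = j - 1;
  }
  return out;
}
```

With the operands rT = "$3" and val = "45", the template "ilhu $rT, $val" expands to "ilhu $3, 45", while "bi $$lr" expands to "bi $lr" with no operand lookup at all.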

- Immediates in a pattern: To move one register to another involves
  using the 3-operand OR instruction, but how do I encode an immediate
  w/o a type inference contradiction?

  def : Pat<(set R32C:$rDest, R32C:$rSrc),
            (ORIr32 R32C:$rSrc, 0)>;

You currently cannot specify move patterns in the .td file. You specify them with XXXRegisterInfo::copyRegToReg and XXXInstrInfo::isMoveInstr. See the PPC or Sparc backend for some simple examples.
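The division of labour between those two hooks can be sketched in standalone C++. Everything below (Opcode, Inst, makeCopy, isMove) is a hypothetical, simplified stand-in rather than LLVM's real classes or signatures. It assumes the SPU copy is encoded as "ori dst, src, 0", as proposed in the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified instruction record; LLVM's MachineInstr is richer.
enum class Opcode { ORI, ILHU, IOHL };

struct Inst {
  Opcode op;
  unsigned dst;  // destination register number
  unsigned src;  // source register number
  int32_t imm;   // immediate operand
};

// Role played by copyRegToReg: emit "dst = src" as "ori dst, src, 0".
inline Inst makeCopy(unsigned dst, unsigned src) {
  return Inst{Opcode::ORI, dst, src, 0};
}

// Role played by isMoveInstr: recognise that encoding again and report
// which registers are involved. Returns false for anything else.
inline bool isMove(const Inst &inst, unsigned &src, unsigned &dst) {
  if (inst.op != Opcode::ORI || inst.imm != 0)
    return false;
  src = inst.src;
  dst = inst.dst;
  return true;
}
```

The point of the pairing is that the code generator creates copies through one hook and later recognises them through the other, so no .td pattern for the move is involved.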

-Chris

Chris Lattner wrote:

It is possible to write a multi-instruction pattern; see, e.g.,
X86InstrSSE.td line 1911. But how are you defining HI16 and LO16?
It sounds like you want to define them as SDNodeXform nodes that return
the upper and lower 16 bits, respectively. Take a look at PSxLDQ_imm in
X86InstrSSE.td as an example.

Another good example is the PPC backend, which has the exact same issue
for integer constants.

Actually, for SPU, not quite the same:

def ILHU : RI16Form<0b010000010, (ops GPRC:$rT, u16imm:$val),
                    "ilhu $rT, $val", LoadNOP,
                    [(set GPRC:$rT, immZExt16:$val)]>;

def IOHL : RI16Form<0b100000110, (ops GPRC:$rT, u16imm:$val),
                    "iohl $rT, $val", LoadNOP,
                    [(set GPRC:$rT, immZExt16:$val)]>;

Thus, you can't really do as the PPC does, viz:

    (ORI (LIS (HI16 imm:$imm)), (LO16 imm:$imm))

vs.

    (IOHL (ILHU (HI16 imm:$imm)), (LO16 imm:$imm))

because there's only one operand to IOHL. PPC ORI is a two operand, one
result instruction. (I'm sure I'm bashing the vernacular badly.) My
question is how to link IOHL and ILHU together. Sequentially.

- The return instruction for Cell SPU is "bi $lr". How do I jam that
into the instruction info w/o tblgen bitching up a storm about the
"$" or the extra "bi" operands?

I am not sure. Does "bi \$lr" work? Or "bi $$lr"? Or even something
like
!strconcat("bi ", !strconcat("$", "lr")).

Yep, $$ should work.

It doesn't. Here's the pattern:

let isTerminator = 1, isBarrier = 1, noResults = 1 in {
  let isReturn = 1 in {
    def RET: BRForm<0b00010101100, (ops),
                        "bi $$lr",
                        BranchResolv,
                        [(retflag)]>;
  }
}

Output from make:

llvm[0]: Building SPU.td code emitter with tblgen
tblgen: /work/scottm/llvm/utils/TableGen/CodeGenInstruction.h:118:
std::pair<unsigned int, unsigned int>
llvm::CodeGenInstruction::getSubOperandNumber(unsigned int) const:
Assertion `i < OperandList.size() && "Invalid flat operand #"' failed.
make: ***
[/work/scottm/llvm/obj/i686-unknown-linux-gnu/lib/Target/IBMCellSPU/Debug/SPUGenCodeEmitter.inc.tmp]
Aborted

Whiskey Tango... Foxtrot?

I am guessing IOHL modifies the lower 16 bits of the register while preserving the upper 16 bits? If so, you want to make it a two-address opcode using operand constraints. Something like
def IOHL : RI16Form<0b100000110, (ops GPRC:$rT, GPRC:$rS, u16imm:$val),
   "iohl $rT, $val", "$rS = $rT" ...

Then you can use the result of ILHU as a source operand of IOHL.
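The reason the tied operand makes the nested pattern expressible can be seen in a small C++ model of the two instructions. The functions ilhu and iohl below are toy stand-ins, not LLVM code. They follow the semantics given in the original question (r3 = 45 << 16, then r3 |= 5).

```cpp
#include <cassert>
#include <cstdint>

// Toy model: ilhu produces a fresh value with the immediate in the upper
// halfword and zeros in the lower halfword. It has no register input.
constexpr uint32_t ilhu(uint16_t val) {
  return static_cast<uint32_t>(val) << 16;
}

// Toy model: iohl ORs the immediate into the register's current value.
// The old value (rS) is therefore a real input, tied to the result.
constexpr uint32_t iohl(uint32_t rS, uint16_t val) {
  return rS | val;
}

// Mirrors (IOHL (ILHU hi), lo) from the thread with hi = 45, lo = 5.
static_assert(iohl(ilhu(45), 5) == 0x002D0005u,
              "ilhu followed by iohl builds the full 32-bit constant");
```

Once iohl is modelled with its old register value as an explicit input, the composition iohl(ilhu(hi), lo) has the same shape as the PPC (ORI (LIS ...), ...) form.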

Please file a bug with a reduced test case for it.

Evan

This is clearly a bug in tblgen.

I attempted to add $$lr to other instruction patterns in existing targets, but couldn't reproduce it. Evan is right, we need a bug report or some way to reproduce it.

My guess is that you haven't specified all bits for the encoding of the RET instruction. If you disable generation of the code emitter for your target (i.e. remove SPUGenCodeEmitter.inc from your makefile) it should work around this.

-Chris