[TableGen] Define single instruction allowing Reg/Imm operands or split to multiple instructions instead?

Hi,

I have some questions with regard to instruction definition.

Preamble:

  1. My target allows immediate values for common arithmetic instructions, fma, fadd, … etc.
  2. I don’t need llvm’s assembler and disassembler components
    => I don’t need encode info in my InstrFormat.td and InstrInfo.td.

My first question is should I define an instruction which allows Register/Immediate operands, or should I define many instructions and each maps to an “operands-combination”.

For example:
Single instruction pattern

// Instruction definition
def f32Op : Operand<f32>;
def FMA : Inst<"fma", 
  (outs Dst32:$rd),
  (ins f32Op:$rn, f32Op:$rm, f32Op:$ra)>;

// Pattern definition
def F32ImmOp : Operand<f32>, FPImmLeaf<f32, [{ return true; }]>;
// rii
def : Pat<(any_fma Src32:$rn, F32ImmOp:$imm0, F32ImmOp:$imm1), 
          (FMA     Src32:$rn, F32ImmOp:$imm0, F32ImmOp:$imm1)>;
// rri
def : Pat<(any_fma Src32:$rn, Src32:$rm, F32ImmOp:$imm0), 
          (FMA     Src32:$rn, Src32:$rm, F32ImmOp:$imm0)>;

// rir, irr, iir, iri, iir, rrr. etc.

Many instructions pattern

def FMArii : Inst<"fma_rii", (outs Dst32:$rd), (ins Src32:$rn, F32ImmOp:$imm0, F32ImmOp:$imm1)>;
def FMArri : Inst<"fma_rri", (outs Dst32:$rd), (ins Src32:$rn, Src32:$rm, F32ImmOp:$imm0)>;
... rir, irr, iir, iri, iir, rrr. etc.

// rii
def : Pat<(any_fma Src32:$rn, F32ImmOp:$imm0, F32ImmOp:$imm1), 
          (FMArii  Src32:$rn, F32ImmOp:$imm0, F32ImmOp:$imm1)>;
// rri
....

Currently I prefer Single instruction pattern. Because I don’t want to create too many instruction variants, i.e. FMArii, FMArri, FMArir, …

Any comments on this?

My second question is, when defining outs and ins dags, it looks like the operand type doesn’t really matter, for example:

def f32Op : Operand<f32>;
def FMA : Inst<"fma", 
  (outs Dst32:$rd),
  (ins f32Op:$rn, f32Op:$rm, f32Op:$ra)>;

// or:
def FMA : Inst<"fma", 
  (outs Dst32:$rd),
  (ins Src32:$rn, Src32:$rm, Src32:$ra)>;
// Src32 is a specific register class.

I found Operand Type (DAGOperand or SDPatternOperator) is only matter in pattern definition (I am using GISel), i.e. I can use any operand in instruction definition and it has no any effect on pattern match. Is it because I don’t use llvm’s assembler/disassembler so I don’t really need to pay attention on operands when defining an instruction?

Thanks!
CY

For your first question, it’s a lot more common to separate them into two instructions. It’s easier to do so under the current pattern matching and TableGen infrastructure design.

You can make good uses of multiclass, foreach, string concatenation etc. to factor out common parts and have a clean TG codebase.

1 Like

HI!

I would also argue that defining the instructions in a clean way (your “many instructions pattern” approach) is the right way.

  • An LLVM backend is a multi-year effort. Using the “single instruction pattern” approach means that you have to rewrite the whole backend if your toolchain requirements ever changes.

  • Did you check the instructions flags? For example if you have a move instruction which supports both register and immediate variants, how do you set the and isMoveReg and isMoveImm flags with the “single instruction pattern” approach? Another example is isReMaterializable: an instruction with an immedate operand may be trivially rematerializable while the same instruction with a register is not. These flags are used by passes after instruction selection.

  • Assuming that the size of immediates is limited (e.g. 32 bit) you do not save too much coding. In your instruction selector you will often end up with code like

  if (isInt<32>(...) {
    MI = BuildMI(...)
             .addImm(...);
  } else {
    MI = BuildMI(...)
             .addReg(...);
  }

simply because the operands of a MachineInstr needs to be categorized (virtual register, physical register, immediate, etc.) Using the correct instruction opcode takes no effort in this case.

Regards,
Kai

1 Like

I consider the multiple instructions to be an old style used by x86 and many targets and I wouldn’t advise copying it. AMDGPU defines a family of custom operand types with more specific register and immediate constraints the verifier can enforce, rather than trying to define N variants for every combination of operands. I would never choose to switch to the combinatorial explosion way of handling this.

2 Likes

Hey @mshockwave , @redstar , @arsenm ,

I do really appreciate your feedback!

Hi Min-Yih,
Thanks for the hint :slight_smile: Yes we do also implement with these techniques in pattern matching, for example:

  • // Matches: (consumer (producer Op0Dag:$rn, Op1Dag:$rm), Op2Dag:$ra)
    multiclass ThreeOperandsFusePat<SDPatternOperator consumer,
                                    SDPatternOperator producer, Instruction inst,
                                    SDPatternOperator immOp = F32ImmOp> {
      foreach Op0Dag = [Src32, immOp] in {
        foreach Op1Dag = [Src32, immOp] in {
          foreach Op2Dag = [Src32, immOp] in {
            def : Pat<(consumer (producer Op0Dag:$rn, Op1Dag:$rm), Op2Dag:$ra),
                      (inst Op0Dag:$rn, Op1Dag:$rm, Op2Dag:$ra)>;
          }
        }
      }
    }
    
    and
    multiclass BinOpPat<SDPatternOperator op, string inst> {
      def : Pat<(op i32:$rn, I32ImmOp:$imm),
                (!cast<Instruction>(inst # i) Src32:$rn, I32ImmOp:$imm)>;
      ...
    

Hey redstar: Wow… thanks again :slight_smile: I guess I see the reason why single instruction might be a problem! Each instruction is represented by MCInstrDesc and it has Flag info. Then it will be very embarrassing when the llvm pipelines after GlobalISel/SelectionDAGISel try to ask:

  bool isMoveReg() const { return Flags & (1ULL << MCID::MoveReg); }
  bool isMoveImmediate() const { return Flags & (1ULL << MCID::MoveImm); }
  bool isRegSequenceLike() const { return Flags & (1ULL << MCID::RegSequence); }
  bool isRematerializable() const {
    return Flags & (1ULL << MCID::Rematerializable);
  }
  ...

But to be honest, I don’t know if my target needs these flags, but I can see at least isRematerializable is an important info in “Register Allocation”, “MachineLICM”, and if I go with Single instruction pattern, I will need to overload “TargetInstrInfo::isReallyTriviallyReMaterializable”.

One of the motivation from my side is, I have several places need to do:

case FMA:
  return doFmaStuff(...);

So if I have many variants, I will need to say: “oh… FMArrr, FMArri, FMArir, … are referring to the same doFmaStuff.” I know we can still apply some coding techniques to make finger’s life easier :stuck_out_tongue:

Hey arsenm,
Interesting! I am also reading AMGDPU’s implementation, could you point me out some reference code with regard to:

“a family of custom operand types with more specific register and immediate constraints the verifier can enforce”

From my point of view, I feel the old style is because of they also fill encoding info when defining instructions, but in my case, I don’t really need that.

Thanks,
CY

The instruction flags are orthogonal to the operand definitions. Things like isMoveImmediate work perfectly well on instructions that end up using a register operand. The use queries can check the operand type anyway.

The encoding also works fine with all-in-one.

Interesting! I am also reading AMGDPU’s implementation, could you point me out some reference code with regard to:

“a family of custom operand types with more specific register and immediate constraints the verifier can enforce”

SIRegOperands are mostly register operands that also accept different types of immediates. We have a custom set of checks in verifyInstruction for valid immediates

1 Like

== Update after some more experiments ==
(1) Single instruction pattern v.s. Many instructions pattern
I see a restriction with single instruction pattern, for example, I try to create a single store definition for different store value sizes:

def TOY_STORE : ToyInst<"toystore", [], (outs),
                    (ins Src64:$ptr, Src32:$offset, SrcAny:$val)>;

def p1 : PtrValueType<i64, 1>;

def : Pat<(store i32:$val, p1:$ptr),
            (TOY_STORE Src64:$ptr, (i32 0), Src32:$val)>;

def : Pat<(store v4i32:$val, p1:$ptr),
           (TOY_STORE Src64:$ptr, (i32 0), Src128:$val)>;

where the register class definition:

def Src32 : ToyRegClass<32, [i32, f32], (add ...)>;
def Src128 : ToyRegClass<128, [i128], (add ...)>;
def SrcAny : ToyRegClass<128, [i128, i32, f32], (add ...)>;

Then I got the error:

error: constant expression evaluates to 4294347258 which cannot be narrowed to type 'uint8_t' (aka 'unsigned char') [-Wc++11-narrowing]

Even though in ToyGenGlobalISel.inc, I can see tablegen creating these patterns:

    // Label 4: @521
    GIM_SwitchType, /*MI*/0, /*Op*/0, /*[*/1, 4, /*)*//*default:*//*Label 24*/ 784,
    /*GILLT_s32*//*Label 22*/ 530, 0,
    /*GILLT_s128*//*Label 23*/ 657,
    // Label 22: @530
...

But it is still fine to have immediate operands and register operands to share the same TOY_STORE, as long as the data type matches.

So I back to use traditional way and define many STOREs for different data sizes.



(2) when defining outs and ins dags, it looks like the operand type doesn’t really matter

Sorry I found the operand type matters. At that time I didn’t see the change is because of my register classes also have data type info, and tablegen will infer it when generating patterns.

ValueType allow us to create a pattern for a specific ValueType, for example:

  // lib/Target/AArch64/AArch64InstrInfo.td
  // VecTy = v2i32, v2f32, v4i16, v8i8, v4f16, v4bf16, ...
  def : Pat<(store (VecTy FPR:$Rt),
                   (ro.Wpat GPR64sp:$Rn, GPR32:$Rm, ro.Wext:$extend)),
            (STRW FPR:$Rt, GPR64sp:$Rn, GPR32:$Rm, ro.Wext:$extend)>;