TableGen Syntax while defining instructions in .td files

Hi, I am trying to write a “toy” LLVM backend that supports a subset of RISC-V instructions in order to learn LLVM backend development.

1. While defining records for instructions, I have seen people use a $ sign with register names in the string field as well as in the list field. What does the $ sign specify? I have attached code below for reference:

def ADD : RType<0b000,0b0000000,(outs GRRegs:$rd),(ins GRRegs:$rs1,GRRegs:$rs2),
    "add\t$rd, $rs1, $rs2",
    [(set GRRegs:$rd, (add GRRegs:$rs1, GRRegs:$rs2))]
    >;
    

2. Also, why are the dag-type parameters always named ‘ins’ and ‘outs’, as in the following example? Can I rename them to something else?

class RType<bits<3> funct3, bits<7> funct7, dag outs, dag ins, string asmstr, list<dag> pattern>:InstToy<outs,ins,asmstr,pattern>{

    bits<7> opcode = 0b0110011;
    bits<5> rd;
    bits<5> rs1;
    bits<5> rs2;
   

    let Inst{6-0} = opcode;
    let Inst{11-7} = rd;
    let Inst{14-12} = funct3;
    let Inst{19-15} = rs1;
    let Inst{24-20} = rs2;
    let Inst{31-25} = funct7;
}

3. When passing the dag-type arguments, why is the parameter name also written with the value being passed? Can I write code like this:

def ADD : RType<0b000,0b0000000,(GRRegs:$rd),(GRRegs:$rs1,GRRegs:$rs2),
    "add\t$rd, $rs1, $rs2",
    [(set GRRegs:$rd, (add GRRegs:$rs1, GRRegs:$rs2))]
    >;

instead of this:

def ADD : RType<0b000,0b0000000,(outs GRRegs:$rd),(ins GRRegs:$rs1,GRRegs:$rs2),
    "add\t$rd, $rs1, $rs2",
    [(set GRRegs:$rd, (add GRRegs:$rs1, GRRegs:$rs2))]
    >;

1. The dollar sign names, or tags, the argument. It is explained briefly in the TableGen Programmer’s Reference:
https://llvm.org/docs/TableGen/ProgRef.html#directed-acyclic-graphs-dags
It lets you refer to the same argument in different places; for example, $rd appears both in the pattern and in the outs list.

2. No, you can’t. First, “outs” is a standard definition in llvm/include/llvm/Target/Target.td. Second, a TD file works together with the TableGen backends. Although “outs” and “ins” are defined as TableGen defs in a TD file, their behaviour comes from the particular backend being used in a TableGen run. If you search the C++ files in llvm/utils/TableGen, you will find the quoted strings “ins” and “outs” being compared against. So “ins” and “outs” have hard-coded behaviour.

3. If you mean, can you leave out the initial identifier after the left parenthesis: no, you can’t. That is the operator of the dag (tree), while GRRegs etc. are its arguments.
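
To make the operator/argument split concrete, here is a minimal sketch that should parse with plain llvm-tblgen and no includes. The "def outs;", "def ins;" and RegClass lines are stand-ins I made up so the snippet is self-contained; in a real backend the first two come from Target.td and GRRegs would be a RegisterClass. DagAnatomy is likewise a hypothetical record name.

```tablegen
// Stand-ins (assumptions) so this file parses on its own.
def outs;
def ins;
class RegClass { }        // hypothetical placeholder for RegisterClass
def GRRegs : RegClass;

def DagAnatomy {
  // Operator: outs. One argument: value GRRegs, tagged with the name $rd.
  dag Out = (outs GRRegs:$rd);

  // Operator: ins. Two arguments, tagged $rs1 and $rs2.
  dag In = (ins GRRegs:$rs1, GRRegs:$rs2);

  // The asm string refers to the same operands by their $ names.
  string Asm = "add\t$rd, $rs1, $rs2";
}
```

Writing (GRRegs:$rd) instead would make GRRegs the operator of a dag with no arguments, which is not what the instruction definition expects.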

When using TableGen, you are dealing with (1) the TableGen language, its keywords and data types; (2) standard definitions in TD files, which are always included in practice; and (3) behaviour built into the backends in C++.

I don’t believe the answer to the second question is correct. By convention the parameters tend to be called ins and outs to indicate that they hold the ins and outs DAG nodes, but they are just variable names here. As long as you assign them to InOperandList and OutOperandList, it doesn’t matter what you call them.
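
A rough sketch of what I mean, with the class parameters renamed to inputs and outputs. The Instruction class below is a cut-down stand-in I wrote so the snippet parses without Target.td (the real one has many more fields), and RegClass and the empty pattern list are placeholders too. InstToy is the asker's base class name, but its body here is my guess at what it looks like.

```tablegen
// Stand-ins (assumptions) so this file parses without Target.td.
def outs;
def ins;
class RegClass { }        // hypothetical placeholder for RegisterClass
def GRRegs : RegClass;
class Instruction {       // simplified stand-in for the real Instruction class
  dag OutOperandList;
  dag InOperandList;
  string AsmString = "";
  list<dag> Pattern;
}

// The parameter names are ordinary identifiers: "outputs" and "inputs"
// work just as well as "outs" and "ins".
class InstToy<dag outputs, dag inputs, string asmstr, list<dag> pattern>
    : Instruction {
  let OutOperandList = outputs;
  let InOperandList = inputs;
  let AsmString = asmstr;
  let Pattern = pattern;
}

// The dag operators at the use site are still the outs and ins defs.
def ADD : InstToy<(outs GRRegs:$rd), (ins GRRegs:$rs1, GRRegs:$rs2),
                  "add\t$rd, $rs1, $rs2", []>;
```

So only the template parameter names are free to change; the operators inside the dags passed in stay as outs and ins.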

Sorry, I was thinking of the (ins …) dag. Yes, I’m sure you are right about the “dag ins” parameters.