A question about understanding LLVM instruction selection

I chose a simple backend, llvm-leg (GitHub - frasercrmck/llvm-leg), to understand code generation in the backend, and I'm trying to understand instruction selection. So I looked into the LEGInstrInfo.td file. Each instruction definition includes dag ins, dag outs, string asmstr, and list pattern (which I can't understand). I also know that the IR is translated into a selection DAG (a data structure) by lowering.

The key thing I want to understand is the process of instruction selection. I know TableGen produces a MatcherTable (a kind of byte code) from LEGInstrInfo.td, and that an interpreter then walks this table to perform instruction selection. So first, I need to understand all of the information used by TableGen; second, I need to look into the -gen-dag-isel backend to find out how TableGen uses this information.

So, my questions:

  1. All of these structures are described in the TargetSelectionDAG.td file. I can understand:
  • SDTypeConstraint – a constraint on the type of a value
  • SDTypeProfile – the type constraints for a selection node (its ins and outs)
  • SDNodeProperty – a property of a selection node, which actually describes the operator (commutative, associative, …)
  • SDNode – includes the opcode name, the type constraints, the operator properties, …

I can't understand:

  • SDNodeXForm, CodePatPred, PatFrag (which uses CodePatPred), PatLeaf, Pattern, Pat, ComplexPattern

So my question is: is there any way for me to understand what these class abstractions are trying to express, other than just reading the source code and comments?

  2. The same question: is there an easy way for me to understand how TableGen uses this information?

Hi!

  • SDNodeXForm: You use this when you need to transform a node. A typical example is an instruction which loads an immediate into the upper half of a register. The predicate needs to check that the immediate is in the expected range, but in the instruction encoding that immediate value is shifted right to save space. To do that you use an SDNodeXForm.
  • PatFrags: This is a nice way to match one of several patterns. For example, you may want to match add + shl or add + mul because both patterns translate to the same register + offset address. Instead of defining 2 patterns, you create a PatFrag with both patterns, and use that in the matching pattern.
  • PatFrag: This is a specialized version of PatFrags which matches only one pattern.
  • PatLeaf: A specialized version of PatFrag with no operands. This is mainly used to match constants.
  • ComplexPattern: You may need to match a pattern which you cannot express as a pattern in TableGen. In these cases you use a ComplexPattern, which you need to implement in C++. An example is if you need to match a sequence of n add instructions, but the number n may vary. It's often used to match addresses.
  • Pat: This defines a pattern on the DAG and the replacement instructions.
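
To make the SDNodeXForm and ComplexPattern items a bit more concrete, here is a small self-contained C++ sketch. It does not use LLVM at all: `Node`, `isUpperHalfImm`, `encodeUpperHalf`, `Address`, and `selectAddr` are hypothetical names invented for illustration, and the 16-bit upper-half encoding is only an assumed example. The first two functions mimic a predicate plus a transform; the last one mimics the kind of C++ hook a ComplexPattern calls when the number of adds in an address can vary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a selection DAG node. This is NOT llvm::SDNode.
enum class Op { Constant, Register, Add };

struct Node {
  Op op;
  int64_t value;                      // constant value, or register number
  std::vector<const Node*> operands;  // children (used by Add)
};

// Predicate (the role a PatLeaf/PatFrag predicate plays): accept only
// 32-bit constants whose low 16 bits are zero, i.e. values that an assumed
// "load upper half" instruction could materialize.
bool isUpperHalfImm(const Node &n) {
  return n.op == Op::Constant && n.value >= 0 &&
         n.value <= 0xFFFFFFFFLL && (n.value & 0xFFFF) == 0;
}

// Transform (the role an SDNodeXForm plays): the instruction stores the
// immediate shifted right by 16, so convert the matched value accordingly.
uint16_t encodeUpperHalf(const Node &n) {
  assert(isUpperHalfImm(n) && "transform is only valid after the predicate");
  return static_cast<uint16_t>(static_cast<uint64_t>(n.value) >> 16);
}

// Result of the address matcher: a base register plus a folded offset.
struct Address {
  int64_t baseReg;
  int64_t offset;
};

// ComplexPattern-style matcher: accepts reg, (add reg, c1),
// (add (add reg, c1), c2), ... for any number of nested adds, which a
// fixed TableGen pattern cannot express.
bool selectAddr(const Node &n, Address &out) {
  const Node *cur = &n;
  int64_t offset = 0;
  // Peel off (add x, constant) layers and accumulate the constants.
  while (cur->op == Op::Add && cur->operands.size() == 2 &&
         cur->operands[1]->op == Op::Constant) {
    offset += cur->operands[1]->value;
    cur = cur->operands[0];
  }
  if (cur->op != Op::Register)
    return false;  // no match; the selector would try other patterns
  out = {cur->value, offset};
  return true;
}
```

In a real backend the predicate and the transform are written as C++ fragments inside the .td file, and the ComplexPattern names a member function of the target's DAG-to-DAG instruction selector, but the split of responsibilities is the same as in this sketch.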

The best source for all of this stuff is to read the source. The TableGen files are heavily commented, which helps a lot.
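
As a rough mental model of how the generated MatcherTable is consumed, the following toy interpreter may help. It is only a sketch under stated assumptions: the byte codes `OP_CheckKind`, `OP_MoveChild`, `OP_MoveParent`, and `OP_Emit` are made up for this example and are merely loosely modeled on the idea of the real table, which has many more opcodes and is interpreted by `SelectionDAGISel::SelectCodeCommon`. The instruction id 7 used in the usage note is likewise hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy node kinds and DAG node; these are NOT LLVM's types.
enum NodeKind : uint8_t { K_Reg, K_Imm, K_Add };

struct Node {
  NodeKind kind;
  std::vector<const Node*> children;
};

// Hypothetical byte codes for the toy matcher table.
enum MatcherOp : uint8_t {
  OP_CheckKind,   // operand: NodeKind; fail unless the current node has it
  OP_MoveChild,   // operand: child index; descend into that child
  OP_MoveParent,  // no operand; go back up to the parent
  OP_Emit,        // operand: instruction id; the pattern matched
};

constexpr int NoMatch = -1;

// Walks the table over the DAG rooted at `root`. Returns the instruction id
// from OP_Emit on success, or NoMatch if a check fails or the table is
// malformed.
int runMatcher(const std::vector<uint8_t> &table, const Node &root) {
  std::vector<const Node*> stack{&root};  // path from root to current node
  std::size_t pc = 0;
  while (pc < table.size()) {
    const Node *cur = stack.back();
    uint8_t op = table[pc++];
    switch (op) {
    case OP_CheckKind:
      if (pc >= table.size() || cur->kind != table[pc++])
        return NoMatch;
      break;
    case OP_MoveChild: {
      if (pc >= table.size())
        return NoMatch;
      std::size_t idx = table[pc++];
      if (idx >= cur->children.size())
        return NoMatch;
      stack.push_back(cur->children[idx]);
      break;
    }
    case OP_MoveParent:
      if (stack.size() <= 1)
        return NoMatch;  // malformed table: nothing to return to
      stack.pop_back();
      break;
    case OP_Emit:
      return pc < table.size() ? table[pc] : NoMatch;
    default:
      return NoMatch;  // unknown byte code
    }
  }
  return NoMatch;  // ran off the end without emitting
}
```

A pattern such as (add reg, imm) would then be encoded as the byte sequence `OP_CheckKind, K_Add, OP_MoveChild, 0, OP_CheckKind, K_Reg, OP_MoveParent, OP_MoveChild, 1, OP_CheckKind, K_Imm, OP_MoveParent, OP_Emit, 7`. The job of the -gen-dag-isel backend is, very roughly, to compile the patterns in the .td files into one such table for the whole target.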

Kind regards,
Kai

thank you bro!