Infer correct types from the pattern

I’m getting a

Could not infer all types in pattern!

error in my backend. It happens on the following instruction:

VGETITEM: (set GPR:{i32:f32}:$rD, (extractelt:{i32:f32} VR:{v4i32:v4f32}:$rA, GPR:i32:$rB)).

How do I make it use the appropriate types? In other words, if the result is f32 then use v4f32, and if it is i32 then use v4i32. I’m not sure where to even start.

Any help is appreciated.

You can use a cast, and force one type in the pattern, then use the other one in a Pat:

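// In the instruction's own pattern, force the integer vector type:
def VGETITEM:
   [(set GPR:$rD, (extractelt (v4i32 VR:$rA), GPR:$rB))]

// Map the floating-point vector case onto the same instruction:
def: Pat<(extractelt (v4f32 VR:$rA), GPR:$rB),
         (VGETITEM VR:$rA, GPR:$rB)>;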
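
If several instructions need the same treatment, the extra Pat could be wrapped in a small helper class. The following is only a sketch: the class name VExtractF32Pat is made up for illustration, and it assumes VR, GPR and VGETITEM are defined as in the question.

```
// Hypothetical helper (not part of LLVM): maps an extract from a v4f32
// vector onto an instruction whose own pattern was written for v4i32.
class VExtractF32Pat<Instruction Inst>
  : Pat<(extractelt (v4f32 VR:$rA), GPR:$rB),
        (Inst VR:$rA, GPR:$rB)>;

// One line per instruction that should also accept the f32 vector type.
def : VExtractF32Pat<VGETITEM>;
```

The source pattern names only the inputs ($rA and $rB); the result type follows from the v4f32 annotation on the vector operand.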

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
by The Linux Foundation
_______________________________________________
LLVM Developers mailing list
llvm-dev@lists.llvm.org
llvm-dev Info Page

Krzysztof,

I'm curious, how do you know LLVM so well? Most of the time your answers
are exactly what I need. I was advised to read the code (as usual), but
that is challenging without knowing what the code is trying to express. IMHO
it is better to have a concept first and then express it in code. I've been
trying to find books, tutorials, etc., but there don't seem to be good
examples out there. Basically my questions are:

1. What is your advice on learning LLVM (and compiler design)?
2. Is there a way to do it quickly and efficiently, or will I just have to
suffer through several years of painstaking trial and error as well as my
own research on the topic?

Any help is appreciated.

That is kind of hard to answer satisfactorily. I had done compiler development for 8 years before moving on to LLVM, so the understanding of how compilers work was not a problem. The rest was essentially reading the code and writing my own. The beginnings are slow and painful, but the more information you absorb, the faster it becomes.

There are some general principles of compiler development. You start with a lot of high-level information about the program structure, and then the "granularity" increases: the level of detail in the representation goes up at the cost of losing the high-level information. For example, early on, loops and loop nests may be structured nicely, making them easy to optimize, but later some branches may get folded or optimized away and the CFG may no longer be so clear. So you perform loop nest optimizations before that happens. Then you run passes that are not concerned with the high-level structures, then passes that look into even more detail, and so on.

In the case of LLVM, first you have a bunch of passes that do target-independent things on the LLVM IR, then the influence of target-dependent information (like TTI) increases, then you have the selection DAG, then the DAG is legalized, then instructions are selected. After that you have MI with SSA, then register allocation begins and you have MI without SSA, then register allocation ends and you have physical registers. Then machine functions get prolog and epilog, then the instructions are lowered to the MC layer, and then that is printed (in text format, or encoded) into the output stream.

Each of these stages has certain properties, and the passes that run there utilize (and usually preserve) these properties. The actual details are basically only visible in the sources, but if you have a general idea of what is happening, these details will be fairly understandable.

The TableGen? That was painstaking trial and error. :)

-Krzysztof

Thanks!

What about a load instruction? I tried the same approach but I got an error.

error: In anonymous_570: Type inference contradiction found, merging
'{v4i32:v4f32}' into '{i32:f32}'

For some reason I had no issues doing the same for store.

Any help is really appreciated.

Here is what I tried for load:

// Addressing modes.
def ADDRri : ComplexPattern<i32, 2, "SelectAddr", [frameindex], []>;

// Address operands
def MEMri : Operand<i32> {
  let PrintMethod = "printMemOperand";
  let EncoderMethod = "getMemoryOpValue";
  let DecoderMethod = "DecodeMemoryValue";
  let MIOperandInfo = (ops GPR, i32imm);
}

class LOAD<bits<4> subop, string asmstring, list<dag> pattern>
  : InstLD<subop, (outs GPR:$rD), (ins MEMri:$src),
           !strconcat(asmstring, "\t$rD, $src"), pattern> {
  bits<5> rD;
  bits<21> src;

  let Inst{25-21} = rD;
  let Inst{20-0} = src;
}

class LOADi32<bits<4> subop, string asmstring, PatFrag opNode>
  : LOAD<subop, asmstring, [(set (i32 GPR:$rD), (opNode ADDRri:$src))]>;

let mayLoad = 1 in {
  let Itinerary = l_lwz in
    def LWZ : LOADi32<0x1, "l.lwz", load>;
}

class VLOADi32<bits<4> subop, string asmstring, PatFrag opNode>
  : VLOAD<subop, asmstring, [(set (v4i32 VR:$rD), (opNode ADDRri:$src))]>;

let mayLoad = 1 in {
  let Itinerary = v_lwz in
    def VLWZ : VLOADi32<0x2, "v.lwz", load>;
}

// Cast load of a floating point vector to use the same
// operation as a load of an integer vector.
def: Pat<(set (v4f32 VR:$rD), (load ADDRri:$src)),
         (VLWZ VR:$rD, ADDRri:$src)>;

What about load instruction? I tried the same approach but I got an error.

error: In anonymous_570: Type inference contradiction found, merging
'{v4i32:v4f32}' into '{i32:f32}'

Try changing this Pat:

// Cast load of a floating point vector to use the same
// operation as a load of an integer vector.
def: Pat<(set (v4f32 VR:$rD), (load ADDRri:$src)),
          (VLWZ VR:$rD, ADDRri:$src)>;

to

def: Pat<(v4f32 (load ADDRri:$src)),
          (VLWZ ADDRri:$src)>;

That is, remove the "set" from the input pattern and the output operand from the output pattern.

Generally, Pats only contain input operands.
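
This may also explain why the same approach raised no error for the store: a store produces no result, so its Pat has nothing to "set" and every operand is an input. A minimal sketch, assuming a vector store instruction named VSW (a hypothetical name, as is its operand order) alongside the VR and ADDRri definitions from the question:

```
// Sketch only: VSW is a hypothetical vector store instruction.
// The source pattern lists the stored value and the address, both inputs,
// and the result dag passes those same inputs to the instruction.
def : Pat<(store (v4f32 VR:$rS), ADDRri:$dst),
          (VSW VR:$rS, ADDRri:$dst)>;
```

For a load, the loaded value is the result of the node, so its type goes on the outside of the source pattern, as in (v4f32 (load ADDRri:$src)), and the destination register does not appear in the result dag.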

If you have an error message like the one above, one easy way to locate the problem is to look at the list of all records generated by table-gen:

$ llvm-tblgen -print-records -I /path/to/llvm/lib/Target/<target> -I /path/to/llvm/lib/Target -I /path/to/llvm/include /path/to/lib/Target/<target>/<target>.td -o output_file

(This is the same invocation of table-gen as for any other purpose, except for the "-print-records" option instead of the typical -gen-... options.)

The output_file will then contain all the records that came out of the .td files, for example:

def anonymous_1405 { // Pattern Pat T_CMP_pat
   dag PatternToMatch = (i1 (seteq (i32 IntRegs:$src1), s10ImmPred:$src2));
   list<dag> ResultInstrs = [(C2_cmpeqi IntRegs:$src1, s10ImmPred:$src2)];
   list<Predicate> Predicates = [];
   int AddedComplexity = 0;
   string NAME = ?;
}

You can look for "anonymous_570" in your case and see where it came from. The comment after "def anonymous..." is a list of classes from which this definition was inherited.

-Krzysztof

Has anybody told you that you are a genius? Your solution worked, and I really
appreciate the help. I will keep in mind the -print-records option for
llvm-tblgen.