Instruction selection confusion at register - chooses vector register instead of scalar one

     I have extended the BPF back end with vector registers (inspired by Mips MSA), something like this:
       def MSA128D: RegisterClass<"Connex", [v128i16], 32,
                            (sequence "Wh%u", 0, 31)>;
     I also added vector store and load instructions in the style of Mips MSA (look for "def ST_D", etc.).
     Note, however, that my vector unit has a separate memory space. This is why I defined the vector store like this:
       class ST_DESC_BASE<string instr_asm, SDPatternOperator OpNode,
                    ValueType TyNode, RegisterOperand ROWD,
                    Operand MemOpnd = uimm4_ptr, ImmLeaf Addr = immLeafAlex,
                    InstrItinClass itin = NoItinerary> {
       dag OutOperandList = (outs);
       dag InOperandList = (ins ROWD:$wd, MemOpnd:$addrdst);
       string AsmString = "LS[$addrdst] = $wd;";
       list<dag> Pattern = [(OpNode (TyNode ROWD:$wd), Addr:$addrdst)];
       InstrItinClass Itinerary = itin;
       string DecoderMethod = "DecodeMSA128Mem";
       }

      Also, BPF has its own scalar stores and loads (with the standard i64 registers), for example:
       class STOREi64<bits<2> Opc, string OpcodeStr, PatFrag OpNode>
         : STORE<Opc, OpcodeStr, [(OpNode i64:$src, ADDRri:$addr)]>;

     However, the spills and reloads of vector registers that are created automatically at basic-block boundaries use the scalar stores and loads, NOT the vector ones that I also defined. For example, I get this assembly code when compiling with my LLVM:
         std -512(r10), R(0)
         ; end of predecessor BB

         ; beginning of current BB
         ldd R(0), -512(r10)

     As we can see, STOREi64 normally takes an i64 scalar register, but here the v128i16 register R(0) is treated as if it were an i64 scalar one (r0-r31)...

     Could you please tell me if there is an easy way to fix this? I guess the problem is related to the fact that the vector unit has its own memory space, while LLVM normally spills registers to the stack. If so, can I specify a different spill region for the vector registers?

   Thank you,

Spills created at the end of the block (I assume you mean what fast regalloc does at -O0) are created long after instruction selection. In that case it sounds like your implementation of storeRegToStackSlot/loadRegFromStackSlot is broken.
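
To illustrate the kind of check those two hooks usually need, here is a standalone C++ sketch. It does not use the LLVM API and is not the actual BPF or Connex code: the RegClass and Opcode enums and both helper functions are hypothetical stand-ins. STD/LDD are named after the scalar instructions in the assembly above, ST_D after the vector store mentioned earlier, and LD_D is an assumed name for the matching vector load. The point is only that the spill and reload opcodes are chosen from the register class of the register being spilled, rather than always using the scalar ones.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for the target's register classes. In a real back
// end these would come from the TableGen-generated register info.
enum class RegClass { GPR, MSA128D };

// Hypothetical opcode names: STD/LDD model the scalar i64 store/load,
// ST_D/LD_D model the vector store/load (LD_D is an assumed name).
enum class Opcode { STD, LDD, ST_D, LD_D };

// Choose the store used for a spill from the class of the spilled register.
Opcode spillOpcodeFor(RegClass rc) {
  switch (rc) {
  case RegClass::GPR:
    return Opcode::STD;
  case RegClass::MSA128D:
    return Opcode::ST_D;
  }
  throw std::logic_error("unhandled register class in spill");
}

// Choose the load used for a reload from the class of the reloaded register.
Opcode reloadOpcodeFor(RegClass rc) {
  switch (rc) {
  case RegClass::GPR:
    return Opcode::LDD;
  case RegClass::MSA128D:
    return Opcode::LD_D;
  }
  throw std::logic_error("unhandled register class in reload");
}
```

A hook that unconditionally emits the scalar store, without a check like the one above, would produce exactly the "std -512(r10), R(0)" form shown in the first email.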


     Matt, thanks for pointing me to the storeRegToStackSlot and loadRegFromStackSlot methods of the [Target]InstrInfo class. It turned out that they are indeed responsible for the malformed instructions discussed in the first email.

     I have one more question: can I optimize inter-block register allocation by avoiding the spill (and the associated reload) at the end of a basic block with only one successor? More precisely, I have vector.body.preheader followed by vector.body (vector.body has as successors itself and an exit block, which is empty).
     Can I tell LLVM to do register allocation such that it avoids spilling at the end of vector.body.preheader and avoids the corresponding loads at the beginning of vector.body?

   Thank you,