Order of generated code in *GenFastISel.inc

I have been looking at adding support for the Power VSX instructions to fastisel and I noticed something that appears to be a general issue.

Most of what I need is already being generated from the instruction description table files but is coming out in the wrong order. For example, the generated code has a "bare" return statement for the basic instruction (the one with no predicate test) followed by some additional if selection code that is dead because of the bare return. Here's an example from PPCGenFastISel.inc for the FMUL instruction:

unsigned fastEmit_ISD_FMUL_MVT_f64_rr(MVT RetVT, unsigned Op0, bool Op0IsKill, unsigned Op1, bool Op1IsKill) {
   if (RetVT.SimpleTy != MVT::f64)
     return 0;
   return fastEmitInst_rr(PPC::FMUL, &PPC::F8RCRegClass, Op0, Op0IsKill, Op1, Op1IsKill);
   if ((PPCSubTarget->hasVSX())) {
     return fastEmitInst_rr(PPC::XSMULDP, &PPC::VSFRCRegClass, Op0, Op0IsKill, Op1, Op1IsKill);
   }
   return 0;
}

This doesn't just happen for Power. Here's a similar one from X86GenFastISel.inc:

unsigned fastEmit_X86ISD_VBROADCAST_MVT_v4f32_MVT_v16f32_r(unsigned Op0, bool Op0IsKill) {
   return fastEmitInst_r(X86::VBROADCASTSSZr, &X86::VR512RegClass, Op0, Op0IsKill);
   if ((Subtarget->hasAVX512())) {
     return fastEmitInst_r(X86::VBROADCASTSSZr, &X86::VR512RegClass, Op0, Op0IsKill);
   }
   return 0;
}

I investigated the code that generates the above in FastISelEmitter.cpp. The behavior is triggered by how a C++ map is being used. Maps are ordered sets of pairs and the key used in the ordering is the string value of the predicate ("hasVSX()" and "hasAVX512()" above). For basic instructions the predicate is empty and no if is generated. In C++ strings the empty string always compares as less than any other string so the code for the basic instruction (the return with no if) always comes out first. It looks like there is code to catch problems like this but it doesn't test things properly.

After discussing this with Bill Schmidt I did an experimental change in FastISelEmitter.cpp to use the instruction complexity to order the generated code pieces kind of like what is done for normal codegen. Instead of using a map I use a multimap because there are instructions with duplicate complexities. I also fixed the broken testing that is done to look for errors and some other necessary changes.

This works just fine for Power and looking through a few of the other generated .inc files doesn't appear to have broken anything obvious though the order is different.

As this is a fairly significant change I am bringing it up for discussion here.

After discussing this with Bill Schmidt I did an experimental change in
FastISelEmitter.cpp to use the instruction complexity to order the
generated code pieces kind of like what is done for normal codegen.
Instead of using a map I use a multimap because there are instructions
with duplicate complexities. I also fixed the broken testing that is
done to look for errors and some other necessary changes.

This works just fine for Power and looking through a few of the other
generated .inc files doesn’t appear to have broken anything obvious
though the order is different.

As this is a fairly significant change I am bringing it up for
discussion here.

So yeah, Bill and I were talking about this at the Dev Meeting. I’m a fan of this, and it’s probably better than saying “Put this earlier/later in the file to get the code generation you want”

-eric