Hi KolinHU,
I’d first like to clarify the difference between an `_UpperBound` scheduling resource and one that does not have that suffix. Our vector pseudo instructions are either LMUL specific or (LMUL, SEW) specific.
For pseudo instructions that are LMUL specific, the same instruction may behave differently under different LMULs, so we create scheduling resources that are suffixed by LMUL. For example, the `VFMV_V_F` pseudo instructions use `WriteVFMovV_MF8`, …, `WriteVFMovV_M1`, …, `WriteVFMovV_M8`. For pseudo instructions that are LMUL and SEW specific we do something similar, where they are suffixed by LMUL and SEW. For example, `WriteVRGatherVV_M1_E32`.
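To make the naming scheme concrete, here is a small Python sketch that only models how the suffixed names are formed. It is not LLVM code: the helper names `lmul_names` and `lmul_sew_names` are made up for illustration, and taking the full cross product of LMUL and SEW is a simplifying assumption (a real target may only define a subset of the combinations).

```python
# Toy model of the naming scheme only; this is not LLVM code.
# The lists below and both helper functions are illustrative assumptions.
LMULS = ["MF8", "MF4", "MF2", "M1", "M2", "M4", "M8"]
SEWS = [8, 16, 32, 64]


def lmul_names(base):
    """Names for an LMUL-specific resource: base + "_" + LMUL."""
    return [f"{base}_{mx}" for mx in LMULS]


def lmul_sew_names(base):
    """Names for an (LMUL, SEW)-specific resource: base + "_" + LMUL + "_E" + SEW."""
    return [f"{base}_{mx}_E{sew}" for mx in LMULS for sew in SEWS]


if __name__ == "__main__":
    print(lmul_names("WriteVFMovV"))
    print(lmul_sew_names("WriteVRGatherVV")[:4])
```

The point is only that one base name fans out into one resource per LMUL, or one per (LMUL, SEW) pair, which is what lets a subtarget attach different costs to each.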
Splitting these scheduling definitions by LMUL or (LMUL, SEW) gives a subtarget the ability to fine-tune the scheduling based on LMUL or (LMUL, SEW). For example, `RISCVSchedSiFive7.td` gives the following definitions for `WriteVFMovV` (simplified for explanation). Here `LMULWriteResMX` defines `"WriteVFMovV_" # mx` with `ResourceCycles` depending on `mx`:

```
foreach mx = SchedMxList in {
  defvar Cycles = SiFive7GetCyclesDefault<mx>.c; // Cycles depends on LMUL
  let ResourceCycles = [Cycles] in
  defm "" : LMULWriteResMX<"WriteVFMovV", [SiFive7VA], mx, ...>; // Defines the LMUL-specific "WriteVFMovV_" # mx
}
```
Now I’ll move on to explaining the `WorstCase` suffix. Once we leave the pseudo instruction world, the MachineInstr is no longer LMUL or (LMUL, SEW) specific. This becomes significant if we pass an instruction like `vfmv.v.f` into the llvm-mca tool. Without knowing the LMUL or SEW, we’d like llvm-mca to use the worst-case behavior in its report. As a result, we assign the `_WorstCase` suffixed scheduling resources to instructions that are no longer LMUL or (LMUL, SEW) specific.
The `WorstCase` scheduling resources allow a subtarget flexibility in describing behavior. On some subtargets, worst-case behavior may be associated with the largest LMUL, since there’s the most data to process. On others, it may be associated with the largest LMUL and smallest SEW, since there are the most elements to process. A subtarget may also take the same amount of time for all LMUL and SEW, so every LMUL or (LMUL, SEW) has worst-case behavior. Whatever it is, the subtarget has the freedom to describe how its worst case should behave.
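The three situations above can be illustrated with a toy Python model. Everything in it is hypothetical: the cost functions, the `VLEN` value, and the cycle counts are invented for illustration and do not describe any real subtarget or any LLVM API. It simply evaluates a cost for every (LMUL, SEW) pair and reports which pairs hit the maximum.

```python
from fractions import Fraction

# Toy model, not LLVM code. All numbers below are made-up assumptions.
LMUL = {
    "MF8": Fraction(1, 8), "MF4": Fraction(1, 4), "MF2": Fraction(1, 2),
    "M1": Fraction(1), "M2": Fraction(2), "M4": Fraction(4), "M8": Fraction(8),
}
SEWS = [8, 16, 32, 64]
VLEN = 128  # hypothetical vector register width in bits


def cost_by_data(mx, sew):
    """Hypothetical subtarget A: cycles scale with register group size only."""
    return max(1, int(LMUL[mx] * 2))


def cost_by_elements(mx, sew):
    """Hypothetical subtarget B: cycles scale with the number of elements."""
    elements = int(VLEN * LMUL[mx] / sew)
    return max(1, elements // 4)


def cost_flat(mx, sew):
    """Hypothetical subtarget C: same cost for every LMUL and SEW."""
    return 3


def worst_cases(cost):
    """Return (max cycles, sorted list of (LMUL, SEW) pairs that reach it)."""
    costs = {(mx, sew): cost(mx, sew) for mx in LMUL for sew in SEWS}
    worst = max(costs.values())
    return worst, sorted(key for key, value in costs.items() if value == worst)


if __name__ == "__main__":
    for cost in (cost_by_data, cost_by_elements, cost_flat):
        print(cost.__name__, worst_cases(cost))
```

In this model, subtarget A reaches its worst case at LMUL=M8 for any SEW, subtarget B only at LMUL=M8 with SEW=8, and subtarget C at every pair, which mirrors why the `WorstCase` resource is left for each subtarget to define.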
> if i want to add some new inst, how to add definition
You’re only going to need to do this if your instruction behavior may depend on LMUL or (LMUL, SEW).
- Define `LMULSchedWrites` (or `LMULSEWSchedWrites`), `LMULSchedReads` (or `LMULSEWSchedReads`), `LMULWriteRes` (or `LMULSEWWriteRes`), and `LMULReadAdvance` (or `LMULSEWReadAdvance`) in the `RISCVScheduleXXX.td` file, where `XXX` is the extension you are working on. You can see examples in `RISCVScheduleV.td`.
- You will need to define pseudo instructions that are LMUL (or LMUL, SEW) specific and that use these SchedReadWrites. You can see examples in `RISCVInstrInfoVPseudos.td`.
- Then you can define the base instruction that uses the WorstCase SchedReadWrites, which you show in your question above.
- Specify attributes such as Latency or ResourceCycles in your subtarget, and mark all other subtargets as `UnsupportedXXX`, where `XXX` is the extension, since they haven’t yet added scheduling information for that extension.
- You can test the LMUL or (LMUL, SEW) specific behavior and the WorstCase behavior using llvm-mca lit tests.