As we know, a mul operation usually has a high cost because its pipeline is longer.
Here is some assembly for the AArch64 target in which cntd carries a mul operand. Does the cost of cntd x8, all, mul #5 therefore increase compared to cnth x8? (see Compiler Explorer)
cntd x8, all, mul #5
add w0, w0, w8
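
To make the comparison concrete, here is a rough C model of the values I assume the two instructions produce for the all pattern: the number of elements in one vector, times the immediate multiplier. The names model_cnt, model_cntd_all and model_cnth_all, and the vl_bits parameter, are hypothetical helpers made up for illustration; this is a sketch of the result only and says nothing about latency or throughput.

```c
#include <stdint.h>

/* Hypothetical model (not the architectural pseudocode):
 * element count for pattern "all" is vector length / element size,
 * and the instruction scales that count by an immediate multiplier. */
static uint64_t model_cnt(unsigned vl_bits, unsigned esize_bits, unsigned mul)
{
    return (uint64_t)(vl_bits / esize_bits) * mul;
}

/* Models: cntd xN, all, mul #mul  (64-bit elements) */
static uint64_t model_cntd_all(unsigned vl_bits, unsigned mul)
{
    return model_cnt(vl_bits, 64, mul);
}

/* Models: cnth xN  (16-bit elements, multiplier of 1) */
static uint64_t model_cnth_all(unsigned vl_bits)
{
    return model_cnt(vl_bits, 16, 1);
}
```

Under this model, with a 128-bit vector length, cntd x8, all, mul #5 gives 10 and cnth x8 gives 8. So the two forms differ only in the constant they return, and my question is whether the multiplier makes the first one more expensive in the cost model.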