[LV] What is the meaning of BLEND in the dump of loop-vectorize?

Hi,
For the case on Compiler Explorer, when I dump the LV pass output (-mllvm -debug-only=loop-vectorize), I see a BLEND recipe, BLEND %count.1 = ir<%inc>/vp<%12> ir<%count.013>/vp<%14> ir<%count.013>/vp<%15>, but I don't know what it means. Any help is greatly appreciated!

int check (char *mask, double *result, int n) {
    int count = 0;
    for (int i=0; i<n; i++) {
        if (mask[i] == 0 && result[i] != 2.0)
          count++;
    }
    return count;
}
  • cmd: ~/llvm-project-upstream/build/bin/clang -march=armv8.2-a+sve -O2 -mllvm -scalable-vectorization=off -mllvm -force-vector-interleave=1 check.c -S -mllvm -debug-only=loop-vectorize
<x1> vector loop: {
  vector.body:
    EMIT vp<%2> = CANONICAL-INDUCTION
    WIDEN-REDUCTION-PHI ir<%count.013> = phi ir<0>, ir<%count.1>
    vp<%4>    = SCALAR-STEPS vp<%2>, ir<0>, ir<1>
    CLONE ir<%arrayidx> = getelementptr ir<%mask>, vp<%4>
    WIDEN ir<%0> = load ir<%arrayidx>
    WIDEN ir<%cmp1> = icmp ir<%0>, ir<0>
  Successor(s): land.lhs.true

  land.lhs.true:
    CLONE ir<%arrayidx4> = getelementptr ir<%result>, vp<%4>
    WIDEN ir<%1> = load ir<%arrayidx4>, ir<%cmp1>
    WIDEN ir<%cmp5> = fcmp ir<%1>, ir<2.000000e+00>     ; result[j] == 2
  Successor(s): if.then

  if.then:
    WIDEN ir<%inc> = add ir<%count.013>, ir<1>
  Successor(s): for.inc

  for.inc:
    EMIT vp<%12> = select ir<%cmp1> ir<%cmp5> ir<false> ; mask[j]==0 ? result[j] == 2 : 0
    EMIT vp<%13> = not ir<%cmp5>                        ; result[j] != 2
    EMIT vp<%14> = select ir<%cmp1> vp<%13> ir<false>   ; mask[j]==0 ? result[j] != 2 : 0
    EMIT vp<%15> = not ir<%cmp1>                        ; mask[j] != 0
    BLEND %count.1 = ir<%inc>/vp<%12> ir<%count.013>/vp<%14> ir<%count.013>/vp<%15> ; ???
    EMIT vp<%17> = VF * UF +(nuw)  vp<%2>
    EMIT branch-on-count  vp<%17> vp<%1>
  No successors
}

Oh, I see.

The masks vp<%12>, vp<%14> and vp<%15> together cover every case:
vp<%12> + vp<%14> + vp<%15>
= (mask[j]==0 && result[j]==2) + (mask[j]==0 && result[j]!=2) + (mask[j]!=0)
= (mask[j]==0) + (mask[j]!=0) = 100%, so the BLEND merges all possible incoming values of %count.
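
The coverage argument above can be checked with a small scalar model of one vector lane. This is only a sketch under assumptions: count_true_masks and blend_lane are hypothetical helper names, cmp1 and cmp5 stand for the lane's ir<%cmp1> and ir<%cmp5> bits (the dump prints the fcmp without its predicate, so cmp5 is treated as an opaque boolean here), and BLEND is modeled as a phi that picks the incoming value whose mask is true.

```c
#include <stdbool.h>

/* Number of BLEND masks that are true for one lane.
   cmp1/cmp5 model the lane's ir<%cmp1>/ir<%cmp5> bits. */
static int count_true_masks(bool cmp1, bool cmp5) {
    bool m12 = cmp1 && cmp5;   /* vp<%12> = select cmp1, cmp5, false     */
    bool m14 = cmp1 && !cmp5;  /* vp<%14> = select cmp1, not cmp5, false */
    bool m15 = !cmp1;          /* vp<%15> = not cmp1                     */
    return (int)m12 + (int)m14 + (int)m15;
}

/* Hypothetical per-lane model of
   BLEND %count.1 = inc/vp<%12> count/vp<%14> count/vp<%15>. */
static int blend_lane(bool cmp1, bool cmp5, int inc, int count) {
    bool m12 = cmp1 && cmp5;
    bool m14 = cmp1 && !cmp5;
    int out = count;           /* value carried by the vp<%15> edge */
    out = m14 ? count : out;   /* value carried by the vp<%14> edge */
    out = m12 ? inc : out;     /* value carried by the vp<%12> edge */
    return out;
}
```

In this model exactly one mask is true for each of the four (cmp1, cmp5) combinations, so the order of the selects does not matter, and blend_lane returns inc only when both cmp1 and cmp5 are true.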

But based on the source code, count++ only executes when (mask[i] == 0 && result[i] != 2.0 && i<n) holds, while in ir<%inc>/vp<%12> the mask vp<%12> is (mask[j]==0 && result[j] == 2). Aren't these different?
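
One way to probe this mismatch is to compare the scalar loop against a scalar model of the blended loop under both readings of the fcmp line. This is a hedged sketch, not a statement about what LLVM emitted: check_blend_model and its cmp5_is_ne parameter are hypothetical, and the model assumes the only incoming value that changes the count is ir<%inc> under vp<%12>. The dump does not print the fcmp predicate, so both readings are tried.

```c
#include <stdbool.h>

/* Scalar reference: the loop from the original post. */
static int check(const char *mask, const double *result, int n) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (mask[i] == 0 && result[i] != 2.0)
            count++;
    }
    return count;
}

/* Hypothetical model of the blended loop, one element at a time.
   cmp5_is_ne selects how the predicate-less fcmp line is read:
   true means result[i] != 2.0, false means result[i] == 2.0. */
static int check_blend_model(const char *mask, const double *result,
                             int n, bool cmp5_is_ne) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        bool cmp1 = (mask[i] == 0);
        bool cmp5 = cmp5_is_ne ? (result[i] != 2.0) : (result[i] == 2.0);
        int inc = count + 1;
        bool m12 = cmp1 && cmp5;      /* vp<%12> */
        count = m12 ? inc : count;    /* the other two edges carry count */
    }
    return count;
}
```

With mask = {0, 0, 0, 1} and result = {2.0, 3.0, 4.0, 3.0}, the model agrees with check only when cmp5 is read as result[i] != 2.0, which suggests the "== 2" annotation on the fcmp line, rather than the BLEND itself, may be what does not match the source.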

Hi @fhahn, would you please give me some guidance? Thanks.

The VPlan is different when using -mllvm --prefer-predicate-over-epilogue=predicate-else-scalar-epilogue:

  for.inc:
    EMIT vp<%18> = select vp<%13> ir<%cmp5> ir<false> # predicate && (mask[j] == 0) ? result[j] == 2 : 0 = predicate && (mask[j] == 0) && (result[j] == 2)
    EMIT vp<%19> = not ir<%cmp5>                      # result[j] != 2
    EMIT vp<%20> = select vp<%13> vp<%19> ir<false>   # predicate && (mask[j] == 0) ? result[j] != 2 : 0 = predicate && (mask[j] == 0) && (result[j] != 2)
    EMIT vp<%21> = not ir<%cmp1>                      # mask[j] != 0
    EMIT vp<%22> = select vp<%5> vp<%21> ir<false>    # predicate ? mask[j] != 0: 0 = predicate && (mask[j] != 0)
    BLEND %count.1 = ir<%inc>/vp<%18> ir<%count.013>/vp<%20> ir<%count.013>/vp<%22> # ?
       # vp<%18> + vp<%20> + vp<%22> = predicate && (mask[j] == 0) && (result[j] == 2) + predicate && (mask[j] == 0) && (result[j] != 2) + predicate && (mask[j] != 0)
       # = predicate && (mask[j] == 0) + predicate && (mask[j] != 0) = predicate
    EMIT vp<%24> = select vp<%5> ir<%count.1> ir<%count.013> # predicate ? %count.1 : %count.013 ?
    EMIT vp<%25> = VF * UF +  vp<%4>
    EMIT vp<%26> = VF * Part +  vp<%25>
    EMIT vp<%27> = active lane mask vp<%26> <badref> # predicate_next
    EMIT vp<%28> = not vp<%27>
    EMIT branch-on-cond vp<%28>
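
The annotations in the predicated dump can be modeled per lane in the same spirit. This is a sketch under stated assumptions: any_mask_true and predicated_lane are hypothetical names, pred models the header mask vp<%5>, vp<%13> is taken to be pred && cmp1 as the annotations suggest, and the blend is assumed to behave like a select chain seeded with its first incoming value.

```c
#include <stdbool.h>

/* True if any of vp<%18>, vp<%20>, vp<%22> is set for the lane. */
static bool any_mask_true(bool pred, bool cmp1, bool cmp5) {
    bool m13 = pred && cmp1;    /* assumed meaning of vp<%13> */
    bool m18 = m13 && cmp5;     /* vp<%18> */
    bool m20 = m13 && !cmp5;    /* vp<%20> */
    bool m22 = pred && !cmp1;   /* vp<%22> */
    return m18 || m20 || m22;
}

/* Hypothetical per-lane model of the tail-folded for.inc block. */
static int predicated_lane(bool pred, bool cmp1, bool cmp5,
                           int inc, int count) {
    bool m13 = pred && cmp1;
    bool m20 = m13 && !cmp5;
    bool m22 = pred && !cmp1;
    /* Assumption: the chain starts from the first incoming value,
       so inc leaks through when no mask is true (pred is false). */
    int count1 = inc;
    count1 = m20 ? count : count1;
    count1 = m22 ? count : count1;
    /* vp<%24>: inactive lanes keep the old reduction value. */
    return pred ? count1 : count;
}
```

In this model the three masks are jointly true exactly when pred is true, matching the sum worked out in the comments, and the final select on pred is what keeps inactive tail lanes from picking up inc.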