If I use __builtin_assume(N>1), then LLVM knows the loop will execute and does not check for (j <= 0), but I can’t seem to get it to accept that N is even. Is there a way to get LLVM to vectorize the loop without generating the additional scalar loop and its conditions?
It looks like we’ve never added V % C == C2 in computeKnownBitsFromAssume. This would be a simple patch to add if you’re interested in fixing the compiler to handle this case.
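You might also get this to work by using (N & 0x1) == 0. Note the parentheses: in C, == binds tighter than &, so N & 0x1 == 0 parses as N & (0x1 == 0), which is always 0. It looks like we do handle the masked form. If that doesn’t work, it probably means the vectorizer isn’t asking the right questions here.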
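A minimal sketch of the masked form, in case it helps. The function name scale, its parameters, and the ASSUME wrapper macro are hypothetical and not taken from the original code; the wrapper only exists so the file still compiles with compilers that lack __builtin_assume. Whether the vectorizer actually drops the scalar remainder loop with this hint would need to be checked in the generated IR or assembly.

```c
#include <stddef.h>

/* __builtin_assume is a Clang builtin; fall back to a no-op elsewhere. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif
#if __has_builtin(__builtin_assume)
#define ASSUME(cond) __builtin_assume(cond)
#else
#define ASSUME(cond) ((void)0)
#endif

/* Hypothetical kernel: scales the first n elements of x in place.
 * The caller must guarantee that n > 1 and that n is even; passing
 * anything else makes the assumptions false, which is undefined behavior. */
void scale(float *x, int n, float k)
{
    ASSUME(n > 1);
    ASSUME((n & 0x1) == 0); /* parenthesized: == binds tighter than & */

    for (int j = 0; j < n; ++j)
        x[j] *= k;
}
```

Building with something like clang -O2 -S -emit-llvm and looking for a leftover scalar loop is probably the quickest way to see whether the assumption was picked up.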
1. Since the vectorizer is using SCEV here to do the expression simplification, it is possible that SCEV is what needs to be enhanced in this case (instead of, or in addition to, computeKnownBitsFromAssume), because SCEV also independently searches for dominating assumes.
2. You can tell the vectorizer that the loop is safe to vectorize, without any runtime checks, by using 'vectorize(assume_safety)' (instead of just 'vectorize(enable)') in the '#pragma clang loop' directive on the loop.
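For the second point, the pragma placement would look roughly like this. The function add_arrays and its parameters are hypothetical, made up for illustration. As far as I understand it, assume_safety is a promise about memory dependences, so it should remove the runtime overlap checks between the pointers; I would not count on it to remove the scalar remainder loop when the trip count is not known to be a multiple of the vector width.

```c
/* Hypothetical kernel: dst[j] = a[j] + b[j] for j in [0, n).
 * With assume_safety the caller promises that the accesses in the loop
 * carry no dependences that would make vectorization unsafe, so dst must
 * not overlap a or b in a way that creates one. */
void add_arrays(float *dst, const float *a, const float *b, int n)
{
    /* The pragma must immediately precede the loop it applies to.
     * It is guarded so that non-Clang compilers do not warn about it. */
#if defined(__clang__)
#pragma clang loop vectorize(assume_safety)
#endif
    for (int j = 0; j < n; ++j)
        dst[j] = a[j] + b[j];
}
```

If the promise is wrong (for example, dst aliases a with an offset), the vectorized loop can silently produce different results from the scalar one, so this is only appropriate when the caller controls the buffers.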