The pattern x % constant == 0 (where the constant is odd) can be optimized
more than it currently is, using some modular arithmetic trick, to:
x * inv(constant) <=u MaxValue / constant
This patch sort of does that (hopefully, I did test it and it seemed to work,
and please give me feedback on it, it's only my first try at an llvm patch),
however I ran into a problem. This trick can in theory be easily generalized
to even `constant` (let's ignore powers of two though, they can be done better
the obvious way), but to do so a bitwise rotation is needed.
But when I build it from two shifts:
Value *Mul = Builder->CreateMul(A, Builder->getInt(Multiplier));
Value *RShift = Builder->CreateLShr(Mul, Builder->getInt32(Ctz));
Value *LShift = Builder->CreateShl(Mul, Builder->getInt32(W - Ctz));
Rotated = Builder->CreateOr(RShift, LShift);
Somewhere along the line, the left shift gets merged with the multiplication,
resulting in two multiplications (and some extra stuff), and no rotation.
Having two multiplications mostly defeats this optimization, though it's still
a bit less code than what would be generated currently.
It would be really nice if it could reliably generate something like this on x86:
imul eax, edi, foo
ror eax, bar
cmp eax, baz
; use flags
How should this be handled?
(the attached patch is limited to odd divisors to avoid this problem)
divisibilitypatch (1.66 KB)