I’ve tracked down a bug/pessimization that affects 1-bit bitmasks:
if (((xx) & (1ULL << (40))))
if (!((yy) & (1ULL << (40))))
The second time Constant Hoisting sees the value (1<<40), it wraps it in a bitcast,
and that value then gets hoisted. The first (1<<40) is not bitcast, so it gets recognized
as a BT. The second doesn’t get recognized because of the hoisting.
The result is an extra register allocation and unnecessary constant-loading instructions.
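For anyone who wants to reproduce this, a minimal sketch of the pattern is below. The function name `classify` and its return encoding are made up for illustration and are not taken from the attached file; the only thing that matters is that the same 1-bit constant is tested twice in one function.

```c
#include <stdint.h>

/* Hypothetical reproducer: the same single-bit constant (bit 40) is
 * tested twice in one function, once for "set" and once for "clear". */
int classify(uint64_t xx, uint64_t yy)
{
    int result = 0;

    if (xx & (1ULL << 40))      /* first use of the constant */
        result |= 1;

    if (!(yy & (1ULL << 40)))   /* second use of the same constant */
        result |= 2;

    return result;
}
```

Compiling something like this for x86-64 with optimization on and comparing the assembly for the two tests should show whether both are turned into BT.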
There are maybe three ‘solutions’ to this problem, maybe more.
Starting with the second, which sits in the middle of the pipeline: you could try
pattern matching in EmitTest() or LowerToBT(). I’ve tried this and it doesn’t work,
because the match needs to reach outside of a single Selection DAG, and it can’t.
Thirdly, it’s been suggested to use a peephole pass, with
AArch64LoadStoreOptimizer.cpp as a model. This also doesn’t work, for pretty much the
same reason. Moreover, it runs after register allocation, so even in the limited
situations where it can work, it leaves registers allocated but unused.
In fact, I’d suggest the Arm backend adopt my approach.
So firstly, I think the best way to solve this problem is to avoid it
altogether: just don’t hoist these values.
For the X86 backend, X86TTI::getIntImmCost() in X86TargetTransformInfo.cpp
is an overridden function. Just mark these 1-bit masks there as TCC_Free:
// Don't hoist 1-bit masks. They'll probably be used for BT, BTS, BTC.
if (Imm.isPowerOf2()) // this could be limited to bits 32-63
  return TTI::TCC_Free;
This works. Its only downside is the case where such a value is used more than once
and is then not combined into another instruction, so the constant is materialized
at each use instead of once.
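To make the check concrete outside of LLVM, here is a rough standalone sketch of the predicate in plain C. Both function names are made up for illustration. The first is meant to mirror what a power-of-two test does on a 64-bit immediate; the second is the narrower "bits 32-63 only" variant mentioned in the comment above. This is only a model of the condition, not the actual LLVM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when exactly one bit of imm is set.
 * Clearing the lowest set bit (imm & (imm - 1)) leaves zero only
 * if there was a single bit to begin with. */
bool is_single_bit_mask(uint64_t imm)
{
    return imm != 0 && (imm & (imm - 1)) == 0;
}

/* Hypothetical narrower variant: a single set bit in positions 32-63,
 * i.e. a 1-bit mask whose value does not fit in 32 bits. */
bool is_high_single_bit_mask(uint64_t imm)
{
    return is_single_bit_mask(imm) && imm > UINT32_MAX;
}
```

With the narrower variant, a mask such as (1ULL << 40) would be treated as free, while (1ULL << 3) would still go through the normal cost calculation.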
I’d also recommend looking at whether other values should be left unhoisted, though
I haven’t looked into this very thoroughly.
newtst.c (1.5 KB)