SelectionDAG: target-specific simplification of generic nodes using demanded bits


The AMDGPU target has a 24-bit multiply instruction. In SimplifyDemandedBits I’d like to be able to turn a generic i32 ISD::MUL into AMDGPUISD::MUL_I24 if only the low-order 24 bits of the result are demanded. Currently there seems to be no way to do that, because the target hook SimplifyDemandedBitsForTargetNode is only called for target-specific nodes, not for generic nodes like MUL.

Would it be acceptable to call the target hook for generic nodes as well? Here’s a patch to show the general idea:
(It probably needs a bit of polish, e.g. the name “SimplifyDemandedBitsForTargetNode” is misleading now.)

In the future perhaps “demanded bits” could become a cached analysis that could be queried from anywhere. Then I could write a target-specific DAG combine for MUL that would query the demanded bits for the node to do this transformation.
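
To make the MUL_I24 case concrete, here is a small standalone check of the arithmetic behind it. This is not the patch and uses no LLVM APIs. It assumes the 24-bit multiply sign-extends the low 24 bits of each operand and returns the low 32 bits of the product; the helper names (signExtend24, mulI24Model, agreeOnDemanded) are made up for illustration.

```cpp
#include <cstdint>

// Sign-extend the low 24 bits of X to a 64-bit signed value.
constexpr int64_t signExtend24(uint32_t X) {
  return static_cast<int64_t>((X & 0xFFFFFFu) ^ 0x800000u) - 0x800000;
}

// Assumed model of a signed 24-bit multiply: only the low 24 bits of each
// operand are read, and the low 32 bits of the product are returned.
constexpr uint32_t mulI24Model(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(signExtend24(A) * signExtend24(B)));
}

// Plain wrapping i32 multiply, standing in for the generic MUL node.
constexpr uint32_t mul32(uint32_t A, uint32_t B) { return A * B; }

// True if both multiplies produce the same value in every demanded bit.
constexpr bool agreeOnDemanded(uint32_t A, uint32_t B, uint32_t Demanded) {
  return ((mul32(A, B) ^ mulI24Model(A, B)) & Demanded) == 0;
}

// With only the low 24 bits demanded, the results match even when the
// operands have high bits set.
static_assert(agreeOnDemanded(0x12345678u, 0x9ABCDEF0u, 0x00FFFFFFu),
              "low 24 bits should agree");

// With all 32 bits demanded, the replacement is not safe in general.
static_assert(!agreeOnDemanded(0x01000000u, 1u, 0xFFFFFFFFu),
              "bit 24 differs when operand high bits are dropped");
```

The low 24 bits of a product depend only on the low 24 bits of each operand, which is why the demanded mask of the result alone is enough to justify the replacement, with no known-bits information about the operands.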


X86 would definitely benefit from this as we often use generic opcodes as part of more complex patterns that could then be handled through SimplifyDemandedBits/SimplifyDemandedVectorElts/SimplifyMultipleUse.

Is this likely to slow compile times down?

I’ve tried but struggled to come up with an acceptable way to determine the accumulated demanded bits/elts across all users of a SDValue - I was always worried that caching would miss nodes that had recently been added/removed.

I’ve often thought it a shame that the standard combines don’t take the demanded bits/elts masks directly instead of relying on a separate SimplifyDemanded* mechanism.