DAGCombiner rant

Yes, it was I who put that rant in the commit log and it’s justified. Worse, it’s unreasonable to actually go through all of DAGCombiner’s code and check to see if certain kinds of constants, e.g., i64, are legal during a particular phase of DAGCombiner. DAGCombiner does good work and the backends are supposed to be good citizens. CellSPU is certainly trying to be a good citizen, no matter how frustrating that becomes on certain days.

We need to document this knowledge somewhere:

(a) If you’re going to legalize operations, you should constrain yourself to using target independent nodes insofar as feasible. Think twice before adding a new pseudo-instruction: it prevents DAGCombiner optimizations and sometimes DAGcombiner will unintentionally work around your operation legalizations in unexpected ways (i.e., transforming (fabs:f64 arg) to (bitconvert:f64 (and:i64 (bitconvert:i64 arg), 0x7fffffffffffffff)))

(b) If you add new pseudo-instructions (i.e., target-specific nodes in your ISelLowering source), make sure you know how to PerformDAGCombine. If there’s no possible way to do PerformDAGCombine on your new pseudo-instructions, see (a).

(c) If at all possible, handle target-specific instruction selection, e.g., 64-bit loads or other speciality instruction selection, in your ISelDAGToDAG source. For example, CellSPU cannot lower 64-bit add, substract or multiply operations using target-independent nodes. These are custom instruction-selected using pseudo-instructions that only the instruction selector will see.

(d) If (a), (b) or (c) don’t work, then, and only then, create a new target-dependent node and its corresponding ISD-extended pseudo instruction.

Hi Scott,

I'm not clear on what you're saying here; some of your
points below seem to be contradictory. The advice to
use target-independent nodes when feasible seems
sound to me, so I wrote up a comment about it in
SelectionDAGNodes.h. If you can formulate your
thoughts in the form of specific documentation changes,
that would be helpful.

In theory, DAGCombiner is supposed to check if an operation
is legal before using it, when it is running after legalize.
It seems in practice it often doesn't do this. Our usual
response when people hit this problem is "then fix it".
Are you saying that the extent of the problem makes this
infeasible?

Dan

Sorry if I wasn’t clear, since I was venting some of my frustration with my continual battle with DAGCombine and the CellSPU backend. DAGCombine should operate according to the principle of least surprise; yet, I’m continually being surprised.

My biggest frustration, until recently, was finding new i64 constants being generated by DAGCombine after operation legalization. Prior to that, InstCombine was inserting i64 multiplications; I ended up biting the bullet and generating the full 48+ instruction sequence for i64 mul for CellSPU in terms of the i16 mulipliers that Cell’s SPU actually support.I moved all of i64 constant generation to instruction selection, when originally, i64 constants were legalized by target-specific nodes.

My approach now is the path of least pain and better citizenship within the LLVM framework itself.

Specfic guidance for backend developers:

(a) Avoid target-specific nodes.
(b) Legalize operations in terms of existing ISD instructions.
(c) Don’t be surprised if DAGCombiner is too smart for your backend’s capabilities.
(d) The size and scope of changing DAGCombiner is daunting, so if you can avoid changing or altering DAGCombiner, don’t change DAGCombiner. For example, how many changes need to be made to check of i64 constants are legal during a particular phase of DAGCombiner – you have to do a complete job, not just patch the place that just bit you in the backside.
(e) Handle as much target-specific during instruction selection (e.g., i64 constants). IFF there’s no way to handle during instruction selection, then consider adding a new target-specific node for operation legalization.
(f) If in doubt about any of the above suggestions, refer to (e) and (a).

Regarding (d), I’d rather do a complete job than a half-baked job if I were going to submit a patch against DAGCombiner. Looking at the code, even using NetBeans’ IDE, tracking down “illegal” i64 constants is a daunting task. The path of least pain was to move i64 constant generation to instruction selection and leave DAGCombiner alone. Basically, the guidance is “Do as much of what you can during instruction selection, rewrite operations as target-independent nodes insofar as feasible, and only as a last resort, add new target-specific nodes because the pain threshold for altering DAGCombiner or SelectionDAGLegalize is sufficiently intolerable.” (Yes, all one needs to look for are occurances of “DAG->getConstant(MVT::i64”, but there’s the possibility of either getting it wrong or being incomplete.)

In (e), however, most backed developers will immediately notice that they have to run a set of predicates that will be executed a second time by the generated Select() code. There’s probably a better way to do that.

Example: Cell’s SPU is a vector processor with a “unform register set”, meaning that scalars are always in slot 0 of the vector (mostly.) Unformity has to be preserved at all times. Hence, at least in CellSPU’s case, i64 constants have to be represented as BUILD_VECTOR nodes and the various selection predicates are run against the node to determine if the result is a simple constant load that propagates to all vector elements or whether a constant pool spill needs to occur (depends on the constant and avoiding a CP spill.) If a predicate succeeds, then the generated Select() method is invoked, which runs all of the predicates a second time.

Hope that clears up some of the confusion.

-scooter