I did float this by the dev list first a couple of weeks ago, didn’t receive any comments.
Ok, I didn’t see it, sorry about that.
a) Convenience for the backends. Since it benefits multiple backends (PPC and CellSPU), it’s a logical addition. I reckon the GPU efforts would also benefit.
It’s not entirely gratuitous; the rationale for adding a new node class is threefold:
I don’t see this. Adding some helper methods would have the same functionality.
And the first thing the helper method would have to check is if this SDNode is a BUILD_VECTOR node, right?
b) Where else would one encapsulate a constant splat predicate? SelectionDAG and SDNode are not good classes for constant splat detection, since it’s functionality specific to building vectors.
Eventually we need to add a ShuffleSDNode class for other reasons. Parking it on SelectionDAG or SDNode is not a good place to put it.
Ok, so that’s a subclass of BuildVectorSDNode or the ShuffleSDNode class is passed one or more BuildVectorSDNode operands (vide infra.)
c) Future work. At some point (or another), having target-specific SDNode polymorphism is an issue that has to be addressed, in light of the vector swizzling support conversation thread. One wants the benefits of what the legalization and DAG combiner phases provide, but not have to add more code for a target-specific ISD enumeration value with a generic SDNode, if that functionality is already available (and doesn’t need to be overridden.)
This should be addressed in different ways.
I didn’t go into it in the previous email, but I am strongly against several engineering issues in your patch. “Remembering” whether something is a splat or not (for example) in the node is unnecessary and a really dangerous thing to do. It is unnecessary because it can be computed on demand (as we do now) and it is dangerous, because it means that a) clients have to keep it up to date as they change the shuffle mask, and b) the nodes need to be reCSEd as the operands are changed.
I can take the result caching out, seemed to be a good idea at the time. The predicate is only called in a couple of places and I was prematurely optimizing for the case when it’s called repeatedly.
However, (b) is something that is handled elsewhere. BuildVectorSDNode is merely a container class – the constructor isn’t doing anything funky other than calling the base constructor with the appropriate SDOperand list and size. I seem to recall that calling UpdateNodeOperands is generally frowned upon, but I’m happy to look into that particular issue. But at this point, I didn’t see any place where a BUILD_VECTOR has its operands updated. Removing the result caching would avoid bugs, with which I agree.
In my opinion, the proper direction for shuffles is:
- Back out your patch.
Before I back it out, please read my comments below carefully.
- Move the functionality of “is splat” etc to method somewhere, e.g. on SDNode.
- Introduce a new ShuffleVectorSDNode that only has two SDValue operands (the two input vectors), but that also contains an array of ints in the node (not as operands).
There are two cases: a constant vector and a vector shuffle (moving two vectors around with a constant mask, which itself is a constant vector.) I’m not seeing how a splat is equivalent to a shuffle (unless the second input operand is totally undef and the mask has a magic pattern, which I think confuses more than clarifies.) On a pure vector machine, like the CellSPU, splats and shuffles are not equivalent concepts (*)
The important part of #3 is that we really want an array of ints (using -1 for undef) for the shuffle mask, not “operands”. This eliminates the nastiness we have now were we need a buildvector, and it eliminates the dance we have to prevent the build vector from being legalized, and prevent the integer operands to it from being legalized.
Agreed, and I can see why one would want to have an array of instead of operands to build the shuffle mask. The mask is generally a byte array, at least for CellSPU and PPC, but could be a bitmask (cell has one of those too.) I think you’re still left with a constant formation problem in the case of a constant vector that is separate from vector shuffles.
Incidentally, -1 is a perfectly legal value for a CellSPU shuffle mask in the byte array (actually, it’s 0b1xxxxxxxx, but that doesn’t mean it can’t or won’t occur.)
(*) To be fair, shuffles are part of i64 constant formation on CellSPU for special values of the mask.