This one is complicated, and I need help from some TableGen experts to
finish a last bit of tblgen hacking.
I've hacked tblgen to handle patterns like this:
  let AddedComplexity = 40 in {
  def : Pat<(vector_shuffle (v2f64 (scalar_to_vector (loadf64 addr:$src1))),
                            (v2f64 (scalar_to_vector (loadf64 addr:$src2))),
                            SHUFP_shuffle_mask:$sm),
            (SHUFPDrri (MOVSD2PDrm addr:$src1),
                       (MOVSD2PDrm addr:$src2),
                       SHUFP_shuffle_mask:$sm)>, Requires<[HasSSE2]>;
  } // AddedComplexity
I believe the problem with the tblgen in trunk is that it doesn't know how to
support patterns with two memory operands.
I've attached the code that the hacked tblgen spits out from EmitResultCode
for this pattern.
The remaining problem is that this code doesn't actually replace the two
memory operations. It generates two MOVSDs and a SHUFPD just fine, but the
final output also contains two extra MOVSD instructions.
As far as I can understand things, the problem is that the two MOVSD
instructions are generated by a recursive call to EmitResultCode. Thus isRoot
is false and the result of the call to getTargetNode is not passed to
ReplaceUses. Then when we pop back up and generate the SHUFPD we call
SelectNodeTo which only transforms the immediate node (the vector_shuffle).
It doesn't recurse to replace child nodes.
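
To make the symptom concrete, here is a toy model of a root-only rewrite. It is not LLVM code: ToyNode, morphRootOnly and countReachable are made-up names, and the assumption that something other than the shuffle (for example a user of the loads' chains) still refers to the original loads is mine, not something verified against the real DAG.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a DAG node. This is NOT LLVM's SDNode.
struct ToyNode {
  std::string Opcode;
  std::vector<ToyNode *> Operands;
};

// Mimics a root-only rewrite: the root is morphed in place and pointed at
// its new operands. The old operands are left untouched.
void morphRootOnly(ToyNode &Root, const std::string &NewOpcode,
                   const std::vector<ToyNode *> &NewOperands) {
  assert(!NewOpcode.empty() && "need a target opcode");
  Root.Opcode = NewOpcode;
  Root.Operands = NewOperands;
}

// Collects every node reachable from N through operand edges.
void collectReachable(ToyNode *N, std::set<ToyNode *> &Seen) {
  if (!Seen.insert(N).second)
    return; // already visited
  for (ToyNode *Op : N->Operands)
    collectReachable(Op, Seen);
}

// Counts the nodes with the given opcode that are reachable from Roots.
int countReachable(const std::vector<ToyNode *> &Roots,
                   const std::string &Opcode) {
  std::set<ToyNode *> Seen;
  for (ToyNode *R : Roots)
    collectReachable(R, Seen);
  int Count = 0;
  for (ToyNode *N : Seen)
    if (N->Opcode == Opcode)
      ++Count;
  return Count;
}
```

If a shuffle node and a second, hypothetical user both point at two "load" nodes, then after morphRootOnly rewrites the shuffle to use two new "MOVSD2PDrm" nodes, countReachable still finds both original loads through the second user. In this toy model, those leftover loads are what would later be selected into the extra MOVSDs.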
I tried hacking tblgen to call getTargetNode / ReplaceUses if any node in the
pattern has a chain by changing this line in tblgen:
  bool InputHasChain = isRoot &&
    NodeHasProperty(Pattern, SDNPHasChain, CGP);
to call PatternHasProperty instead. This does cause tblgen to emit
getTargetNode / ReplaceUses instead of SelectNodeTo, but ReplaceUses doesn't
know how to handle a complex pattern like this. It complains that the node
being replaced has too many result values:
  assert(From->getNumValues() == 1 && FromN.ResNo == 0 &&
         "Cannot replace with this method!");
From->getNumValues() > 1 so this croaks.
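
For anyone not familiar with the two helpers, the difference I am relying on can be sketched with a toy pattern tree. This is not tblgen's real data structure: ToyPattern, toyNodeHasChain and toyPatternHasChain are illustrative names, and the HasChain flag stands in for the SDNPHasChain property.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy pattern tree. This is NOT tblgen's TreePatternNode.
struct ToyPattern {
  std::string Operator;
  bool HasChain; // stand-in for the SDNPHasChain property
  std::vector<ToyPattern> Children;
};

// Looks only at the node itself, like a per-node property check.
bool toyNodeHasChain(const ToyPattern &P) { return P.HasChain; }

// Looks at the node and, recursively, at all of its children.
bool toyPatternHasChain(const ToyPattern &P) {
  if (P.HasChain)
    return true;
  for (const ToyPattern &C : P.Children)
    if (toyPatternHasChain(C))
      return true;
  return false;
}

// Builds a tree shaped like the shuffle-of-two-loads input pattern
// (the shuffle mask operand is omitted for brevity).
ToyPattern makeShuffleOfTwoLoads() {
  ToyPattern Load1{"loadf64", true, {}};
  ToyPattern Load2{"loadf64", true, {}};
  ToyPattern S2V1{"scalar_to_vector", false, {Load1}};
  ToyPattern S2V2{"scalar_to_vector", false, {Load2}};
  ToyPattern Root{"vector_shuffle", false, {S2V1, S2V2}};
  assert(Root.Children.size() == 2 && "expected two shuffle inputs");
  return Root;
}
```

On the tree from makeShuffleOfTwoLoads, toyNodeHasChain is false because the root shuffle carries no chain, while toyPatternHasChain is true because the loads two levels down do. That is why switching the check flips the emitter from SelectNodeTo to getTargetNode / ReplaceUses for this pattern.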
So I'm going to need a little help. Either SelectionDAG::ReplaceAllUsesWith
needs to be able to handle more complex things or tblgen needs to emit
ReplaceUses after it generates the two MOVSD instructions. Perhaps something
like this:
  SDOperand Ops0[] = { CPTmp0001, CPTmp1001, CPTmp2001, CPTmp3001,
                       LSI_N00_Child0, LSI_N10_Child0, Chain10 };
  SDOperand Tmp1(CurDAG->getTargetNode(Opc0, VT0, MVT::Other, Ops0, 7), 0);
  ReplaceUses(SDOperand(N00.Val, 0), Tmp1);
I don't know if the parameters to that ReplaceUses call are correct, but I
think you'll get the idea.
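
The kind of per-value replacement I have in mind can be sketched on a toy model. Again, this is not SelectionDAG code: ToyValue, ToyNode and replaceUsesOfValue are hypothetical names, and the explicit Users list is a simplification of however the real DAG tracks uses.

```cpp
#include <cassert>
#include <vector>

struct ToyNode;

// A (node, result number) pair, loosely modeled on SDOperand.
struct ToyValue {
  ToyNode *Node;
  unsigned ResNo;
};

// Toy node. NumValues would be 2 for a load: the loaded value and the chain.
struct ToyNode {
  unsigned NumValues;
  std::vector<ToyValue> Operands;
};

// Rewrites every operand in Users that refers to exactly From so that it
// refers to To instead. Other results of From.Node are left alone.
// Returns the number of operands rewritten.
unsigned replaceUsesOfValue(const std::vector<ToyNode *> &Users,
                            ToyValue From, ToyValue To) {
  assert(From.ResNo < From.Node->NumValues && "no such result on From");
  assert(To.ResNo < To.Node->NumValues && "no such result on To");
  unsigned Rewritten = 0;
  for (ToyNode *U : Users)
    for (ToyValue &Op : U->Operands)
      if (Op.Node == From.Node && Op.ResNo == From.ResNo) {
        Op = To;
        ++Rewritten;
      }
  return Rewritten;
}
```

With a two-result load whose value feeds the shuffle and whose chain feeds some other user, one call for result 0 and one call for result 1 move both users over to the new node. Unlike a method that insists on a single-result From node, nothing here needs From to have only one value.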
The problem, of course, is that other patterns cause this kind of recursion to
match their memory operands, and we _don't_ want those calling ReplaceUses
for their memory operands, because things work just fine for those patterns
already.
What's the right approach here? And can someone help me get the solution
correct so I can get over this final hurdle?
I think handling these kinds of complex memory access patterns will be
beneficial for LLVM. We see big speedups on some codes by applying the
pattern at the top of this message.
-Dave
ResultCode.cpp (2.16 KB)