Evan Cheng wrote:
I'm trying to write a backend for a target with no hardware floating
point support. I've added a single i32 register class. I'm wanting all
floating point operations to be lowered to library function calls. For
the most part LLVM seems to get this right. For example
define double @div(double %a, double %b) {
  %result = fdiv double %a, %b
  ret double %result
}
is expanded to an ISD::CALL of __divdf3, which is then lowered via the
LowerOperation hook of my backend.
However I run into problems with fcmp. With the following code:
define i1 @fmp(double %a) {
  %result = fcmp uno double %a, 0.000000e+00
  ret i1 %result
}
the fcmp is expanded to a call to __unorddf2, which is then
lowered via the LowerOperation hook of my backend. However, for some
reason an ISD::CALL node with __unorddf2 remains in the DAG after
legalization. This then causes selection to fail with
Cannot yet select: 0x13b7cc0: i32,i32,ch = call 0x13b76e0, 0x13b7800,
0x13b7800, 0x13b7800, 0x13b77a0, 0x13b78f0, 0x13b79a0, 0x13b80d0,
0x13b7a00, 0x13b78f0, 0x13b79a0, 0x13b80d0, 0x13b7a00
Are there any additional steps I need to take in my target, or could
this be a bug in the Legalization phase?
This sounds like a bug in your target. Why not custom lower the f32
setcc nodes directly to the desired target nodes rather than doing
this two stage lowering?
Evan
At the moment I'm not doing any custom lowering in my target - the
lowering I was describing was what I observed the SelectionDAG was doing.
I was under the impression that LLVM's soft float support meant that if
I didn't call addRegisterClass() with any FP types then floating point
operations would be expanded into libcalls and it would all just
work(tm). And for the most part it does work - addition, division, etc
on floating point types are all lowered correctly by the SelectionDAG
without any further intervention.
However, it fails for fcmp. I wanted to understand whether this is expected
and, if so, what I should do about it. It sounds like I need to custom
lower the nodes directly. It would certainly be nice if this wasn't
necessary.
Ok, I am not sure I understand your original question then. Legalizer is converting the setcc node into a call to __unorddf2. Is that what you want?
Yes, this is exactly what I want.
But you also stated:
the fcmp is expanded to a call to __unorddf2 which is then
lowered via the LowerOperation hook of my backend.
Does that mean you are then lowering the call to some other operations? That means your lowering code is somehow not removing the call node. Perhaps you are not updating all the uses. Nevertheless, this is not the right approach; you should instead custom lower the setcc node directly to the target-specific node.
When the LowerOperation method is called with a call node, I create a new
chain of operations and then return a new operand. However, as you say,
printing the DAG shows that not all the uses are replaced.
Are you examining the DAG before you lower the ISD::CALL node?
Right before the call to LowerOperation for the ISD::CALL node the DAG looks like this:
SelectionDAG has 23 nodes:
0x97a3c78: ch = EntryToken
0x97a43e0: <multiple use>
0x97a41a0: <multiple use>
0x97a3cd8: i32 = extract_element 0x97a43e0, 0x97a41a0
0x97a3d30: ch = ArgFlags < zext orig-align:8 >
0x97a3d68: ch = ArgFlags < zext orig-align:1 >
0x97a3c78: <multiple use>
0x97a41a0: <multiple use>
0x97a4168: i32 = ExternalSymbol '__unorddf2'
0x97a3cd8: <multiple use>
0x97a3d30: <multiple use>
0x97a4428: <multiple use>
0x97a3d68: <multiple use>
0x97a3cd8: <multiple use>
0x97a3d30: <multiple use>
0x97a4428: <multiple use>
0x97a3d68: <multiple use>
0x97a3e58: i32,i32,ch = call 0x97a3c78, 0x97a41a0, 0x97a41a0, 0x97a41a0, 0x97a4168, 0x97a3cd8, 0x97a3d30, 0x97a4428, 0x97a3d68, 0x97a3cd8, 0x97a3d30, 0x97a4428, 0x97a3d68
0x97a4200: <multiple use>
0x97a3ff0: f64 = merge_values 0x97a4200, 0x97a4200
0x97a41a0: i32 = Constant <0>
0x97a3c78: <multiple use>
0x97a42a8: i32 = Register #1024
0x97a42e0: i32,ch = CopyFromReg 0x97a3c78, 0x97a42a8
0x97a3c78: <multiple use>
0x97a4350: i32 = Register #1025
0x97a4388: i32,ch = CopyFromReg 0x97a3c78, 0x97a4350
0x97a4200: f64 = build_pair 0x97a42e0, 0x97a4388
0x97a4200: <multiple use>
0x97a43e0: i64 = bit_convert 0x97a4200
0x97a43e0: <multiple use>
0x97a3da8: i32 = Constant <1>
0x97a4428: i32 = extract_element 0x97a43e0, 0x97a3da8
0x97a4490: ch = setuo
0x97a4200: <multiple use>
0x97a4490: <multiple use>
0x97a44c8: i32 = setcc 0x97a4200, 0x97a4200, 0x97a4490
0x97a3e58: <multiple use>
0x97a53f8: f64 = build_pair 0x97a3e58, 0x97a3e58:1
0x97a3c78: <multiple use>
0x97a4200: <multiple use>
0x97a4490: <multiple use>
0x97a3de8: i1 = setcc 0x97a4200, 0x97a4200, 0x97a4490
0x97a40f0: i32 = any_extend 0x97a3de8
0x97a4538: ch = ArgFlags < >
0x97a4570: ch = ret 0x97a3c78, 0x97a40f0, 0x97a4538
There is only one call to LowerOperation with an ISD::CALL node. At the end of legalization the DAG looks like this:
SelectionDAG has 32 nodes:
0x9c06c78: ch = EntryToken
0x9c073e0: <multiple use>
0x9c071a0: <multiple use>
0x9c06ed8: i32 = extract_element 0x9c073e0, 0x9c071a0
0x9c06f30: ch = ArgFlags < zext orig-align:8 >
0x9c06f68: ch = ArgFlags < zext orig-align:1 >
0x9c071a0: i32 = Constant <0>
0x9c06c78: <multiple use>
0x9c072a8: i32 = Register #1024
0x9c072e0: i32,ch = CopyFromReg 0x9c06c78, 0x9c072a8
0x9c06c78: <multiple use>
0x9c07350: i32 = Register #1025
0x9c07388: i32,ch = CopyFromReg 0x9c06c78, 0x9c07350
0x9c072e0: <multiple use>
0x9c07388: <multiple use>
0x9c07200: f64 = build_pair 0x9c072e0, 0x9c07388
0x9c073e0: i64 = bit_convert 0x9c07200
0x9c073e0: <multiple use>
0x9c06e28: i32 = Constant <1>
0x9c07428: i32 = extract_element 0x9c073e0, 0x9c06e28
0x9c08620: <multiple use>
0x9c08690: <multiple use>
0x9c07388: <multiple use>
0x9c08620: <multiple use>
0x9c08520: ch,flag = CopyToReg 0x9c08620, 0x9c08690, 0x9c07388, 0x9c08620:1
0x9c08550: i32 = Constant <4>
0x9c085e8: i32 = Register r0
0x9c06c78: <multiple use>
0x9c08550: <multiple use>
0x9c08590: ch,flag = callseq_start 0x9c06c78, 0x9c08550
0x9c085e8: <multiple use>
0x9c072e0: <multiple use>
0x9c08620: ch,flag = CopyToReg 0x9c08590, 0x9c085e8, 0x9c072e0
0x9c08690: i32 = Register r1
0x9c08720: i32 = Register r2
0x9c08520: <multiple use>
0x9c08720: <multiple use>
0x9c072e0: <multiple use>
0x9c08520: <multiple use>
0x9c08758: ch,flag = CopyToReg 0x9c08520, 0x9c08720, 0x9c072e0, 0x9c08520:1
0x9c087e0: i32 = Register r3
0x9c08758: <multiple use>
0x9c087e0: <multiple use>
0x9c07388: <multiple use>
0x9c08758: <multiple use>
0x9c08818: ch,flag = CopyToReg 0x9c08758, 0x9c087e0, 0x9c07388, 0x9c08758:1
0x9c08818: <multiple use>
0x9c088a0: i32 = TargetExternalSymbol '__unorddf2'
0x9c085e8: <multiple use>
0x9c08690: <multiple use>
0x9c08720: <multiple use>
0x9c087e0: <multiple use>
0x9c08818: <multiple use>
0x9c088d8: ch,flag = BL 0x9c08818, 0x9c088a0, 0x9c085e8, 0x9c08690, 0x9c08720, 0x9c087e0, 0x9c08818:1
0x9c088d8: <multiple use>
0x9c08550: <multiple use>
0x9c071a0: <multiple use>
0x9c088d8: <multiple use>
0x9c08998: ch,flag = callseq_end 0x9c088d8, 0x9c08550, 0x9c071a0, 0x9c088d8:1
0x9c085e8: <multiple use>
0x9c06c78: <multiple use>
0x9c071a0: <multiple use>
0x9c07168: i32 = ExternalSymbol '__unorddf2'
0x9c06ed8: <multiple use>
0x9c06f30: <multiple use>
0x9c07428: <multiple use>
0x9c06f68: <multiple use>
0x9c06ed8: <multiple use>
0x9c06f30: <multiple use>
0x9c07428: <multiple use>
0x9c06f68: <multiple use>
0x9c06fa0: i32,i32,ch = call 0x9c06c78, 0x9c071a0, 0x9c071a0, 0x9c071a0, 0x9c07168, 0x9c06ed8, 0x9c06f30, 0x9c07428, 0x9c06f68, 0x9c06ed8, 0x9c06f30, 0x9c07428, 0x9c06f68
0x9c071a0: <multiple use>
0x9c08498: ch = setne
0x9c074c8: i32 = setcc 0x9c06fa0, 0x9c071a0, 0x9c08498
0x9c08c98: ch,flag = CopyToReg 0x9c08998, 0x9c085e8, 0x9c074c8
0x9c08c98: <multiple use>
0x9c071a0: <multiple use>
0x9c08c98: <multiple use>
0x9c08d08: ch = RETSP 0x9c08c98, 0x9c071a0, 0x9c08c98:1
Here BL and RETSP are target-specific nodes for a call and a return, respectively.
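For reference, the computation that the "setcc <call result>, 0, setne" pair in the dump above is meant to express can be sketched in C++. This is only a model, assuming the libgcc convention that __unorddf2 returns a nonzero value when either argument is NaN and zero otherwise. The names unorddf2_model and fcmp_uno_soft are hypothetical and do not appear in LLVM or in the backend under discussion.

```cpp
#include <cmath>

// Hypothetical stand-in for the soft-float helper (assumed libgcc
// convention): nonzero if either input is NaN, zero otherwise.
int unorddf2_model(double a, double b) {
    return (std::isnan(a) || std::isnan(b)) ? 1 : 0;
}

// Mirrors the legalized form "setcc <call result>, 0, setne":
// the unordered compare is true when the helper returns nonzero.
bool fcmp_uno_soft(double a, double b) {
    return unorddf2_model(a, b) != 0;
}
```

Under these assumptions, fcmp_uno_soft(a, 0.0) corresponds to the fcmp uno in the original test case: it is true only when a is NaN.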
I tracked this down to SelectionDAGLegalize::ExpandLibCall().
Here the TargetLowering's LowerCallTo is called to create the call node,
which returns a pair of operands for the chain and the result. The chain
is legalized but the result isn't. Modifying the code to call LegalizeOp on
the result seems to fix the problem I was having: the code
compiles and I see a call to __unorddf2 in the assembly.
I'm not entirely sure if this fix is correct but it seems to work so far for
my target. See the attached diff.
This seems to break the convention. It should be the responsibility of the caller to further legalize the results.
Evan
That makes sense. In that case I believe SelectionDAGLegalize::LegalizeSetCCOperands
should be legalizing the result. The description of this function says it tries to create a
legal LHS and RHS, but in this case it fails to return a legal LHS. The following patch allows me to
compile my original file.
legalize.diff (335 Bytes)
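The convention discussed above (the libcall expansion returns its result as-is, and the caller is responsible for legalizing it) can be illustrated with a toy model. This is not LLVM code and not the contents of the attached patch: Node, Opcode, legalizeOp, expandLibCall, and lowerFloatCompare are all hypothetical names, and the only "illegal" operation in this sketch is a generic Call.

```cpp
#include <memory>
#include <vector>

// Toy stand-in for a DAG node; NOT LLVM's SDNode.
enum class Opcode { Call, TargetCall, Constant, SetCC };

struct Node {
    Opcode op;
    std::vector<std::shared_ptr<Node>> operands;
};
using NodeRef = std::shared_ptr<Node>;

// In this toy, a tree is legal when it contains no generic Call node.
bool isLegalTree(const NodeRef &n) {
    if (n->op == Opcode::Call) return false;
    for (const auto &o : n->operands)
        if (!isLegalTree(o)) return false;
    return true;
}

// Toy legalizer: rewrites generic calls into target calls, recursively.
NodeRef legalizeOp(const NodeRef &n) {
    auto out = std::make_shared<Node>();
    out->op = (n->op == Opcode::Call) ? Opcode::TargetCall : n->op;
    for (const auto &o : n->operands) out->operands.push_back(legalizeOp(o));
    return out;
}

// Models a libcall expansion that hands back a still-generic call result.
NodeRef expandLibCall() {
    return std::make_shared<Node>(Node{Opcode::Call, {}});
}

// Builds "setcc(libcall result, 0)". The flag selects whether the caller
// legalizes the libcall result before using it as the new LHS.
NodeRef lowerFloatCompare(bool callerLegalizesResult) {
    NodeRef lhs = expandLibCall();
    if (callerLegalizesResult) lhs = legalizeOp(lhs);
    NodeRef zero = std::make_shared<Node>(Node{Opcode::Constant, {}});
    return std::make_shared<Node>(Node{Opcode::SetCC, {lhs, zero}});
}
```

In this model, lowerFloatCompare(false) leaves a generic Call under the setcc, which is the analogue of the stray ISD::CALL in the final DAG, while lowerFloatCompare(true) yields a tree with only legal nodes.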