Hi everyone! I am trying to work on this issue: [AArch64] Missed FCCMP opportunity · Issue #60819 · llvm/llvm-project · GitHub. I need to improve the machine code that LLVM generates so that it matches what GCC generates. I am new to LLVM, so could anyone give me some guidance on solving this issue?
I added some notes on the bug. “-debug-only=isel” is generally useful for looking at issues related to SelectionDAG.
I think that to optimize the assembly I need to write a separate function, something like this:
static SDValue performFloatOpt(SDNode *N, SelectionDAG &DAG) {
  EVT VT = N->getValueType(0);
  SDValue Cmp0 = N->getOperand(0);
  SDValue Cmp1 = N->getOperand(1);
  SDLoc DL(N);
  SDValue Cmp, Condition;
  unsigned NZCV;

  if (!Cmp0.getValueType().isFloatingPoint() ||
      !Cmp1.getValueType().isFloatingPoint())
    return SDValue();
  if (!Cmp0->hasOneUse() || !Cmp1->hasOneUse())
    return SDValue();

  Condition = DAG.getConstant(AArch64CC::VS, DL, MVT::i32);
  NZCV = AArch64CC::getNZCVToSatisfyCondCode(AArch64CC::VS);
  Cmp = Cmp1;
  SDValue NZCVOp = DAG.getConstant(NZCV, DL, MVT::i32);
  SDValue CCmp = DAG.getNode(AArch64ISD::FCCMP, DL, MVT::i32,
                             Cmp.getOperand(0), Cmp.getOperand(1), NZCVOp,
                             Condition);
  SDValue CSel = DAG.getNode(AArch64ISD::CSINC, DL, MVT::i32,
                             DAG.getConstant(0, DL, MVT::i32), CCmp, Condition);
  return DAG.getNode(ISD::AND, DL, MVT::i32, Cmp0.getOperand(0),
                     Cmp1.getOperand(0));
}
And I am calling this function inside performORCombine() and performANDCombine(), but I don’t know why it’s not working. Can you please provide your insights into it?
Not sure what “not working” means. If you mean the optimization isn’t triggering, maybe add some debug prints to check that your code is getting called?
I’m also not sure the FCCMP node you’re creating is correct; I think there are normally five operands?
Yes, the optimization is not triggering. I tried adding debug messages and I think the function is not getting called because the debug message is not printing.
I don’t know exactly whether there are five operands. But if five operands are required, then it should give an error during the build, right?
The way the getNode() API works, the API itself doesn’t know the right number of operands for target-specific nodes, so getting it wrong would probably show up as an error at runtime during the final isel step.
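To illustrate the point about operand counts, here is a minimal standalone sketch. It does not use LLVM at all: ToyNode, makeNode, selectNode, and the TOY_ opcode names are hypothetical stand-ins, and the operand count of five is only taken from the guess earlier in this thread. It models a node factory that accepts any number of operands and a later selection step that is the first place a wrong count gets rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DAG node: an opcode name plus operand names.
// These are NOT LLVM types; they only model the behaviour described above.
struct ToyNode {
  std::string Opcode;
  std::vector<std::string> Operands;
};

// Like a generic node factory, this accepts any operand count without checks.
ToyNode makeNode(const std::string &Opcode, std::vector<std::string> Operands) {
  return ToyNode{Opcode, std::move(Operands)};
}

// Toy "instruction selector": each pattern expects a fixed operand count.
// The count of 5 for TOY_FCCMP is an assumption, not a verified fact.
std::string selectNode(const ToyNode &N) {
  static const std::map<std::string, std::size_t> Patterns = {
      {"TOY_FCMP", 2}, {"TOY_FCCMP", 5}};
  auto It = Patterns.find(N.Opcode);
  if (It == Patterns.end() || It->second != N.Operands.size())
    throw std::runtime_error("cannot select: " + N.Opcode);
  return "selected " + N.Opcode;
}

// Building a four-operand node succeeds; only selection rejects it.
void demo() {
  ToyNode Bad = makeNode("TOY_FCCMP", {"lhs", "rhs", "nzcv", "cc"});
  assert(Bad.Operands.size() == 4);
  bool Threw = false;
  try {
    selectNode(Bad);
  } catch (const std::runtime_error &) {
    Threw = true;
  }
  assert(Threw);
}
```

The takeaway is only that a successful build, or even a successful call that creates the node, says nothing about whether the node can be selected later.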
Looking again, I’m not sure Cmp0.getValueType().isFloatingPoint() is checking what you want it to (Cmp0.getValueType() is the type of the result of the compare, not the operands).
=== foo2
Initial selection DAG: %bb.0 'foo2:entry'
SelectionDAG has 15 nodes:
t0: ch,glue = EntryToken
t2: f32,ch = CopyFromReg t0, Register:f32 %0
t7: i1 = setcc t2, ConstantFP:f32<0.000000e+00>, setuo:ch
t4: f32,ch = CopyFromReg t0, Register:f32 %1
t8: i1 = setcc t4, ConstantFP:f32<0.000000e+00>, setuo:ch
t9: i1 = and t7, t8
t10: i32 = any_extend t9
t11: i32 = zero_extend t9
t13: ch,glue = CopyToReg t0, Register:i32 $w0, t11
t14: ch = AArch64ISD::RET_FLAG t13, Register:i32 $w0, t13:1
Optimized lowered selection DAG: %bb.0 'foo2:entry'
SelectionDAG has 13 nodes:
t0: ch,glue = EntryToken
t2: f32,ch = CopyFromReg t0, Register:f32 %0
t4: f32,ch = CopyFromReg t0, Register:f32 %1
t16: i1 = setcc t2, t2, setuo:ch
t15: i1 = setcc t4, t4, setuo:ch
t9: i1 = and t16, t15
t11: i32 = zero_extend t9
t13: ch,glue = CopyToReg t0, Register:i32 $w0, t11
t14: ch = AArch64ISD::RET_FLAG t13, Register:i32 $w0, t13:1
Type-legalized selection DAG: %bb.0 'foo2:entry'
SelectionDAG has 14 nodes:
t0: ch,glue = EntryToken
t2: f32,ch = CopyFromReg t0, Register:f32 %0
t4: f32,ch = CopyFromReg t0, Register:f32 %1
t17: i32 = setcc t2, t2, setuo:ch
t18: i32 = setcc t4, t4, setuo:ch
t19: i32 = and t17, t18
t21: i32 = and t19, Constant:i32<1>
t13: ch,glue = CopyToReg t0, Register:i32 $w0, t21
t14: ch = AArch64ISD::RET_FLAG t13, Register:i32 $w0, t13:1
Optimized type-legalized selection DAG: %bb.0 'foo2:entry'
SelectionDAG has 12 nodes:
t0: ch,glue = EntryToken
t2: f32,ch = CopyFromReg t0, Register:f32 %0
t4: f32,ch = CopyFromReg t0, Register:f32 %1
t17: i32 = setcc t2, t2, setuo:ch
t18: i32 = setcc t4, t4, setuo:ch
t19: i32 = and t17, t18
t13: ch,glue = CopyToReg t0, Register:i32 $w0, t19
t14: ch = AArch64ISD::RET_FLAG t13, Register:i32 $w0, t13:1
Legalized selection DAG: %bb.0 'foo2:entry'
SelectionDAG has 16 nodes:
t0: ch,glue = EntryToken
t2: f32,ch = CopyFromReg t0, Register:f32 %0
t4: f32,ch = CopyFromReg t0, Register:f32 %1
t27: f32 = AArch64ISD::FCMP t2, t2
t28: i32 = AArch64ISD::CSEL Constant:i32<0>, Constant:i32<1>, Constant:i32<7>, t27
t24: f32 = AArch64ISD::FCMP t4, t4
t26: i32 = AArch64ISD::CSEL Constant:i32<0>, Constant:i32<1>, Constant:i32<7>, t24
t19: i32 = and t28, t26
t13: ch,glue = CopyToReg t0, Register:i32 $w0, t19
t14: ch = AArch64ISD::RET_FLAG t13, Register:i32 $w0, t13:1
Optimized legalized selection DAG: %bb.0 'foo2:entry'
SelectionDAG has 16 nodes:
t0: ch,glue = EntryToken
t2: f32,ch = CopyFromReg t0, Register:f32 %0
t4: f32,ch = CopyFromReg t0, Register:f32 %1
t27: f32 = AArch64ISD::FCMP t2, t2
t28: i32 = AArch64ISD::CSEL Constant:i32<0>, Constant:i32<1>, Constant:i32<7>, t27
t24: f32 = AArch64ISD::FCMP t4, t4
t26: i32 = AArch64ISD::CSEL Constant:i32<0>, Constant:i32<1>, Constant:i32<7>, t24
t19: i32 = and t28, t26
t13: ch,glue = CopyToReg t0, Register:i32 $w0, t19
t14: ch = AArch64ISD::RET_FLAG t13, Register:i32 $w0, t13:1
===== Instruction selection begins: %bb.0 'entry'
ISEL: Starting selection on root node: t14: ch = AArch64ISD::RET_FLAG t13, Register:i32 $w0, t13:1
ISEL: Starting pattern match
Morphed node: t14: ch = RET_ReallyLR Register:i32 $w0, t13, t13:1
ISEL: Match complete!
ISEL: Starting selection on root node: t13: ch,glue = CopyToReg t0, Register:i32 $w0, t19
ISEL: Starting selection on root node: t19: i32 = and t28, t26
ISEL: Starting pattern match
Initial Opcode index to 316493
Match failed at index 316497
Continuing at 316863
Match failed at index 316866
Continuing at 316911
Match failed at index 316913
Continuing at 316959
Match failed at index 316965
Continuing at 317002
Morphed node: t19: i32 = CSELWr Register:i32 $wzr, t28, TargetConstant:i32<7>, t32:1
ISEL: Match complete!
ISEL: Starting selection on root node: t28: i32 = AArch64ISD::CSEL Constant:i32<0>, Constant:i32<1>, Constant:i32<7>, t27
ISEL: Starting pattern match
Initial Opcode index to 358582
TypeSwitch[i32] from 358599 to 358602
Morphed node: t28: i32 = CSINCWr Register:i32 $wzr, Register:i32 $wzr, TargetConstant:i32<7>, t33:1
ISEL: Match complete!
ISEL: Starting selection on root node: t24: f32 = AArch64ISD::FCMP t4, t4
ISEL: Starting pattern match
Initial Opcode index to 368891
Skipped scope entry (due to false predicate) at index 368894, continuing at 368927
Match failed at index 368933
Continuing at 368948
Morphed node: t24: i32 = FCMPSrr nofpexcept t4, t4
ISEL: Match complete!
ISEL: Starting selection on root node: t27: f32 = AArch64ISD::FCMP t2, t2
ISEL: Starting pattern match
Initial Opcode index to 368891
Skipped scope entry (due to false predicate) at index 368894, continuing at 368927
Match failed at index 368933
Continuing at 368948
Morphed node: t27: i32 = FCMPSrr nofpexcept t2, t2
ISEL: Match complete!
ISEL: Starting selection on root node: t4: f32,ch = CopyFromReg t0, Register:f32 %1
ISEL: Starting selection on root node: t2: f32,ch = CopyFromReg t0, Register:f32 %0
ISEL: Starting selection on root node: t12: i32 = Register $w0
ISEL: Starting selection on root node: t3: f32 = Register %1
ISEL: Starting selection on root node: t1: f32 = Register %0
ISEL: Starting selection on root node: t0: ch,glue = EntryToken
===== Instruction selection ends:
Selected selection DAG: %bb.0 'foo2:entry'
SelectionDAG has 17 nodes:
t0: ch,glue = EntryToken
t2: f32,ch = CopyFromReg t0, Register:f32 %0
t4: f32,ch = CopyFromReg t0, Register:f32 %1
t27: i32 = FCMPSrr nofpexcept t2, t2
t33: ch,glue = CopyToReg t0, Register:f32 $nzcv, t27
t28: i32 = CSINCWr Register:i32 $wzr, Register:i32 $wzr, TargetConstant:i32<7>, t33:1
t24: i32 = FCMPSrr nofpexcept t4, t4
t32: ch,glue = CopyToReg t0, Register:f32 $nzcv, t24
t19: i32 = CSELWr Register:i32 $wzr, t28, TargetConstant:i32<7>, t32:1
t13: ch,glue = CopyToReg t0, Register:i32 $w0, t19
t14: ch = RET_ReallyLR Register:i32 $w0, t13, t13:1
Total amount of phi nodes to update: 0
*** MachineFunction at end of ISel ***
# Machine code for function foo2: IsSSA, TracksLiveness
Function Live Ins: $s0 in %0, $s1 in %1
bb.0.entry:
liveins: $s0, $s1
%1:fpr32 = COPY $s1
%0:fpr32 = COPY $s0
nofpexcept FCMPSrr %0:fpr32, %0:fpr32, implicit-def $nzcv, implicit $fpcr
%2:gpr32 = CSINCWr $wzr, $wzr, 7, implicit $nzcv
nofpexcept FCMPSrr %1:fpr32, %1:fpr32, implicit-def $nzcv, implicit $fpcr
%3:gpr32 = CSELWr $wzr, killed %2:gpr32, 7, implicit $nzcv
$w0 = COPY %3:gpr32
RET_ReallyLR implicit $w0
# End machine code for function foo2.
=== main
Initial selection DAG: %bb.0 'main:entry'
SelectionDAG has 5 nodes:
t0: ch,glue = EntryToken
t3: ch,glue = CopyToReg t0, Register:i32 $w0, Constant:i32<0>
t4: ch = AArch64ISD::RET_FLAG t3, Register:i32 $w0, t3:1
Optimized lowered selection DAG: %bb.0 'main:entry'
SelectionDAG has 5 nodes:
t0: ch,glue = EntryToken
t3: ch,glue = CopyToReg t0, Register:i32 $w0, Constant:i32<0>
t4: ch = AArch64ISD::RET_FLAG t3, Register:i32 $w0, t3:1
Type-legalized selection DAG: %bb.0 'main:entry'
SelectionDAG has 5 nodes:
t0: ch,glue = EntryToken
t3: ch,glue = CopyToReg t0, Register:i32 $w0, Constant:i32<0>
t4: ch = AArch64ISD::RET_FLAG t3, Register:i32 $w0, t3:1
Legalized selection DAG: %bb.0 'main:entry'
SelectionDAG has 5 nodes:
t0: ch,glue = EntryToken
t3: ch,glue = CopyToReg t0, Register:i32 $w0, Constant:i32<0>
t4: ch = AArch64ISD::RET_FLAG t3, Register:i32 $w0, t3:1
Optimized legalized selection DAG: %bb.0 'main:entry'
SelectionDAG has 5 nodes:
t0: ch,glue = EntryToken
t3: ch,glue = CopyToReg t0, Register:i32 $w0, Constant:i32<0>
t4: ch = AArch64ISD::RET_FLAG t3, Register:i32 $w0, t3:1
===== Instruction selection begins: %bb.0 'entry'
ISEL: Starting selection on root node: t4: ch = AArch64ISD::RET_FLAG t3, Register:i32 $w0, t3:1
ISEL: Starting pattern match
Initial Opcode index to 381846
Morphed node: t4: ch = RET_ReallyLR Register:i32 $w0, t3, t3:1
ISEL: Match complete!
ISEL: Starting selection on root node: t3: ch,glue = CopyToReg t0, Register:i32 $w0, Constant:i32<0>
ISEL: Starting selection on root node: t2: i32 = Register $w0
ISEL: Starting selection on root node: t1: i32 = Constant<0>
ISEL: Starting selection on root node: t0: ch,glue = EntryToken
===== Instruction selection ends:
Selected selection DAG: %bb.0 'main:entry'
SelectionDAG has 6 nodes:
t0: ch,glue = EntryToken
t6: i32,ch = CopyFromReg t0, Register:i32 $wzr
t3: ch,glue = CopyToReg t0, Register:i32 $w0, t6
t4: ch = RET_ReallyLR Register:i32 $w0, t3, t3:1
Total amount of phi nodes to update: 0
*** MachineFunction at end of ISel ***
# Machine code for function main: IsSSA, TracksLiveness
bb.0.entry:
%0:gpr32all = COPY $wzr
$w0 = COPY %0:gpr32all
RET_ReallyLR implicit $w0
# End machine code for function main.
This is generated while running the isel command.
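For reading the dump: the Constant:i32<7> operand on the CSEL nodes is a condition code. Assuming it follows the standard 4-bit AArch64 condition encoding (which, as far as I know, is what the AArch64CC enum mirrors), a small standalone decoder might look like the sketch below. The helper names condCodeName and invertCondCode are made up for illustration and are not LLVM functions.

```cpp
#include <array>
#include <cassert>
#include <string>

// Standard AArch64 condition-code mnemonics, indexed by their 4-bit encoding.
std::string condCodeName(unsigned Code) {
  static const std::array<const char *, 16> Names = {
      "eq", "ne", "hs", "lo", "mi", "pl", "vs", "vc",
      "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"};
  assert(Code < Names.size() && "condition codes are 4 bits wide");
  return Names[Code];
}

// Codes come in complementary pairs that differ only in the lowest bit
// (eq/ne, hs/lo, ..., vs/vc); al and nv are excluded.
unsigned invertCondCode(unsigned Code) {
  assert(Code < 14 && "al/nv have no meaningful inverse");
  return Code ^ 1u;
}
```

Under that assumption, 7 decodes to vc and its inverse 6 to vs, so `CSEL 0, 1, 7` produces 1 exactly when the V flag is set, which is how an unordered compare (setuo) is reported.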
Okay! If isFloatingPoint() might be causing the issue, then we can also use the dump() function, but I’m not sure about its syntax.
But isFloatingPoint() is also used in emitConditionalComparison to check for floating point, right? So I guess it should not cause a problem, but if you can help me with the dump() syntax then I can try that too.
N->dump() or Cmp0->dump() should just work, I think? It dumps the node to stderr in a similar format to the debug dumps.
You can also write text to stderr using something like errs() << "ENTERING CODE SECTION\n"; which can be helpful to see where control flow goes.
A floating-point compare has floating-point operands, but the result is an integer.
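A toy model of that distinction, as a sketch only: ToyType, ToyValue, makeCompare, and isFloatCompare are hypothetical names and none of this is LLVM code. The compare node always has an integer result, so a floating-point test has to look at its operands. In the real API the analogous check would presumably be on something like Cmp0.getOperand(0).getValueType(), but that is my assumption about what the combine wants.

```cpp
#include <cassert>
#include <utility>
#include <vector>

enum class ToyType { I1, I32, F32 };

bool isFloatingPoint(ToyType T) { return T == ToyType::F32; }

// Toy value: a result type plus the values it was computed from.
struct ToyValue {
  ToyType ResultType;
  std::vector<ToyValue> Operands;
};

// A setcc-like node: the result is a boolean (integer) type no matter
// what type the two compared operands have.
ToyValue makeCompare(ToyValue LHS, ToyValue RHS) {
  return ToyValue{ToyType::I1, {std::move(LHS), std::move(RHS)}};
}

// To ask "is this a floating-point compare?", inspect the operands,
// not the compare's own result type.
bool isFloatCompare(const ToyValue &Cmp) {
  return Cmp.Operands.size() == 2 &&
         isFloatingPoint(Cmp.Operands[0].ResultType) &&
         isFloatingPoint(Cmp.Operands[1].ResultType);
}
```

With two F32 leaves, isFloatingPoint on the compare's result type is false while isFloatCompare is true, which mirrors the `i1 = setcc t2, ..., setuo` lines in the dump where t2 is an f32.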
Okay! Let me try it.