Here is one last example to illustrate my concern.
The problem concerns the lowering of node t13, the first volatile store in the DAG below.
Initial selection DAG: BB#0 '_start:entry'
SelectionDAG has 44 nodes:
t11: i16 = Constant<0>
t0: ch = EntryToken
t3: ch = llvm.clp.set.rspa t0, TargetConstant:i16<392>, Constant:i32<64>
t5: ch = llvm.clp.set.rspb t3, TargetConstant:i16<393>, Constant:i32<64>
t8: ch = llvm.clp.set.rspsu t5, TargetConstant:i16<394>, Constant:i32<8>
t13: ch = store<Volatile ST4@x1> t8, ConstantFP:f32<1.000000e+00>, GlobalAddress:i16<float* @x1> 0, undef:i16
t16: ch = store<Volatile ST4@x2> t13, ConstantFP:f32<2.000000e+00>, GlobalAddress:i16<float* @x2> 0, undef:i16
t19: ch = store<Volatile ST4@x3> t16, ConstantFP:f32<3.000000e+00>, GlobalAddress:i16<float* @x3> 0, undef:i16
t22: ch = store<Volatile ST4@x4> t19, ConstantFP:f32<4.000000e+00>, GlobalAddress:i16<float* @x4> 0, undef:i16
t23: f32,ch = load<Volatile LD4@x1> t22, GlobalAddress:i16<float* @x1> 0, undef:i16
t24: f32,ch = load<Volatile LD4@x2> t23:1, GlobalAddress:i16<float* @x2> 0, undef:i16
t25: f32,ch = load<Volatile LD4@x3> t24:1, GlobalAddress:i16<float* @x3> 0, undef:i16
t26: f32,ch = load<Volatile LD4@x4> t25:1, GlobalAddress:i16<float* @x4> 0, undef:i16
t27: i16 = GlobalAddress<float (float, float, float, float)* @fdivfaddfmul_a> 0
t29: ch,glue = callseq_start t26:1, TargetConstant:i16<4>
t31: ch,glue = CLPISD::COPY_TO_CALLEE_A t29, t23, FrameIndex:i16<0>, t29:1
t33: ch,glue = CLPISD::COPY_TO_CALLEE_A t31, t24, FrameIndex:i16<1>, t31:1
t35: ch,glue = CLPISD::COPY_TO_CALLEE_A t33, t25, FrameIndex:i16<2>, t33:1
t37: ch,glue = CLPISD::COPY_TO_CALLEE_A t35, t26, FrameIndex:i16<3>, t35:1
t39: ch,glue = CLPISD::CALLSEQ t37, TargetGlobalAddress:i16<float (float, float, float, float)* @fdivfaddfmul_a> 0, t37:1
t41: ch,glue = callseq_end t39, TargetConstant:i16<4>, TargetConstant:i16<0>, t39:1
t42: f32,ch,glue = CLPISD::COPY_TO_CALLER_A t41, FrameIndex:i16<0>, t41:1
t43: ch = CLPISD::RET_FLAG t42:1
Node t13 is first ‘combined’ into node t51: its ConstantFP f32 operand is bitcast to a Constant i32.
Combining: t13: ch = store<Volatile ST4@x1> t8, ConstantFP:f32<1.000000e+00>, GlobalAddress:i16<float* @x1> 0, undef:i16
 ... into: t51: ch = store<Volatile ST4@x1> t8, Constant:i32<1065353216>, GlobalAddress:i16<float* @x1> 0, undef:i16
A new Constant:i32 node (t50) is also created.
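As a sanity check on the combine, the i32 constant should simply be the IEEE-754 single-precision bit pattern of the original f32 value. A minimal Python sketch (the helper name `f32_bits` is made up for illustration and is not part of LLVM) reproduces the constants that appear in the dumps for 1.0, 2.0, 3.0 and 4.0:

```python
import struct


def f32_bits(value: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of value as an unsigned int."""
    # Pack as a little-endian 32-bit float, then reinterpret the same four
    # bytes as a little-endian unsigned 32-bit integer (a bitcast, not a
    # numeric conversion).
    return struct.unpack("<I", struct.pack("<f", value))[0]


# The four float constants stored to @x1..@x4 in the DAG.
for v in (1.0, 2.0, 3.0, 4.0):
    print(f"{v}: {f32_bits(v)} (0x{f32_bits(v):08X})")
```

The first value printed is 1065353216 (0x3F800000), which matches the Constant:i32 operand of t51, so the combine itself looks correct and only the placement of the new node is in question.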
The question is: where in the graph is this node created?
It seems to be created before the EntryToken, that is, before the initialization of my stack pointer registers (RSPA, RSPB, RSPSU).
Why isn’t it created between t8 and t51?
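One possible way to frame the question, as a sketch only: assume the order of a DAG listing reflects nothing more than operand dependencies (an assumption, not something confirmed here). Under that assumption, a node with no operands, such as a constant, may be listed anywhere before its first user. The Python snippet below models a hand-simplified subset of the DAG (node names are taken from the dumps, the edges are reduced to plain operand lists, and the helper names are hypothetical) and checks what t50 is actually ordered against:

```python
# Hand-simplified operand lists for a few nodes of the DAG.
# Each node maps to the nodes it uses (chain or value operands).
operands = {
    "t0": [],              # EntryToken
    "t3": ["t0"],          # set.rspa, chained on t0
    "t5": ["t3"],          # set.rspb, chained on t3
    "t8": ["t5"],          # set.rspsu, chained on t5
    "t50": [],             # Constant:i32, no operands and no chain
    "t51": ["t8", "t50"],  # store: chain t8, stored value t50
}


def depends_on(node, other):
    """Return True if `node` transitively uses `other`."""
    stack = list(operands[node])
    seen = set()
    while stack:
        current = stack.pop()
        if current == other:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(operands[current])
    return False


def is_valid_order(order):
    """Return True if every node is listed after all of its operands."""
    position = {name: index for index, name in enumerate(order)}
    return all(position[op] < position[node]
               for node in order for op in operands[node])


print(depends_on("t50", "t8"))   # does the constant depend on the RSPSU setup?
print(depends_on("t51", "t50"))  # does the store depend on the constant?
print(is_valid_order(["t50", "t0", "t3", "t5", "t8", "t51"]))
print(is_valid_order(["t0", "t3", "t5", "t8", "t50", "t51"]))
```

In this model t50 has no dependency on t8, and both listings satisfy every operand edge. If the assumption holds, the position of t50 in the dump does not by itself say where the instruction ends up relative to the RSPA/RSPB/RSPSU setup; what matters is whether anything forces it to come after t8.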
t0: ch = EntryToken
t3: ch = llvm.clp.set.rspa t0, TargetConstant:i16<392>, Constant:i32<64>
t5: ch = llvm.clp.set.rspb t3, TargetConstant:i16<393>, Constant:i32<64>
t8: ch = llvm.clp.set.rspsu t5, TargetConstant:i16<394>, Constant:i32<8>
t51: ch = store<Volatile ST4@x1> t8, Constant:i32<1065353216>, TargetGlobalAddress:i32<float* @x1> 0, undef:i16
t49: ch = store<Volatile ST4@x2> t51, Constant:i32<1073741824>, TargetGlobalAddress:i32<float* @x2> 0, undef:i16
t47: ch = store<Volatile ST4@x3> t49, Constant:i32<1077936128>, TargetGlobalAddress:i32<float* @x3> 0, undef:i16
t45: ch = store<Volatile ST4@x4> t47, Constant:i32<1082130432>, TargetGlobalAddress:i32<float* @x4> 0, undef:i16
t23: f32,ch = load<Volatile LD4@x1> t45, TargetGlobalAddress:i32<float* @x1> 0, undef:i16
t24: f32,ch = load<Volatile LD4@x2> t23:1, TargetGlobalAddress:i32<float* @x2> 0, undef:i16
t25: f32,ch = load<Volatile LD4@x3> t24:1, TargetGlobalAddress:i32<float* @x3> 0, undef:i16
t26: f32,ch = load<Volatile LD4@x4> t25:1, TargetGlobalAddress:i32<float* @x4> 0, undef:i16
t29: ch,glue = callseq_start t26:1, TargetConstant:i16<4>
t31: ch,glue = CLPISD::COPY_TO_CALLEE_A t29, t23, TargetFrameIndex:i16<0>, t29:1
t33: ch,glue = CLPISD::COPY_TO_CALLEE_A t31, t24, TargetFrameIndex:i16<1>, t31:1
t35: ch,glue = CLPISD::COPY_TO_CALLEE_A t33, t25, TargetFrameIndex:i16<2>, t33:1
t37: ch,glue = CLPISD::COPY_TO_CALLEE_A t35, t26, TargetFrameIndex:i16<3>, t35:1
t39: ch,glue = CLPISD::CALLSEQ t37, TargetGlobalAddress:i16<float (float, float, float, float)* @fdivfaddfmul_a> 0, t37:1
t41: ch,glue = callseq_end t39, TargetConstant:i16<4>, TargetConstant:i16<0>, t39:1
t42: f32,ch,glue = CLPISD::COPY_TO_CALLER_A t41, TargetFrameIndex:i16<0>, t41:1
t43: ch = CLPISD::RET_FLAG t42:1
ISEL: Starting pattern match on root node: t50: i32 = Constant<1065353216>
Initial Opcode index to 415
TypeSwitch[i32] from 416 to 432
Morphed node: t50: i32 = MOVSUTO_A_iSLo TargetConstant:i32<1065353216>
def : Pat<(f32 fpimm:$imm),
          (MOVSUTO_A_iSLo (bitcast_fpimm_to_i32 f32:$imm))>;
def : Pat<(i32 imm:$imm),
          (MOVSUTO_A_iSLo (trunc_imm i32:$imm))>;
===== Instruction selection ends:
Selected selection DAG: BB#0 '_start:entry'
SelectionDAG has 42 nodes:
t44: i32 = MOVSUTO_A_iSLo TargetConstant:i32<1082130432>
t46: i32 = MOVSUTO_A_iSLo TargetConstant:i32<1077936128>
t48: i32 = MOVSUTO_A_iSLo TargetConstant:i32<1073741824>
t50: i32 = MOVSUTO_A_iSLo TargetConstant:i32<1065353216>
t0: ch = EntryToken
t3: ch = MOV_SU_iSSs_rspa TargetConstant:i32<64>, t0
t5: ch = MOV_SU_iSSs_rspb TargetConstant:i32<64>, t3
t8: ch = MOV_SU_iSSs_rspsu TargetConstant:i32<8>, t5
t51: ch = MOV_A_or t50, TargetGlobalAddress:i32<float* @x1> 0, t8
t49: ch = MOV_A_or t48, TargetGlobalAddress:i32<float* @x2> 0, t51
t47: ch = MOV_A_or t46, TargetGlobalAddress:i32<float* @x3> 0, t49
t45: ch = MOV_A_or t44, TargetGlobalAddress:i32<float* @x4> 0, t47
t23: f32,ch = MOV_A_ro TargetGlobalAddress:i32<float* @x1> 0, t45
t24: f32,ch = MOV_A_ro TargetGlobalAddress:i32<float* @x2> 0, t23:1
t25: f32,ch = MOV_A_ro TargetGlobalAddress:i32<float* @x3> 0, t24:1
t26: f32,ch = MOV_A_ro TargetGlobalAddress:i32<float* @x4> 0, t25:1
t29: ch,glue = CALLSEQ_START TargetConstant:i16<4>, t26:1
t31: ch,glue = COPY_TO_CALLEE_A_FROM_GLOBAL TargetGlobalAddress:i32<float* @x1> 0, TargetFrameIndex:i16<0>, t29, t29:1
t33: ch,glue = COPY_TO_CALLEE_A_FROM_GLOBAL TargetGlobalAddress:i32<float* @x2> 0, TargetFrameIndex:i16<1>, t31, t31:1
t35: ch,glue = COPY_TO_CALLEE_A_FROM_GLOBAL TargetGlobalAddress:i32<float* @x3> 0, TargetFrameIndex:i16<2>, t33, t33:1
t37: ch,glue = COPY_TO_CALLEE_A_FROM_GLOBAL TargetGlobalAddress:i32<float* @x4> 0, TargetFrameIndex:i16<3>, t35, t35:1
t39: ch,glue = CALLSEQ TargetGlobalAddress:i16<float (float, float, float, float)* @fdivfaddfmul_a> 0, t37, t37:1
t41: ch,glue = CALLSEQ_END TargetConstant:i16<4>, TargetConstant:i16<0>, t39, t39:1
t42: f32,ch,glue = COPY_TO_CALLER_A TargetFrameIndex:i16<0>, t41, t41:1
t43: ch = RET_FLAG t42:1