Finding the start SDNode of a call sequence


I’m working on a bug raised from ScheduleDAGRRList, where the scheduler failed to find the correct callseq_start node for a given callseq_end. Here is part of the DAG:

  t0: ch = EntryToken
# First function call
  t28: i32,ch,glue = CALLSEQ_START ... t0
    t31: i8,ch = MOV32 ... t28:1
    t34: i8,ch = MOV32 ... t28:1
  t35: ch = TokenFactor t31:1, t34:1
  t37: ch,glue = CALL TargetExternalSymbol:i32'__subsf3' ... t35
  t38: i32,ch,glue = CALLSEQ_END ... t37

# Second function call
        t39: i32,ch,glue = CopyFromReg t38:1, ...
      t40: i8,ch = MOV32 ... t39, t35
    t10: i32,ch,glue = CALLSEQ_START ... t40:1
  t13: ch,glue = CALL TargetGlobalAddress:i32<float ()* @vectoyaw> ... t10:1
  t14: i32,ch,glue = CALLSEQ_END ... t13

# Third function call
  t42: i32,ch,glue = CALLSEQ_START ... t0
    t53: ch = TokenFactor t14:1, t42:1
  t44: i8,ch = MOV32 ... t53
    t47: ch = TokenFactor t44:1, ...
  t48: ch,glue = CALL TargetExternalSymbol:i32'__subsf3' ... t47
  t49: i32,ch,glue = CALLSEQ_END ... t48

The MOV32 instructions are used for passing parameters.

The scheduler tried to find the matching CALLSEQ_START for t49, by chasing the chain all the way up. Upon encountering a TokenFactor it chooses the chain that has the deepest level of CALLSEQ_END / CALLSEQ_START nesting.

Ideally, it would find the correct CALLSEQ_START, t42, but instead, it found t28 with the following trace:

t49 (END of #3 call) -> t48 -> ...
-> t53 -> t14 (END of #2 call) -> t13 -> t10 (START of #2 call)
-> t40 -> t35 -> t31 -> t28 (START of #1 call)

Basically the scheduler mistakenly thought t28, the start of the first call, matched with t49, the end of the third call.

I think the culprit is the traversal from t40 to its predecessors, t39 and t35: t39 was completely ignored because it’s not even a chain, even though (theoretically) it can lead to the desired CALLSEQ_END for the first call (i.e. t38) which can match with t28.

And actually, even we correctly match the CALLSEQ_END/START for the first call, on t35, it expects either of the predecessor chains (t31, t34) to “close” the still-dangling CALLSEQ_END node for the third call (i.e. t49) but obviously none of them can do that because t31 and t34 only lead to the EntryToken. Thus it will throw an assertion error.

There is a comment, left by the original author suggested that we should just have another operand on CALLSEQ_END pointing to its CALLSEQ_START rather than having this complicate chain-chasing logic. While I’m definitely happy to add this feature if the community think it is the right way to go, right now I’m wondering if there is any way, like generating function calls in another form, that is more favorable to the scheduler to mitigate this issue?

1 Like