TL;DR: Legalization introduces 2 glued nodes, combine2 creates a pseudo-cycle because the glue edge is only unidirectional, the scheduler merges the glued nodes ==> cycle between SUnits ==> Assertion triggers.
Background: I’ve been experimenting with Rust & AVR and ran into the following issue. The original source was Rust; the IR of the reduced test case was generated by LLVM@341010 with some Rust-specific patches and an unrelated local patch. The reduced test case reproduces the problem on LLVM HEAD.
To reproduce: Run
llc -O1 reduced.ll
What happens: The following assertion triggers:
(Node2Index[SU.NodeNum] > Node2Index[PD.getSUnit()->NodeNum] && "Wrong topological sorting")
Consider the graphs before and after combine2:
t29 and the node it is glued to were introduced during legalization. During combine2, DAGCombiner::CombineToPostIndexedLoadStore considers t35 together with a second node and decides that they can be combined into t40. This creates a “pseudo-cycle”: if the glued nodes around t29 were merged (or the glue edge reversed), there would be an actual cycle.
A comment in CombineToPostIndexedLoadStore states that “Op must be independent of N, i.e. Op is neither a predecessor nor a successor of N. Otherwise, if Op is folded that would create a cycle”, which is, of course, correct. To verify that the two nodes (Op and N) are independent, LLVM checks whether either is a predecessor of the other. This check returns false in the case at hand, which is correct if one only follows the edges actually present in the graph. However, the way the scheduler later handles the two glued nodes means that the glue edge between them would need to be treated as bidirectional to get the desired result. (Note that if you flip the glue edge, the dependencies form an actual cycle.)
The scheduler later merges the two glued nodes, leading to a cycle. That cycle then causes the assertion failure when the topological ordering is verified.
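The sequence of events can be replayed on a toy graph. This is only a sketch, not LLVM code: the Graph type, the helpers (reaches, contract, hasCycle) and the four nodes G1, G2, Op and N are hypothetical stand-ins, and the graph is not the actual DAG from the reduced test case. G2 consumes the glue of G1, and Op and N are the two nodes the combiner wants to fold.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy dependency graph: G[u] lists the nodes that u depends on (its operands).
using Graph = std::vector<std::vector<int>>;

// Hypothetical node numbering for the example.
constexpr int G1 = 0, G2 = 1, Op = 2, N = 3;

// Returns true if From can reach To by following operand edges only.
bool reaches(const Graph &G, int From, int To) {
  std::vector<bool> Seen(G.size(), false);
  std::vector<int> Work{From};
  while (!Work.empty()) {
    int U = Work.back();
    Work.pop_back();
    if (U == To) return true;
    if (Seen[U]) continue;
    Seen[U] = true;
    for (int V : G[U]) Work.push_back(V);
  }
  return false;
}

// Contracts node B into node A: A inherits B's operands, every user of B
// now uses A, and self-edges are dropped. B remains as an isolated node.
Graph contract(Graph G, int A, int B) {
  assert(A != B && "cannot contract a node into itself");
  for (int V : G[B]) G[A].push_back(V);
  G[B].clear();
  for (std::size_t U = 0; U < G.size(); ++U) {
    std::vector<int> Kept;
    for (int V : G[U]) {
      int T = (V == B) ? A : V;
      if (T != static_cast<int>(U)) Kept.push_back(T);
    }
    G[U] = Kept;
  }
  return G;
}

// A cycle exists iff some edge U -> V has V reaching back to U.
bool hasCycle(const Graph &G) {
  for (std::size_t U = 0; U < G.size(); ++U)
    for (int V : G[U])
      if (reaches(G, V, static_cast<int>(U))) return true;
  return false;
}

// G2 is glued to G1 and also uses Op; N uses G1.
Graph toyGraph() {
  Graph G(4);
  G[G2] = {G1, Op};
  G[N] = {G1};
  return G;
}
```

On this graph, reaches(toyGraph(), Op, N) and reaches(toyGraph(), N, Op) are both false, so the independence check passes. contract(toyGraph(), Op, N) models the combine and is still acyclic. Contracting G2 into G1 afterwards, which models the scheduler merging the glued pair, yields a graph for which hasCycle returns true.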
Even though I only found this bug on an experimental target, I feel like the core issue is target-independent.
I would appreciate your thoughts on the matter, especially regarding a potential solution.
I can see three places to attempt a fix:
(1) Change the TypeLegalizer to no longer generate the glue.
(2) Change the Combiner to better handle glue links.
(3) Change the scheduler to not merge the glued nodes if doing so would create a cycle.
(2) looks most promising to me, so I will experiment with that solution for now, by treating glue links as bidirectional (whether to do this for all predecessor checks or only some, how to implement that most efficiently, and how to handle the existing topological ordering optimization are questions for another time).
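As a rough illustration of option (2), the sketch below runs the independence check twice on a toy graph: once following operand edges only, and once also walking each glue edge in the reverse direction. Everything here is an assumption for illustration: Operand, GlueGraph, dependsOn and independent are hypothetical names and do not correspond to LLVM's SDNode API, and the four-node graph is a stand-in for the real DAG. The glue-aware variant is deliberately conservative and may also reject combines that would have been safe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types, not LLVM's SDNode/SDValue.
struct Operand {
  int Node;    // index of the node being used
  bool IsGlue; // true if this use is a glue edge
};
using GlueGraph = std::vector<std::vector<Operand>>;

// Hypothetical node numbering: G2 consumes the glue of G1.
constexpr int G1 = 0, G2 = 1, Op = 2, N = 3;

// Returns true if From reaches To along operand edges. With GlueBothWays
// set, each glue edge may also be walked from the producer to its glued user.
bool dependsOn(const GlueGraph &G, int From, int To, bool GlueBothWays) {
  assert(From >= 0 && static_cast<std::size_t>(From) < G.size());
  // Collect the reverse glue edges up front so the walk below stays simple.
  std::vector<std::vector<int>> GluedUsers(G.size());
  if (GlueBothWays)
    for (std::size_t U = 0; U < G.size(); ++U)
      for (const Operand &O : G[U])
        if (O.IsGlue) GluedUsers[O.Node].push_back(static_cast<int>(U));

  std::vector<bool> Seen(G.size(), false);
  std::vector<int> Work{From};
  while (!Work.empty()) {
    int U = Work.back();
    Work.pop_back();
    if (U == To) return true;
    if (Seen[U]) continue;
    Seen[U] = true;
    for (const Operand &O : G[U]) Work.push_back(O.Node);
    for (int V : GluedUsers[U]) Work.push_back(V);
  }
  return false;
}

// Two nodes are independent if neither depends on the other.
bool independent(const GlueGraph &G, int A, int B, bool GlueBothWays) {
  return !dependsOn(G, A, B, GlueBothWays) &&
         !dependsOn(G, B, A, GlueBothWays);
}

// G2 is glued to G1 and also uses Op; N uses G1.
GlueGraph toyGlueGraph() {
  GlueGraph G(4);
  G[G2].push_back({G1, true});
  G[G2].push_back({Op, false});
  G[N].push_back({G1, false});
  return G;
}
```

With GlueBothWays set to false, independent(toyGlueGraph(), Op, N, false) returns true and the combine would go ahead. With it set to true, the walk N, G1, G2, Op crosses the glue edge backwards, independent returns false, and the combine would be rejected before the scheduler ever sees the glued pair.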