There are a large number of places where we rely on pointer identity for SDNodes through code paths which at times use ReplaceAllUsesWith. When coupled with the CSEMap, which re-canonicalizes nodes at every step, this forms really bad ABA problems: we will often end up deleting a node because we build a new-but-identical node during RAUW. When this happens, any pointer still held to the deleted node is dangling, and if its memory is later reused for a different node, pointer comparisons and pointer-keyed lookups silently match the wrong node.
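To make the failure mode concrete, here is a minimal sketch of it outside LLVM. Everything in it is hypothetical: `ToyNode` and `ToyPool` are stand-ins I made up, not `SDNode` or the SelectionDAG allocator, and the pool deliberately recycles freed slots so that the address reuse is deterministic. It only shows how a side table keyed on a node pointer can return data recorded for a node that no longer exists.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a DAG node; this is not the LLVM SDNode class.
struct ToyNode {
  int Opcode = 0;
  bool Live = false;
};

// Hypothetical fixed-size pool that hands freed slots back out first, so
// address reuse is deterministic for the purpose of the demonstration.
class ToyPool {
  std::vector<ToyNode> Slots;      // never resized, so slot addresses are stable
  std::vector<ToyNode *> FreeList; // slots released by destroy()
  std::size_t NextFresh = 0;

public:
  explicit ToyPool(std::size_t N) : Slots(N) {}

  ToyNode *create(int Opcode) {
    ToyNode *N;
    if (!FreeList.empty()) {
      N = FreeList.back(); // reuse the most recently freed slot
      FreeList.pop_back();
    } else {
      assert(NextFresh < Slots.size() && "toy pool exhausted");
      N = &Slots[NextFresh++];
    }
    N->Opcode = Opcode;
    N->Live = true;
    return N;
  }

  void destroy(ToyNode *N) {
    N->Live = false;
    FreeList.push_back(N);
  }
};

// Returns true if a side table keyed on the node address hands back data
// that was recorded for a different, already deleted node.
bool staleLookupHits() {
  ToyPool Pool(4);
  std::unordered_map<ToyNode *, std::string> SideTable;

  ToyNode *A = Pool.create(/*Opcode=*/1);
  SideTable[A] = "facts about opcode 1";

  // The node goes away (think: replaced during RAUW) and nobody tells
  // SideTable about it.
  Pool.destroy(A);

  // A new, unrelated node lands at the same address.
  ToyNode *B = Pool.create(/*Opcode=*/2);

  auto It = SideTable.find(B);
  return B == A && It != SideTable.end() &&
         It->second == "facts about opcode 1";
}
```

Calling `staleLookupHits()` returns `true` here: the lookup for the opcode-2 node succeeds and yields the entry written for the opcode-1 node. In the real DAG the reuse depends on the allocator rather than being guaranteed, which is part of what makes these bugs so unpredictable.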
The RAUW code itself and the DAG combiner handle this by using a DAG Update Listener which snoops on all node deletions and tries to update every data structure that might be holding an iterator or pointer key that has become invalidated. But there are a lot more data structures keying on ‘SDNode *’ than there are DAG update listeners, so I suspect we have latent bugs here already.
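The listener approach, reduced to a toy, looks roughly like the sketch below. It is an illustration only and assumes nothing about LLVM's actual listener interface: `ToyGraph`, `addDeletionListener`, and `entriesLeftAfterDelete` are hypothetical names. The point is that each pointer-keyed structure stays consistent only if someone remembered to register a callback for it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DAG node; this is not the LLVM SDNode class.
struct ToyNode {
  int Opcode;
};

// Hypothetical graph owner that notifies every registered listener just
// before it frees a node.
class ToyGraph {
  std::vector<std::function<void(ToyNode *)>> DeletionListeners;

public:
  void addDeletionListener(std::function<void(ToyNode *)> L) {
    DeletionListeners.push_back(std::move(L));
  }

  ToyNode *create(int Opcode) { return new ToyNode{Opcode}; }

  void destroy(ToyNode *N) {
    for (auto &L : DeletionListeners)
      L(N); // listeners run while N is still valid
    delete N;
  }
};

// Returns how many side-table entries remain after one of two nodes dies.
std::size_t entriesLeftAfterDelete() {
  ToyGraph G;
  std::unordered_map<ToyNode *, std::string> SideTable;

  // The side table stays in sync only because this listener was registered.
  G.addDeletionListener([&SideTable](ToyNode *Dead) { SideTable.erase(Dead); });

  ToyNode *A = G.create(/*Opcode=*/1);
  ToyNode *B = G.create(/*Opcode=*/2);
  SideTable[A] = "facts about A";
  SideTable[B] = "facts about B";

  G.destroy(A); // the listener removes A's entry before the memory is freed
  std::size_t Left = SideTable.size();

  G.destroy(B); // clean up the remaining node
  return Left;
}
```

With the listener registered, `entriesLeftAfterDelete()` returns 1: only the entry for the surviving node is left. Drop the `addDeletionListener` call and the table keeps a dangling key, which is the situation I suspect many of our existing maps are in.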
Does anyone have good ideas about how to solve this? I thought I did until I realized how fundamental the conflict is. I’ve now given up, but I wanted to raise the issue in case others are hitting it or have good ideas about how to get to a more predictable world.