The DAG combiner's worklist is... really, really strange.

The current documentation for it guarantees two things that are in direct tension:

  1. A node added to the worklist will be added at the top of the stack and processed next.
  2. A node will only appear once in the worklist.

The current implementation resolves this by always adding nodes to the worklist, even when they already appear in it, but only processing a node the first time it is popped off the stack. This has a serious disadvantage: the redundant entries can consume an essentially unbounded amount of memory in the ordering vector.

I’m curious: what (if any) motivation was there for this guarantee? It would be much more memory efficient to use something closer to the instcombine worklist. That would also make a plan I have to significantly simplify adding nodes to the worklist more tenable, since a likely side effect of that plan is that nodes already on the worklist would be re-added far more often, ballooning the memory inefficiency of the current scheme.

I’m starting to work on a patch to this effect, but wanted to see if there are subtleties to this contract that I should be prepared for…