Optimizing diamond pattern in DAGCombine

I’m trying to optimize a pattern that goes roughly as:

A
/ \

B C
\ /

D

Problem is, when A gets modified, B and C get added back to the worklist, but D doesn’t. Readding D to the worklist just create an infinite loop where one process D again and again.

Is there a proper way to make this work ?

This is a little hard to diagnose in the abstract, but it sounds like you’re having a loop along the lines of visit A → visit B/C → visit D → visit A and that at each step you’re making a real reasonable change to the DAG and returning to the original DAG.

One possible solution is to to rework the optimization to occur when visiting A not D. so the last edge in the visit loop no longer happens.

If that’s not feasible, I don’t think there’s much you can do that is not effectively preventing one of those transitions from occurring when it would cause the infinite visit loop.

HTH,

-Nirav

The root problem is that, when A gets modified, D doesn’t get added back to the worklist. I could match the pattern on A, but the problem remains: when D gets modified, A do not get added back tot he worklist.

I also considered ding several round of DAGCombine, but it is very easy to run into infinite loops, even with a fair amount of sanity checks.

You can always explicitly add D to the worklist when you make the transformation with AddToWorklist. Presuambly this was the cause for your infinite loop.

-Nirav

Explicitly re-adding a node to be processed doesn’t work, because the processing order is canonical.

I suspect the most direct resolution here is to see the problem directly. Can you post a patch?

-Nirav

You can see an instance of that problem in https://reviews.llvm.org/D32756

Depending on which order the combiner pass on the nodes, the optimization kicks in or not. I have other diamond pattern that I want to add and they all suffer in the same way.

iI still don’t fully understand the issue, but it seems like you’re doing multiple merges in a pass and the order of visitation is key.

Ideally, there’s possible a way to write the rewrites such that they are confluent but it sounds like it’s not the case. At least it seems like the output is mostly linear and you could be okay by forcing the order consideration carefully (add nodes to worklist in reverse-usage order and explicitly remove the nodes so the WorklistMap doesn’t reorder things when it run into a CSEd node)

I’d need to see the interaction with multiple patterns to say more.

HTH,

-Nirav