Hope it transforms into the following IR. However, CSE pass cannot solve this issue.Because % cast does not dominate %cast1. memref.cast is a pure op, perhaps we can reorder it based on SSA dominance. We have had some discussions on this issue before. Will ops without side effects be reordered when running the pass?
Thanks @mehdi_amini for asking me this question.“what makes cast “special” with this property? Why not other ops?”, I think if an Op is a Pure Op, we have the opportunity to hoist its position based on SSA dominance, so perhaps we can write a more generic pass to hoist Pure op.
This looks similar to loop-invariant code motion but for conditionals. Can the two be generalized to something that works across control flow operations?
I looked into this earlier. The LICM pass is a bit simpler because it directly matches loop-like ops. In particular, it does not need DominanceInfo. So implementation-wise I’m not sure… But maybe it makes sense to put both implementations in the same file and to expose both hoistings via a single pass (with pass options).
They’re not doing the same thing: LICM is intended to reduce the repetition of these operations (dynamically) by moving them out of the loops.
What you’re proposing (moving them close to their definition) is possibly making these executed when they wouldn’t have been before (maybe even potentially moving them into a loop when they were outside before)
The example in the RFC was misleading: the implementation just checks if the operand is in a different block than the producer and relocalize the operation with the producer.
I think this idea makes sense. Introduce a new constraint: when a block has multiple successor blocks, and two or more of these successors contain the same pure operation, we hoist it.
If there are two or more edges, we hoist the same pure op to both dominate their positions. What do you think of doing this? @ftynse@mehdi_amini@matthias-springer .
The implementation of CSE uses llvm:: ScopedHashTable.
If CFG is compared to a tree. Leaf nodes can access the SSA values in the path of the root node, but they should not be able to access the SSA values of other leaf nodes.
The reason for this is that it should be a pre order traversal of moduleOp, creating a new llvm::ScopedHashTableScope when moving from one CFG edge to another.
At the leaf node, it will search for an op that is the same as the root node to the leaf node, and then replace it with the existing op.
The reason for doing this can be seen from IR above. %cast cann’t be visited by %cast1’s block. %cast1 cann’t be visited by %cast’s block.
I will conduct research on it, which may take some time (as I do not have experience using LLVM IR), But it’s quite interesting, this question is turning from a small one to an interesting one