RFC: first steps toward CFG-level IR manipulation

Hello everyone,

I’ve been having discussions with Jakub Kuderski about the merits of having routines to think more about the shape of the CFG instead of dealing with the details of the IR. I have one such patch posted and would like to know what the larger community thinks of the idea. The patch is here:

https://reviews.llvm.org/D37575

This patch adds a routine deleteEdge() to the BasicBlock class with the intent of always removing the edge between two blocks (From, To). Under the covers the routine changes the terminator instruction in From. It does not touch PHIs or uses. I chose this approach as a first-step towards a larger way of thinking less about the IR and more about CFG-level manipulations.

The intent of such a patch is to give users, mostly in the middle end, the ability to think about CFG-level changes instead of focusing on the details of producing correct IR. The goal is to also reduce code complexity in passes to reduce bugs and simplify coverage testing.

This is a non-trivial undertaking and is probably best approached in stages. I think LLVM would benefit in the long-run from such a move and I’d appreciate feedback and suggestions from those in the community with more experience working on LLVM.

Thank you,
-Brian

So, as mentioned, i think this is a useful route to go, but i think it needs to be thought out to make it not worse time complexity than the current edge removal, which I believe is is O(N^2) if there are phi nodes and 2 predecessors, O(N) if there are phi nodes and >2 predecessors, and ~O(1) if there is a single predecessor[1]

Most passes in llvm really don’t understand how to remove an edge, and those that do disagree on what the sequence of calls and such should be to make it happen.
You can see a lot of cargo culting in the codebase.
It’s probably well past time to encapsulate that.

Right now your code has worse time complexity than the above, because it has to find the edge (or edges) too!

as mentioned on the review, there are a couple directions you could go to fix that.

  1. Make your API work on uses. I doubt it would end up useful in this case, most things are passing BB’s, not the edge use for the BB.
  2. Cache the edge lists in the BB structure (this would speed up iterators too). This would require serious testing and verification, but once it was right, should be okay. This would have some memory cost.
  3. Gradually introduce real edge lists.
  4. Other things i’m missing.

Unfortunately, i don’t think we can pay the extra O(N) you are adding for a general API, which means we need better datastructures somewhere.

[1] Just for removePredecessor, if there are phi nodes and 2 preds, it tries to delete useless phis, which make walk N uses per phi to do replacement. If there are phi nodes and > 2 preds, it must touch all phis to remove the edge from them. It’s unclear why it makes sense to delete useless phis in this code at all :slight_smile: