NewGVN:
Do you have any plans to work towards enabling NewGVN by default?
If I do a trivial search on GitHub issues, I see 68 issues open against NewGVN. Not all of them may be correct and fixable though. However, I’d like to understand if any work is planned to progress this further.
Improving PRE:
Following up on Manuel’s blog:
GCC’s GVN PRE implementation is based on Thomas VanDrunen’s work in the paper “Value-Based Partial Redundancy Elimination” as mentioned here. The paper computes Anticipated and available expressions. When it comes to anticipated expressions, LLVM does have some support in GVNHoist.cpp but a) It is also off by default b) I am not sure if it mimics the paper. Anticipated expressions are important for handling all cases of PRE.
I note that Manuel implemented PRE in NewGVN but it is not upstream yet. May I please know the planned timeline for this?
Also, I’d like to highlight that LLVM’s NewGVN is based on Karthik Gargi’s paper which is slightly older than GCC’s.
Currently, I’m focused on bringing the NewGVNPRE implementation up to par with GVNPRE. There are some promising results, but there are still quite a few cases where NewGVNPRE underperforms.
Regarding the open issues, as I mentioned, my priority right now is improving my prototype. So, triaging issues is on standby. On a positive note, I’m confident that quite a few of them are duplicates or have been resolved in the meantime.
I’m not too familiar with GVNHoist, but for now, the goal is to achieve parity between GVN and NewGVN. Incorporating the transformations performed by GVNHoist might be worth revisiting in the future.
I’m curious—what are your use cases or transformations of interest?
Do you think this can be handled with your prototype?
I evaluated if this can be handled in the current GVN, but unfortunately, it cannot be. PHITransAddr::translateValue() fails to translate PHI to its predecessor in this case. If the incoming values to a PHI node are pointers, then it succeeds, but in this case, it is a non-pointer case.
I believe we should bite the bullet and delete NewGVN.
Various people have sunk a lot of time into it over the years, and we’re still at a point where the pass is nowhere close to production ready, both in terms of correctness and optimization parity. The algorithm behind NewGVN is nice in theory, but not so nice in practice.
Our efforts would be better spent improving the current GVN implementation, e.g. by migrating it to use MemorySSA.
In LLPC we have seen real world cases that benefit from noticing that some phi nodes in loops are equivalent to each other, even though each phi node recursively depends on itself.
NewGVN can do this because it’s based on equivalence classes. OldGVN can’t. If we delete NewGVN, is there any prospect of regaining this kind of optimization?
I believe we should bite the bullet and delete NewGVN.
About 3-4y ago, I’d have proposed this too. Today, @ManuelJBrito 's prototype looks quite promising in terms of addressing the correctness issues and getting closer to performance parity. I tested it on our internal benchmarks and I see a fairly limited set of regressions (combination of benchmark/architecture).
Previous NewGVN attempts revolved around fixing the current pass by addressing the cases in the bugs filed, while Manuel’s approach has been a systematic understanding of how the pass works, where it will misbehave in the theoretical algorithm and how to change the conceptual approach to address that. As a result, his changes will break some of the previous assumptions (i.e. no IR changes allowed during the analysis phase, some changes will be allowed and can be rolled back), but that makes sense from the theoretical correctness perspective and is the first viable option for making this pass production ready.
I cannot answer how this will compare with the option of migrating GVN to MemorySSA. It might still ultimately be the better candidate. But I think at this stage it makes sense to give NewGVN more time and see what it can bring.
When it comes to anticipated expressions, LLVM does have some support in GVNHoist.cpp but a) It is also off by default b) I am not sure if it mimics the paper.
GVNHoist computes Anticipated values but it based on 1. DJ Graph and 2. SSAPRE and some creative data structures to compute the values efficiently. I’m not sure if GCC uses this algorithm but NewGVN likely doesn’t. GVNHoist doesn’t evaluate all fully redundant expressions but it can be easily repurposed/leveraged to do that.
EDIT: Crossed the link to DJ Graph. I used it for another implementation that was never merged upstream.