[RFC] What are the blockers to turning on GVNSink by default?


The backstory of this topic is not related to GVN at all, so let me start from the beginning. This rather silly example is vectorized by GCC-AArch64, but not by LLVM, and this was bothering us:

void BeamFormWeights(int a[20000], int beam) {
        for (int i = 0; i < 10000; ++i) {
                if (i == beam)
                        a[2 * i] = 0;
                else
                        a[2 * i] = 1;
        }
}

While investigating why this example didn’t vectorize, we found that GVN was being too eager in rewriting 2*i to 2*beam within the conditional. After GVN, the SCEVs of the two array operations diverge wildly, and LoopAccessAnalysis fails to analyze them. I tried patching LAA initially, but my patch turned out to be a mess. We then analyzed what GCC does with this example: it sinks the store out of the conditional, and we thought this was perhaps the right solution to the problem. We investigated a small, lesser-known pass called MergedLoadStoreMotion, which is supposed to do this sinking, but found it far too simplistic. Then we stumbled across GVNSink, which does what it promises, and the example is successfully vectorized with -enable-gvn-sink.
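For what it’s worth, the transformed shape can be sketched in source terms (the `_sunk` name is mine, and the C below is a hand-written illustration of the form after sinking, not actual compiler output):

```c
/* Illustrative sketch: after sinking, the two conditional stores merge
 * into one unconditional store whose value is selected by the condition,
 * so both sides share the same a[2 * i] address expression and
 * LoopAccessAnalysis has only one access pattern to reason about. */
void BeamFormWeights_sunk(int a[20000], int beam) {
        for (int i = 0; i < 10000; ++i)
                a[2 * i] = (i == beam) ? 0 : 1;
}
```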

Now, I have been investigating GVNSink for a few hours, and it seems to be a profitable pass in all cases. I didn’t find any discussion online about why it wasn’t turned on by default, so I thought I’d ask here: the LLVM unit tests have a bunch of mysterious failures related to coroutines when it’s turned on, and there are a handful of very old bugs reported against it on GitHub. There have been very few non-NFC commits to GVNSink ever since it was introduced six years ago. Given these problems, a list of concrete action items would probably be too much to hope for, but some comments from people who have played with it would be helpful.


It’s probably worth a bug scrub of those issues you’ve identified. Have you tried to reproduce them? If they still exist and are reproducible, perhaps they must be fixed first. Otherwise, if they’re not reproducible, perhaps it’s time to close those reports as such. It would then make sense to enable the pass at that point.

Have you done the git archeology to see who added -enable-gvn-sink? Have you contacted them?

Perhaps posting a link to this discussion on ⚙ D24805 [GVNSink] Initial GVNSink prototype and asking for feedback might get the right people to point you in the right direction?

Hi, I haven’t really played with it, but I recently noticed that GVNSink was beneficial for code size at the -Oz optimization level on both a full AOSP build and the SPEC benchmarks. The same goes for GVNHoist (-0.1/-0.2%).


I had a look at the linked issues, but none of them seem relevant to gvn-sink directly (I went ahead and closed some).

But I did some runtime testing, and it looks like there are some miscompiles; I am working on getting reproducers now.

I’m not sure whether GVNSink was updated after opaque pointers were enabled. GVNHoist needed a small fix for that, for example.

For the record, the motivating case has been addressed by ⚙ D156532 [Pipelines] Perform hoisting prior to GVN without GVNSink involvement.

@nikic’s patch was reverted, since it was causing regressions on SPEC (albeit for a rather silly reason). I think we have one clear motivating example for GVNSink, but not much more. I did a quick benchmark run, and turning it on results in some serious regressions:

Program                                       master  on      diff
test-suite...xternal/Embench/statemate.test     78614   92374 17.5%
test-suite... :: External/Embench/slre.test   1723908 1786404  3.6%
test-suite...xternal/CoreMark/coremark.test   1028821 1037385  0.8%
test-suite...ternal/Embench/aha-mont64.test   2478307 2478307  0.0%
test-suite...External/Embench/wikisort.test    878540  878540  0.0%
test-suite :: External/Embench/ud.test        1163530 1163530  0.0%
test-suite... External/Embench/tarfind.test    984403  984403  0.0%
test-suite :: External/Embench/st.test        3329531 3329531  0.0%
test-suite...al/Embench/sglib-combined.test   2359553 2359553  0.0%
test-suite... External/Embench/qrduino.test   2132286 2132286  0.0%
test-suite...ternal/Embench/primecount.test   3059021 3059021  0.0%
test-suite...External/Embench/nsichneu.test   2303401 2303401  0.0%
test-suite...nal/Embench/nettle-sha256.test   1303960 1303960  0.0%
test-suite...ternal/Embench/nettle-aes.test   1729373 1729373  0.0%
test-suite...:: External/Embench/nbody.test   3043072 3043072  0.0%
test-suite...: External/Embench/minver.test    398686  398686  0.0%
test-suite...: External/Embench/md5sum.test    720967  720967  0.0%
test-suite...ernal/Embench/matmult-int.test    990512  990512  0.0%
test-suite...xternal/Embench/huffbench.test   1566315 1566315  0.0%
test-suite :: External/Embench/edn.test       1161388 1161388  0.0%
test-suite...:: External/Embench/cubic.test        44      44  0.0%
test-suite...:: External/Embench/crc32.test   2788051 2788051  0.0%
test-suite...marks/Dhrystone/dhrystone.test     98109   98109  0.0%
test-suite...External/Embench/picojpeg.test   1820298 1813086 -0.4%
Geomean difference                                             0.8%

I haven’t looked at the regression in detail yet, and am focusing on improving test coverage (see, e.g., https://reviews.llvm.org/D156451 and https://reviews.llvm.org/D156458) at the moment.

We’ve been using GVNSink and GVNHoist with -Oz because they help with binary size, and we haven’t encountered any issues so far. I’d be supportive of enabling those passes by default in the -Oz pipeline, where the potential performance regressions aren’t really a concern.
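For anyone who wants to try this locally, the passes can be toggled from the clang driver via -mllvm flags, or exercised in isolation with opt. The -enable-gvn-sink flag is the one mentioned above; -enable-gvnhoist and the new-pass-manager names gvn-sink/gvn-hoist are what I believe current LLVM uses, but it’s worth double-checking against your LLVM version:

    # Enable the passes in a normal -Oz compile via -mllvm:
    clang -Oz -mllvm -enable-gvn-sink -mllvm -enable-gvnhoist foo.c -o foo

    # Run each pass in isolation on IR with opt (new pass manager):
    opt -passes=gvn-sink  input.ll -S -o sunk.ll
    opt -passes=gvn-hoist input.ll -S -o hoisted.ll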
