Multiple simplifycfg passes make some loops significantly slower

Siu_Kwan_Lam · December 26, 2016, 7:54pm

Hi all,

I am noticing that loops with just one backedge run significantly slower than loops with two backedges. Unifying the two backedges into one also causes the slowdown.

To replicate this problem, I used the C code in https://gist.github.com/sklam/11f11a410258ca191e6f263262a4ea65 and checked it against clang-3.8 and a clang-4.0 nightly. Depending on where I put the “increment” code of a for-loop, I get a 2x difference in performance.

The slow (but natural) version:

    for (i=0; i<size; ++i) {
        ai = arr[i];

        if ( ai <= amin ) {
            amin = ai;
            all_missing = 0;
        }
    }

The fast version:

    for (i=0; i<size;) {
        ai = arr[i];
        ++i; // increment moved here
        if ( ai <= amin ) {
            amin = ai;
            all_missing = 0;
        }
    }

With the fast version, adding a dummy line after the if-block will make the code slow again:

    for (i=0; i<size;) {
        ai = arr[i];
        ++i;
        if ( ai <= amin ) {
            amin = ai;
            all_missing = 0;
        }
        i; // no effect
    }

At first, I noticed the problem at any optimization level of O1 or higher. In an attempt to narrow it down, I found that running opt -simplifycfg -sroa -simplifycfg triggers the slowdown. Removing the second simplifycfg avoids it, and both versions of the code then run fast.
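For anyone who wants to try this without opening the gist, the two loop shapes above can be wrapped into standalone functions roughly as follows. This is only a sketch, not the code from the gist: the function names, the double element type, the size_t index, and the INFINITY starting value for amin are all assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Natural form: the increment sits in the for-header, so every path through
 * the body reaches the same increment before the loop repeats.
 * Names and types here are hypothetical, chosen for this sketch. */
double min_natural(const double *arr, size_t size, int *all_missing)
{
    double amin = INFINITY; /* assumed starting value */
    double ai;
    size_t i;

    assert(arr != NULL || size == 0);
    assert(all_missing != NULL);

    for (i = 0; i < size; ++i) {
        ai = arr[i];
        if (ai <= amin) {
            amin = ai;
            *all_missing = 0;
        }
    }
    return amin;
}

/* Variant with the increment moved ahead of the if-block, so nothing
 * follows the if-block inside the loop body. */
double min_moved_increment(const double *arr, size_t size, int *all_missing)
{
    double amin = INFINITY; /* assumed starting value */
    double ai;
    size_t i;

    assert(arr != NULL || size == 0);
    assert(all_missing != NULL);

    for (i = 0; i < size;) {
        ai = arr[i];
        ++i; /* increment moved here */
        if (ai <= amin) {
            amin = ai;
            *all_missing = 0;
        }
    }
    return amin;
}
```

Both functions compute the same result, so any timing difference between them over a large array, at the same optimization level, would come from how the loop is compiled rather than from the work done.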

Is there a known issue for this? Or, any idea why?

Regards,
Siu Kwan Lam

Finkel_Hal_J · January 8, 2017, 10:45pm

Can you please file a bug report for this, attaching the IR? I suspect we’ll need to look at the generated code. -Hal
