getSmallConstantTripCount function doesn't work for obvious cases

Hi,

My pass heavily relies on llvm::Loop’s getSmallConstantTripCount method. However, I found that it doesn’t work for some simple cases.

In the following example, I can get the constant trip count of the outermost loop if statement “a[l] = a[l] + 1” is there. After commenting out this line, the returned constant trip count for that loop is 0. In my pass, I traverse the nested loops starting from statement "a[i] = b[i] * b[i+1]; ", and use a for loop to traverse each parent loop.

void f(int *a, int b[], double *c, int n)
{
int i,j,k,l;
if(n>20)
a[j] = 4;
else {
for(l=0; l<200; ++l) {
a[l] = a[l] + 1; //if remove this line, the above loop’s trip count can not be obtained.
if(b[1]>30) {
c[k] = c[k] * c[k+1];
}
else {
for(k=0; k<400;++k) {
for(i=0; i<400; ++i) {
a[i] = b[i] * b[i+1]; // nested loop traversal starts here.
}
}
}
}
}
}

As long as statement “a[l] = a[l] + 1;” is in the outermost loop body, the trip count can be obtained. Is there a way to get the constant trip count of a loop in similar cases, even if the induction variable is not used in the loop body?

Thanks,
Bo

It's usually better to give IR testcases, but I'll assume you ran the
equivalent of clang -O2 over it on some 64-bit platform.

It looks like a missed optimization is causing the tail of the loop to
end up in a strange form:

for.inc29: ; preds =
%for.inc26, %if.then4
  %exitcond38 = icmp eq i32 %l.036, 200
  br i1 %exitcond38, label %if.end32, label %for.inc29.for.body_crit_edge

for.inc29.for.body_crit_edge: ; preds = %for.inc29
  %phitmp = add i32 %l.036, 1
  br label %for.body

getSmallConstantTripCount doesn't know how to deal with this, so it
fails. Depending on what you're doing, I'd suggest using SCEV, which
is a bit more powerful and can partially analyze this loop.

-Eli

Using llvm trunk, SCEV is able to analyze the trip count of all three loops:

/extra/llvm/build/Release+Asserts/bin/clang -O2 t.c -mllvm -debug-only=loop-unroll -c
Loop Unroll: F[f] Loop %for.body22
  Loop Size = 10
  Too large to fully unroll with count: 400 because size: 4000>150
  will not try to unroll partially because -unroll-allow-partial not given
Loop Unroll: F[f] Loop %for.cond20.preheader
  Loop Size = 14
  Too large to fully unroll with count: 400 because size: 5600>150
  will not try to unroll partially because -unroll-allow-partial not given
Loop Unroll: F[f] Loop %for.body
  Loop Size = 36
  Too large to fully unroll with count: 200 because size: 7200>150
  will not try to unroll partially because -unroll-allow-partial not given

...maybe you're using an older llvmm before we move to SCEV-based trip counts.

-Andy