SCEV LoopTripCount

Hello,

I was doing some experiments with SCEV and especially the loop trip count.
Sorry for the dumb question, what is the easiest way to dump SCEV analysis results on a .bc file?

On a side note, I wanted to see if we could optimize this function:

unsigned long kernel(long w, long h, long d) {
unsigned long count = 0;
for(int i = 0; i < w; ++i)
for(int j = i; j < h; ++j)
for(int k = j; k < d; ++k)
++count;
return count;
}

into this (it might not take into account overflows, it was generated using the barvinok library):

unsigned long kernel2(long w, long h, long d) {
return
(-1 + w >= 0 && -1 - w + h >= 0 && -1 - h + d >= 0)
? (((((2 * w - 3 * ww + www) + 3 * w * h + -3 * w * hh) + ((3 * w - 3 * ww) + 6 * w * h) * d))/6)
: (-1 + w >= 0 && -1 - w + d >= 0 && h - d >= 0)
? ((((2 * w - 3 * w
w + www) + (6 * w - 3 * ww) * d + 3 * w * dd))/6)
: (-1 + h >= 0 && w - h >= 0 && -1 - h + d >= 0)
? ((((2 * h - 2 * hhh) + (3 * h + 3 * hh) * d))/6)
: (-1 + d >= 0 && h - d >= 0 && w - d >= 0)
? (((2 * d + 3 * d
d + ddd))/6)
: 0;
}

I am not sure how advanced are SCEV-based trip counts.

Best regards.

Hi,

Hello,

I was doing some experiments with SCEV and especially the loop trip count.
Sorry for the dumb question, what is the easiest way to dump SCEV analysis results on a .bc file?

opt -analyze -scalar-evolution < file.bc

On a side note, I wanted to see if we could optimize this function:

unsigned long kernel(long w, long h, long d) {
unsigned long count = 0;
for(int i = 0; i < w; ++i)
for(int j = i; j < h; ++j)
for(int k = j; k < d; ++k)
++count;
return count;
}

into this (it might not take into account overflows, it was generated using the barvinok library):

Looks like we can not, but I haven’t examined why. Overflows might indeed get into the way.

Michael

(Hit reply-all this time :slight_smile: )

I'm doing this on a old clang/llvm build (from June 29), but if I
generate IR by

clang -mllvm -unroll-threshold=0 -O3 -S -emit-llvm -fno-vectorize x.c -o x.ll

(loop unrolling and vectorization both tend to obscure trip count
computation)

then we are able to compute the trip counts in whatever loops that
remain:

Loop %for.cond8.preheader: backedge-taken count is {(-1 + %h),+,-1}<nw><%for.cond2.preheader>
Loop %for.cond8.preheader: max backedge-taken count is -1
Loop %for.cond2.preheader: backedge-taken count is (-1 + %w)
Loop %for.cond2.preheader: max backedge-taken count is -1

It isn't clear to me why indvars does not fold way all the loops out
of existence (it is "clearly profitable" in this case). I'll try to
take a closer look tonight.

-- Sanjoy

Thank you for the fast answer.

Sorry for the email crossing…