question about llvm partial unrolling/runtime unrolling

From: "Frances Tzeng via llvm-dev" <llvm-dev@lists.llvm.org>
To: llvm-dev@lists.llvm.org
Sent: Monday, October 12, 2015 6:13:25 PM
Subject: [llvm-dev] question about llvm partial unrolling/runtime
unrolling

Hi,

I am trying to unroll loops that don't have a constant trip count.
Any help would be highly appreciated.

What I want to do is to turn

loop (n)
{
<loop body>
}

into

loop (n/4)
{
<loop body>
<loop body>
<loop body>
<loop body>
}

loop (n%4)
{
<loop body>
}
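
For illustration, the transformation described above can be written by hand as follows. This is only a sketch: the function names and the `a[i] += 1.0f` body are hypothetical stand-ins for the real loop body, and LLVM's runtime unroller may place the remainder iterations differently (for example, before the main loop rather than after it).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original form: the body runs n times, where n is only known at run time.
void add_one(std::vector<float> &a) {
  for (std::size_t i = 0; i < a.size(); ++i)
    a[i] += 1.0f;
}

// Hand-unrolled by 4: the main loop runs n/4 times with four copies of the
// body, and the remainder loop runs the leftover n%4 iterations.
void add_one_unrolled4(std::vector<float> &a) {
  const std::size_t n = a.size();
  const std::size_t main_end = n - (n % 4);  // largest multiple of 4 <= n
  std::size_t i = 0;
  for (; i < main_end; i += 4) {
    a[i] += 1.0f;
    a[i + 1] += 1.0f;
    a[i + 2] += 1.0f;
    a[i + 3] += 1.0f;
  }
  for (; i < n; ++i)  // remainder loop: at most 3 iterations
    a[i] += 1.0f;
  assert(i == n && "every element is visited exactly once");
}
```

Both functions leave the array in the same state for any length, including lengths that are not a multiple of 4.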

I set both AllowPartial and Runtime to 1
( llvm::createLoopUnrollPass(Threshold, count, 1, 1) ).
I also overrode the UnrollingPreferences structure to give values to
all the Partial* members, but the loop still doesn't get unrolled.

The unrolling process hits this code in LoopUnrollRuntime.cpp:

  // Only unroll loops with a computable trip count and the trip count needs
  // to be an int value (allowing a pointer type is a TODO item)
  const SCEV *BECountSC = SE->getBackedgeTakenCount(L);
  if (isa<SCEVCouldNotCompute>(BECountSC) ||
      !BECountSC->getType()->isIntegerTy())
    return false;

BECountSC is 0xcccccccc, and the function returns false here.

Based on the comments, it looks like I still need a constant trip
count. Is there a way to unroll a loop with a non-constant trip count,
as in the example above?

Computable is not the same as constant. With runtime loop unrolling enabled, you can certainly unroll a loop with a runtime trip count. If you run with -debug-only=loop-unroll, what does it say regarding your loop?

-Hal

From: "Frances Tzeng" <francestzeng@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: llvm-dev@lists.llvm.org
Sent: Friday, October 16, 2015 11:19:25 AM
Subject: Re: [llvm-dev] question about llvm partial unrolling/runtime unrolling

Hi Hal,

I did

opt.exe -S -debug -loop-unroll -unroll-runtime=true -unroll-count=4 csShader.ll

and it prints out:

Args: opt.exe -S -debug -loop-unroll -unroll-runtime=true -unroll-count=4 csShader.ll

Loop Unroll: F[build_cs_5_0] Loop %loop_entry
Loop Size = 82
partially unrolling with count: 1

Why does it compute a count of 1? If you use a debugger (and/or insert some additional print statements) around here in lib/Transforms/Scalar/LoopUnrollPass.cpp you should be able to figure it out:

  } else if (Unrolling == Runtime) {
    ...
    // Reduce unroll count to be the largest power-of-two factor of
    // the original count which satisfies the threshold limit.
    while (Count != 0 && UnrolledSize > PartialThreshold) {
      Count >>= 1;
      UnrolledSize = (LoopSize-2) * Count + 2;
    }
    if (Count > UP.MaxCount)
      Count = UP.MaxCount;
    DEBUG(dbgs() << " partially unrolling with count: " << Count << "\n");
  }
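
To see how this arithmetic can end at a count of 1, here is a standalone sketch of the same reduction outside LLVM. The function name and parameters are hypothetical, the initial size estimate is assumed to use the same formula as the loop body (that part is elided above), and the threshold of 150 used in the usage note is an assumption about the default rather than something confirmed in this thread.

```cpp
#include <cassert>

// Mirrors the count reduction in the snippet above: halve the requested
// unroll count until the estimated unrolled size fits under the threshold.
// Assumes loop_size >= 2, since the formula subtracts 2 from it.
unsigned reduce_runtime_unroll_count(unsigned loop_size, unsigned count,
                                     unsigned partial_threshold,
                                     unsigned max_count) {
  assert(loop_size >= 2 && "size formula subtracts 2 from the loop size");
  // Assumption: the starting estimate uses the same formula as the loop.
  unsigned unrolled_size = (loop_size - 2) * count + 2;
  while (count != 0 && unrolled_size > partial_threshold) {
    count >>= 1;
    unrolled_size = (loop_size - 2) * count + 2;
  }
  if (count > max_count)
    count = max_count;
  return count;
}
```

With a loop size of 82 and a requested count of 4, the estimate is (82-2)*4+2 = 322. If the partial threshold were 150, the count would drop to 2 (size 162) and then to 1 (size 82), which would match the debug output. Under that assumption, a larger threshold (for example via -unroll-threshold) would let the count stay at 4.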

-Hal

Hi Hal,

Thanks for the response. I think I found the reason (it is not the debug message above).
My loop counter runs from -n to n, and it fails the “isa<SCEVCouldNotCompute>(TripCountSC)” check and exits.

Thanks,
Frances

Hi Frances,

Have you tried running your IR through the standard -O3 optimization pipeline? We can certainly handle such loops in general:

$ cat /tmp/l.c
void foo(float *a, int n) {
  for (int i = -n; i <= n; ++i)
    a[i] += 1;
}

$ clang -O3 -S -emit-llvm -o - /tmp/l.c -fno-vectorize -fno-unroll-loops > /tmp/l.ll

$ opt -analyze -scalar-evolution < /tmp/l.ll | grep 'backedge-taken count'
Loop %for.body: backedge-taken count is ((-1 * (sext i32 (-1 * %n) to i64))<nsw> + ((sext i32 (-1 * %n) to i64) smax (sext i32 %n to i64)))
Loop %for.body: max backedge-taken count is 4294967295

And, as you can see, we've computed an expression for the trip count. Can you figure out how your case differs from this?
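
As a rough check of what that expression means, the sketch below evaluates it in plain C++ and compares it with a direct count. The function names are hypothetical; `sext` is modelled as widening to 64 bits and `smax` as a signed `std::max`. It assumes n is not INT32_MIN, because the IR negates n in 32 bits and that case would wrap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Evaluates (-1 * sext(-n)) + (sext(-n) smax sext(n)) from the SCEV output.
std::int64_t backedge_taken_count(std::int32_t n) {
  assert(n != INT32_MIN && "32-bit negation would wrap");
  const std::int64_t start = static_cast<std::int64_t>(-n);  // sext(-1 * n)
  const std::int64_t end = static_cast<std::int64_t>(n);     // sext(n)
  return -start + std::max(start, end);
}

// Counts the backedges directly for: for (i = -n; i <= n; ++i) { ... }
// The backedge is taken once fewer than the number of body executions.
std::int64_t count_backedges(std::int32_t n) {
  std::int64_t iterations = 0;
  for (std::int64_t i = -static_cast<std::int64_t>(n); i <= n; ++i)
    ++iterations;
  return iterations > 0 ? iterations - 1 : 0;
}
```

For n >= 0 both functions give 2n, so the count depends on n but is still a closed-form expression, which is what "computable" means here. For n < 0 the body never runs, and the formula evaluates to 0.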

-Hal