scalar-evolution + indvars fail to get the loop trip count?

Hi,

Seems pass scalar-evolution+indvars fail to get the loop trip count of the following case:

int foo(int x, int y, int lam[256], int alp[256]) {
int i;
int z = y;

for (i = 255; i >= 0; i–) {
z += x;
lam[i] = alp[i];
}
return z;
}

The final optimized ll code is :

define i32 @foo(i32 %x, i32 %y, i32* %lam, i32* %alp) nounwind {
entry:
br label %bb

bb: ; preds = %bb, %entry
%indvar = phi i32 [ 0, %entry ], [ %indvar.next, %bb ] ; [#uses=4]
%i.0.reg2mem.0 = sub i32 255, %indvar ; [#uses=2]
%tmp7 = getelementptr i32* %alp, i32 %i.0.reg2mem.0 ; <i32*> [#uses=1]
%tmp8 = load i32* %tmp7, align 4 ; [#uses=1]
%tmp11 = getelementptr i32* %lam, i32 %i.0.reg2mem.0 ; <i32*> [#uses=1]
store i32 %tmp8, i32* %tmp11, align 4
%tmp13 = sub i32 254, %indvar ; [#uses=1]
%tmp16 = icmp slt i32 %tmp13, 0 ; [#uses=1]
%indvar.next = add i32 %indvar, 1 ; [#uses=1]
br i1 %tmp16, label %bb17, label %bb

bb17: ; preds = %bb
%indvar.lcssa = phi i32 [ %indvar, %bb ] ; [#uses=1]
%tmp28 = mul i32 %indvar.lcssa, %x ; [#uses=1]
%z.0.reg2mem.0 = add i32 %y, %x ; [#uses=1]
%tmp4 = add i32 %z.0.reg2mem.0, %tmp28 ; [#uses=1]
ret i32 %tmp4
}

Surely the loop trip count is 256, but the Loop::getTripCount() will return NULL.
This is due to the exit condition:

%tmp16 = icmp slt i32 %tmp13, 0 ; [#uses=1]

pass indvars should transform it into a canonical exit condtion EQ or NE, but as scalarevolution returned NonComputedItCount, it failed to do that.

In pass ScalarEvolution three algorithms failed to compute the trip count:

  1. In SCEVAddRecExpr::getNumIterationsInRange(), as the loop stride is negative, the exit value expression will be zero: (Here A is the stride value)

// The exit value should be (End+A)/A.
APInt ExitVal = (End + A).udiv(A);
ConstantInt *ExitValue = ConstantInt::get(ExitVal);

  1. In Line 1953, the switch-case computes the trip count due to different Cond code.
    But in this case, as it exits on true, the slt is converted to sge, not switch-case for this condtion.

  2. As the loop count is over 100, the Brute-Force algorithm also doent work

Can we improve this?

Thanks,
Sheng.

Zhou Sheng wrote:

Hi,

Seems pass scalar-evolution+indvars fail to get the loop trip count of the following case:

int foo(int x, int y, int lam[256], int alp[256]) {
  int i;
  int z = y;

  for (i = 255; i >= 0; i--) {
    z += x;
    lam[i] = alp[i];
  }
  return z;
}

The final optimized ll code is :

define i32 @foo(i32 %x, i32 %y, i32* %lam, i32* %alp) nounwind {
entry:
        br label %bb

bb: ; preds = %bb, %entry
        %indvar = phi i32 [ 0, %entry ], [ %indvar.next, %bb ] ; <i32> [#uses=4]
        %i.0.reg2mem.0 = sub i32 255, %indvar ; <i32> [#uses=2]
        %tmp7 = getelementptr i32* %alp, i32 %i.0.reg2mem.0 ; <i32*> [#uses=1]
        %tmp8 = load i32* %tmp7, align 4 ; <i32> [#uses=1]
        %tmp11 = getelementptr i32* %lam, i32 %i.0.reg2mem.0 ; <i32*> [#uses=1]
        store i32 %tmp8, i32* %tmp11, align 4
        %tmp13 = sub i32 254, %indvar ; <i32> [#uses=1]
        %tmp16 = icmp slt i32 %tmp13, 0 ; <i1> [#uses=1]
        %indvar.next = add i32 %indvar, 1 ; <i32> [#uses=1]
        br i1 %tmp16, label %bb17, label %bb

bb17: ; preds = %bb
        %indvar.lcssa = phi i32 [ %indvar, %bb ] ; <i32> [#uses=1]
        %tmp28 = mul i32 %indvar.lcssa, %x ; <i32> [#uses=1]
        %z.0.reg2mem.0 = add i32 %y, %x ; <i32> [#uses=1]
        %tmp4 = add i32 %z.0.reg2mem.0, %tmp28 ; <i32> [#uses=1]
        ret i32 %tmp4
}

Having the final .ll file doesn't help debug this. If you run opt -analyze -scalar-evolution on the .ll you pasted, it will correctly print out the loop trip count.

I've modified llvm-gcc to remove all the passes after indvars.

Surely the loop trip count is 256, but the Loop::getTripCount() will return NULL.
This is due to the exit condition:

%tmp16 = icmp slt i32 %tmp13, 0 ; <i1> [#uses=1]

pass indvars should transform it into a canonical exit condtion EQ or NE, but as scalarevolution returned NonComputedItCount, it failed to do that.

In pass ScalarEvolution three algorithms failed to compute the trip count:

1. In SCEVAddRecExpr::getNumIterationsInRange(), as the loop stride is negative, the exit value expression will be zero: (Here A is the stride value)

        // The exit value should be (End+A)/A.
      APInt ExitVal = (End + A).udiv(A);
      ConstantInt *ExitValue = ConstantInt::get(ExitVal);

Yes, that code path will fail for this example.

2. In Line 1953, the switch-case computes the trip count due to different Cond code.
    But in this case, as it exits on true, the slt is converted to sge, not switch-case for this condtion.

Not sure what you mean. The problem is that HowManyLessThans was refusing to work on a signed less-than or equal to, even in the trivial case (just because I haven't yet gotten around to working on the complex case of a non-unit stride).

3. As the loop count is over 100, the Brute-Force algorithm also doent work

Can we improve this?

Done. Committed in r60748.

Nick Lewycky