induction variables

Hello LLVM,

Can you suggest a good way to use the loops and induction variable
passes to map loop exiting BasicBlocks to InductionVariables. That is,
can we use these tools to identify the loop condition.

What i have tried
Function Pass:
  foreach BB
    if(terminal is loop exit of containing loop)
      trace back to instruction producing the cond that the
      branch branches on - condProducer
      foreach(inst in loop header)
        if(isa PHI )
    if(PHI uses condProducer)
            Run InductionVariables on this PHI Loop pair,
            save result

Problems I have encountered:
Often the induction variable is stored in memory,
--basicaa --ds-aa --steens-aa -aa-eval --licm --indvars doesn't seems to
help much. in fact it worsens the rates at which induction variables
are found because of the new code inserted.
Often the branch instruction uses a value where we must go backward
through several instructions to find the "true" induction variable
name.

I guess the basis of my problem is my mapping loop exit block to PHI
nodes is too fragile for the general case. Perhaps I am doing this the
wrong way. Should I simply examine every PHI node in the loop header
and then determine if the value is eventually used in an exit block's
branch? Any help on this, or further suggestions on the memory issue
would be appreciated.

Thanks,
Dave

Can you suggest a good way to use the loops and induction variable
passes to map loop exiting BasicBlocks to InductionVariables. That is,
can we use these tools to identify the loop condition.

I can try. :slight_smile: It looks like you're running into problems because we
don't perform "Linear Function Test Replacement". This optimization would
reduce the amount of code which occurs between the induction variable and
the final comparison for exiting the loop. For example, we currently
transform this:

i = 5;
do {
  i = i + 1;
} while (i < 10);

into this:

i = 0;
do {
  i = i + 1;
} while (i+5 < 10);

... instead of into: while (i < 5);

This means the correlating the branch conditions back to the induction
variables may be fairly difficult. Also, we don't compute "trip count"
which would be useful to you as well.

What i have tried
Function Pass:
  foreach BB
    if(terminal is loop exit of containing loop)
      trace back to instruction producing the cond that the
      branch branches on - condProducer
      foreach(inst in loop header)
        if(isa PHI )
    if(PHI uses condProducer)
            Run InductionVariables on this PHI Loop pair,
            save result

That is reasonable. If it were me, I would structure it like this:

Function Pass:
  postorder traverse the loop nesting tree
    for each loop
      If the first PHI node in the header block for the loop is
         recognizable as an induction variable...
      for each exit block
        find predecessor in loop
          look at the terminator, see if it uses the indvar.

This uses the fact that the induction variable cannonicalization pass
guarantees that the cannonical induction variable will be the first phi
node in the header block, if there is one. Otherwise it's not too
different. You can use any traversal of the loop nesting tree, but
my analyses and transformations want to do the "more nested" loops before
the outer loops.

Problems I have encountered:
Often the induction variable is stored in memory,
--basicaa --ds-aa --steens-aa -aa-eval --licm --indvars doesn't seems to
help much. in fact it worsens the rates at which induction variables
are found because of the new code inserted.
Often the branch instruction uses a value where we must go backward
through several instructions to find the "true" induction variable
name.

I am definately interested in this, and I consider this to be a problem.
How are you compiling the code? Are you using the output of llvmgcc -c as
a starting point? If so, something like this should work well:

opt -reassociate -licm -indvars

If you are still seeing induction variables stuck in memory, I would be
_VERY_ interested in seeing small testcases that demonstrate this (off
list though).

I guess the basis of my problem is my mapping loop exit block to PHI
nodes is too fragile for the general case. Perhaps I am doing this the
wrong way. Should I simply examine every PHI node in the loop header
and then determine if the value is eventually used in an exit block's
branch?

As I mentioned above, this is the way I would do it. You are guaranteed
that you only have to look at the first phi node in the loop header, and
doing it this way should be more efficient (you only have to look at loop
exit blocks, not every block in the function).

That said, if you are seeing complex expressions between the exit
condition and the induction variable, perhaps linear function test
replacement would help. If so, it would probably be easier to implement
that than to put a lot of smarts into your follow-on analysis, which would
have to recognize things anyways. I suspect that what you _really_ want
to know is the trip count of the loop, and this is the right way to
compute it IMO. Others may have other opinions of course. :slight_smile:

-Chris

Vikram pointed out to me off list that Misha wrote some code to compute
the trip count of simple loops, available as the
InductionVariable::getExecutionCount() method. This code currently only
handles really simple loops (whose conditions directly use a canonical
induction variable), but it could be a good starting point.

-Chris