Separate loop condition and loop body

Hi,

Is it possible to get the list of BasicBlocks building the condition of a
loop and the list of BasicBlocks that form the body? My first approach was
to iterate over the list of all Basicblocks of a loop and separate the
header as the condition and the remaining as the body of the loop. However,
this is not correct for instance in the case of a while loop in the form:
while( A & B) do
{ body }

My aim is to manipulate for loops, while and do-while loops unitary, but I
did not find an easy way to do this. I tried also to apply passes like
"Canonicalize natural loops", "Canonicalize Induction Variables" or "Natural
Loop Construction" such that the loops were represented in the same form.
Still I do not find a one-to-one link between the loop condition, body and
end, and the BasicBlocks returned by the getHeader, getLoopLatch etc.

My next strategy would be to modify the front-end, clang, and attach
metadata to indicate whether the BasicBlock is part of the condition or of
the body of the loop.

I would very much appreciate if you could suggest an easier way to
differentiate between the parts of the loop.

Thank you,
Alexandra

Is it possible to get the list of BasicBlocks building the condition of a
loop and the list of BasicBlocks that form the body?

Based on my (limited) experience with Loop and LoopInfo, this isn't possible. (But don't take my word for it.)

My aim is to manipulate for loops, while and do-while loops unitary, but I
did not find an easy way to do this.

I think the problem here is that LLVM's CFG is low-level enough such that the distinction between the loop header expression and the loop body is lost.

Why exactly do you need to identify the blocks that form the loop header expression? I'm wondering if you could find some workaround that doesn't require this...

Trevor

To me it looks like any basic block from the loop body with a
successor not in the loop body is a BB "building the condition" (an
"exit" block).
If you already have the loop body, it should be pretty easy to find
out about those nodes.

cheers,

Benoit

I assume you mean "any basic block from the loop header".

I don't think your rule will work. Consider this counterexample:

    while (j < 10 && j > 5 && j % 2 == 0) {
       j++;
    }

This will give you the attached CFG. bb1 is the loop header; bb2 and bb3 are the other loop conditionals; bb is the loop body.

Note that bb3 is part of the condition but is not a basic block from the loop header.

Trevor

cfg._Z6simplei.dot (1.96 KB)

>To me it looks like any basic block from the loop body with a
>successor not in the loop body is a BB "building the condition" (an
>"exit" block).

I assume you mean "any basic block from the loop header".

No really, loop body.

I don't think your rule will work. Consider this counterexample:

   while (j < 10 && j > 5 && j % 2 == 0) {
      j++;
   }

This will give you the attached CFG. bb1 is the loop header; bb2 and
bb3 are the other loop conditionals; bb is the loop body.

Note that bb3 is part of the condition but is not a basic block from
the loop header.

At least in the CFG/graph theory I know, the loop body is the set of
nodes which form the strongly connected component (it contains the
headers). In this case that would be {bb1, bb2, bb3, bb}, with bb1 as
header. (the header is un-ambiguous in this case since the CFG is
natural/reducible there is only one possible header for the loop).

From this set, bb1, bb2 and bb3 have exit edges, those are the

conditional blocks.

cheers,

Benoit

In that case, what does it mean for a block to be "from" the loop body?

Perhaps you meant "of" or "in" the loop body, but then I'm still confused. Consider break statements (CFG attached):

    while (j < 10 && j > 5 && j % 2 == 0) {
       j++;
       if (j == 9)
          break;
    }

In this example, block bb is in the loop body, has a successor not in the loop body, but is not building the condition. This appears to be a violation of your rule.

Trevor

cfg._Z6simplei.dot (2.13 KB)

To me it looks like any basic block from the loop body with a
successor not in the loop body is a BB "building the condition" (an
"exit" block).

Consider break statements (CFG attached):

    while (j < 10 && j > 5 && j % 2 == 0) {
       j++;
       if (j == 9)
          break;
    }

In this example, block bb is in the loop body, has a successor not in
the loop body, but is not building the condition. This appears to be a
violation of your rule.

Trevor

Hi,

I agree that not all exiting blocks are part of the condition.
Maybe your suggestion of trying a workaround would be easier. What I want is
to modify the structure of the loop by adding an extra condition before its
execution:

------ extra.cond <-------|
     > >
     > >
     v |
     >-loop.cond |
     > > >
     > v |
     > CALL(body) ___|
     >
     >---> loop.end

____>exit

This condition has to be checked in the beginning of each iteration, that
means, the loop body should branch to this extra condition instead of the
loop condition. So I want to eliminate the branch from loop.body to
loop.cond and to insert a branch from loop.body to extra.cond . Also, I
need to delimit the BBs building the body of the loop, in order to extract
them in a new function.

I hope my explanations are not too vague.
Thanks for your help.

Alexandra

>>> To me it looks like any basic block from the loop body with a
>>> successor not in the loop body is a BB "building the condition" (an
>>> "exit" block).
>>
Consider break statements (CFG attached):

    while (j < 10 && j > 5 && j % 2 == 0) {
       j++;
       if (j == 9)
          break;
    }

In this example, block bb is in the loop body, has a successor not in
the loop body, but is not building the condition. This appears to be a
violation of your rule.

In this case, the if statement could be considered part of the condition
(the exit of the loop depends on it, you can even rewrite the while
statement to include the condition). It really depends what you want to
achieve.

Hi,

I agree that not all exiting blocks are part of the condition.
Maybe your suggestion of trying a workaround would be easier. What I want is
to modify the structure of the loop by adding an extra condition before its
execution:

>------ extra.cond <-------|
> > >
> > >
> v |
> >-loop.cond |
> > > >
> > v |
> > CALL(body) ___|
> >
> >---> loop.end
>
>____>exit

This condition has to be checked in the beginning of each iteration, that
means, the loop body should branch to this extra condition instead of the
loop condition. So I want to eliminate the branch from loop.body to
loop.cond and to insert a branch from loop.body to extra.cond . Also, I
need to delimit the BBs building the body of the loop, in order to extract
them in a new function.

Can't you achieve the same thing by just adding new headers with your
extra condition.

transforming the following CFG:

      >
      v
     bb1<---+
    / | |
   / v |
   > bb2 |
   > / \ |
   > > bb3-+
   \ /
    v
loop.exit
   
into

    extra<--+
   / | |
  / v |
/ bb1 |

  / | |
/ v |
> bb2 |
> / \ |
> > bb3-+
\ /
  v
loop.exit

exit

Defining which nodes are part of the loop body is more difficult since,
at least for C language, computation and the condition can be mixed.

Cheers,

Benoit