BackedgeTakenCount calculation for fortran loops and DragonEgg gfortran-4.6

Hello, I'm finding problems with BackEdgeTaken count calculation in
even simple fortran loops with gfortran-4.6 + DragonEgg 3.0.

Even for simple double loops like this one:

       program test2
       integer i,j,k
       dimension k(100,100)
       do j=1,100
        do i=1,100
         k(i,j) = i
        enddo
       enddo
       write(*,*) k(1,30)
       end

make the ScalarEvolution engine return "CouldNotCompute" even for the
outer loop (the inner loop is fine).

You can find a screenshot of the translation of this loop here (with
-view-cfg Polly version):

The problem seems to be the fact that the ScalarEvolution can't
consider the outer loop exit branch instruction as the trivial case
(where one of the successors of the exit block is the loop header or
the exit block is the loop header itself) because of the (strange?)
loop shape (the exit block instead of jumping to the header of the
loop jumps instead to another block that increments the induction
variable and has an unconditional branch to the loop header) and so
starts backtracking the predecessors of the of the exit block and
stops when it reaches the inner loop that has a block without a unique
predecessor.

What do you think about this problem? This makes , for example,
difficult analyzing even simple fortran loops with Polly .
I believe the case portrayed in the picture is the same to the trivial
case (because the exit block jumps to a block with an unconditional
jump to the header of the loop), am I right?

I've written this little patch that adds this case to the trivial case :

diff --git a/lib/Analysis/ScalarEvolution.cpp b/lib/Analysis/ScalarEvolution.cpp
index daf7742..fcbaffe 100644
--- a/lib/Analysis/ScalarEvolution.cpp
+++ b/lib/Analysis/ScalarEvolution.cpp
@@ -4293,6 +4293,11 @@ ScalarEvolution::ComputeExitLimit(const Loop
*L, BasicBlock *ExitingBlock) {
   //
   // FIXME: we should be able to handle switch instructions (with a
single exit)
   BranchInst *ExitBr = dyn_cast<BranchInst>(ExitingBlock->getTerminator());
+ BranchInst* BrFirstSucc = dyn_cast<BranchInst>(ExitBr->
+ getSuccessor(0)->getTerminator());
+ BranchInst* BrSecondSucc = dyn_cast<BranchInst>(ExitBr->
+ getSuccessor(1)->getTerminator());

Mmm, sorry, the patch I posted crashes if ExitBr is null (which it may
be ...) , this one should be ok (and passess all the ScalarEvolution
tests in LLVM):

diff --git a/lib/Analysis/ScalarEvolution.cpp b/lib/Analysis/ScalarEvolution.cpp
index daf7742..b10fab2 100644
--- a/lib/Analysis/ScalarEvolution.cpp
+++ b/lib/Analysis/ScalarEvolution.cpp
@@ -4293,9 +4293,15 @@ ScalarEvolution::ComputeExitLimit(const Loop
*L, BasicBlock *ExitingBlock) {
   //
   // FIXME: we should be able to handle switch instructions (with a
single exit)
   BranchInst *ExitBr = dyn_cast<BranchInst>(ExitingBlock->getTerminator());

Attached

scevconcept.patch (2.34 KB)

Your patch should include a testcase, see test/Analysis/ScalarEvolution for examples. "BranchInst* " should be “BranchInst *”. You should have spaces after the // in your comments. One of the comment lines isn’t indented properly.

Nick

Well, it wasn't intended as a "real" patch to be included , but more
as a "proof of concept" for a solution. Do you think it is a valid
solution and I'm correct in my assumption? If so then I'll clean up
the patch and attach a testcase for inclusion.

Thanks!

Marcello

Well, it wasn’t intended as a “real” patch to be included , but more
as a “proof of concept” for a solution. Do you think it is a valid
solution and I’m correct in my assumption? If so then I’ll clean up
the patch and attach a testcase for inclusion.

I’m not sure – when I tried to track the IR in your screenshot through the code in SCEV I came up with a completely different reason it would return the CNC than the one you gave in your email. It would really help to have a testcase in .ll format.

Nick

This is the .ll for that graph (attached). I think I understand what
you are saying.
This particular testcase returns CNC not because the exit block
doesn't have a unique predecessor, but because the unique predecessor
(the inner loop block) has a successor that is inside the loop (in
this case itself, because it's the inner loop block).

That doesn't change, anyway, the assuption that this condition ( an
exit block that jumps to a block with an unconditional jump that jumps
to the loop header and that has as unique predecessor the exit block)
is equivalent to jumping directly to the loop header.

test2.preopt.ll (3.58 KB)

This is instead a very simple (handmade) test case that triggers the
problem (attached)
Also a more conforming patch has been attached

testcase.ll (673 Bytes)

scevconcept.patch (2.25 KB)

FInally I had the time to complete everything up. Now I included the
test case in the patch and the testcase runs with the LLVM tests
system.

Marcello,

Nick pointed this thread out to me. I've had a patch laying around for a while that should solve your problem.
See r150439. It appears to work on your test case. Can you verify that it works for you?

I'm not discounting the idea of fixing ComputeExitLimit. In fact, I'm reviewing that now. But at some point we need to acknowledge that it's the wrong approach and stop piling on CFG patterns.

-Andy