Identify recursion in a call graph

Hi,

Is there any facility in LLVM to identify recursion in a call graph? I realize this is undecidable in the general case due to function pointers, but at least the static cases could be identified. I don't even care about whole-program recursion, just looking at a single module would suffice. But I don't see anything like this already in LLVM, so do I simply write some code to look for cycles in the call graph? Thanks,

Trevor

Hi Trevor,

Is there any facility in LLVM to identify recursion in a call graph? I
realize this is undecidable in the general case due to function
pointers, but at least the static cases could be identified. I don't
even care about whole-program recursion, just looking at a single
module would suffice. But I don't see anything like this already in
LLVM, so do I simply write some code to look for cycles in the call
graph? Thanks,

use the facilities in SCCIterator.h, or declare your pass to be a
CallGraphSCCPass in which case it will work one strongly connected
component at a time.

Ciao,

Duncan.

Is there any facility in LLVM to identify recursion in a call graph?

...

use the facilities in SCCIterator.h, or declare your pass to be a
CallGraphSCCPass in which case it will work one strongly connected
component at a time.

Converting my ModulePass to a CallGraphSCCPass doesn't seem feasible, so I'll use llvm::scc_iterator. Here's what I have so far:

bool MyModulePass::isRecursive() {
  CallGraphNode* rootNode = getAnalysis<CallGraph>().getRoot();
  for (scc_iterator<CallGraphNode*> SCCI = scc_begin(rootNode), E = scc_end(rootNode); SCCI != E; ++SCCI) {
    const std::vector<CallGraphNode*> &nextSCC = *SCCI;
    if (nextSCC.size() == 1 && SCCI.hasLoop()) {
      return true;
    }
  }
  return false;
}

This correctly identifies direct (self) recursion but fails to identify indirect recursion, such as:

void foo() { bar(); }
void bar() { foo(); }

Any suggestions on how to identify both types? Thanks,

Trevor

P.S. Is it possible to start the call graph traversal from a particular function, rather than from getRoot()? Setting rootNode to "new CallGraphNode(myFunction)" causes isRecursive to return false, even with direct recursion.

Hi Trevor,

Converting my ModulePass to a CallGraphSCCPass doesn't seem feasible, so I'll
use llvm::scc_iterator. Here's what I have so far:

bool MyModulePass::isRecursive() {
CallGraphNode* rootNode = getAnalysis<CallGraph>().getRoot();
for (scc_iterator<CallGraphNode*> SCCI = scc_begin(rootNode), E =
scc_end(rootNode); SCCI != E; ++SCCI) {
const std::vector<CallGraphNode*> &nextSCC = *SCCI;
if (nextSCC.size() == 1 && SCCI.hasLoop()) {
return true;
}
return false;
}

This correctly identifies direct (self) recursion but fails to identify indirect
recursion, such as:

void foo() { bar(); }
void bar() { foo(); }

Any suggestions on how to identify both types? Thanks,

in this case nextSCC.size() will be 2, so your logic will not consider it.

Ciao,

Duncan.

P.S. Is it possible to start the call graph traversal from a particular
function, rather than from getRoot()? Setting rootNode to "new
CallGraphNode(myFunction)" causes isRecursive to return false, even with direct
recursion.

I don't know.

Hi you basically need to find a cycles in the call graph. Do do this just search google for a graph algorithm, then make it for your problem. See http://en.wikipedia.org/wiki/Cycle_detection.

Jeff Kunkel

Also, could you write this in a separate pass, and obtain the results from getAnalysis()? I think others would find it useful to discover if a Function may be called recursively.

-Jeff Kunkel

I've modified the code so that it correctly identifies both direct and indirect recursion. I'm now trying to package it up as a patch for the LLVM trunk so that others can use it.

Your suggestion to create a new pass for the code is interesting, but I'm not sure that this feature warrants an entirely new pass. Maybe it's more appropriate to integrate with the existing CallGraph pass by adding an isRecursive method to CallGraphNode. For example:

   AU.addRequired<CallGraph>();
   ...
   CallGraphNode* rootNode = getAnalysis<CallGraph>().getRoot();
   if (rootNode->isRecursive()) { ... }

But I have no idea how to write a test case for this. It appears that LLVM is not set up for unit tests of individual API calls. Any thoughts?

Trevor

Trevor Harmon wrote:

Also, could you write this in a separate pass, and obtain the
results from getAnalysis()? I think others would find it useful to
discover if a Function may be called recursively.

I've modified the code so that it correctly identifies both direct and
indirect recursion. I'm now trying to package it up as a patch for the
LLVM trunk so that others can use it.

Your suggestion to create a new pass for the code is interesting, but
I'm not sure that this feature warrants an entirely new pass. Maybe
it's more appropriate to integrate with the existing CallGraph pass by
adding an isRecursive method to CallGraphNode. For example:

    AU.addRequired<CallGraph>();
    ...
    CallGraphNode* rootNode = getAnalysis<CallGraph>().getRoot();
    if (rootNode->isRecursive()) { ... }

But I have no idea how to write a test case for this. It appears that
LLVM is not set up for unit tests of individual API calls. Any thoughts?

The unittests/ directory contains C++ unit tests for arbitrary C++ APIs that don't fit the dejagnu model of running opt or llc over .ll files. Read through a few of them as examples, or see upstream's documentation at:

   http://code.google.com/p/googletest/wiki/Documentation

As an aside, there's a method called LoadAssembly in unittests/ExecutionEngine/JITTest.cpp that should probably be refactored somewhere more general. It's great for embedding .ll text inside a unit test.

Nick

Thanks for the tip. Attached is a patch+testcase that adds CallGraphNode::isRecursive to LLVM. Could someone with commit access please review and apply? Thanks,

Trevor

isRecursive_LLVM-2.8.patch (1.69 KB)

isRecursive_LLVM-trunk.patch (1.69 KB)

CallGraphTest.cpp (8.32 KB)

I don't have commit access, but I can review:

+ const std::vector<CallGraphNode*> &nextSCC = *SCCI;
+ if (nextSCC.size() > 1 || SCCI.hasLoop()) {
+ return true;
+ }

The "nextSCC.size() > 1" is redundant because SCCI.hasLoop() already
checks that first thing. Because of this, the nextSCC declaration can
be removed entirely.

Hi Trevor,

What is the desired behavior for code like this?

extern void bar();
void foo() {
  ...
  bar();
  ...
}

In this situation, the call graph will report that foo is not recursive. However, it *is* possible that bar has an implementation that calls foo. The call graph originally modeled this possibility explicitly, but it turns that that this makes almost all functions end up in a single big SCC. Since one of the most important clients of the CallGraph are CallGraphSCC passes, I changed the call graph to stop modeling this behavior, instead using two different abstract nodes.

How do you think should be handled? At the very least, the comments on isRecursive should mention this case.

-Chris

Sorry for the delay in my reply. I was at a conference.

It would of course be desirable to identify this pattern, but I'm not sure what effort would be involved in doing so. If it is difficult, I think it would be acceptable to define recursion as "recursion that is explicitly present in the LLVM CallGraph", not necessarily recursion that is present in the code. A documentation note to that effect should suffice, though perhaps it could be worded better? What do you think?

Trevor