Indirect function call

The following is a snippet of code that finds some indirect calls in a module. I
learned the approach from TopDownClosure.cpp:

void FPS::repairCallGraph(Module &M) {
  CompleteBUDataStructures &DS = getAnalysis<CompleteBUDataStructures>();
  for (Module::iterator f = M.begin(); f != M.end(); ++f ) {
    if( f->isExternal() ) continue;
    for (Function::iterator I = f->begin(); I != f->end(); ++I) {
      for(BasicBlock::iterator J = I->begin(); J != I->end(); ++J) {
        if(CallInst *cs = dyn_cast<CallInst>(J)) {
          Function *callee = cs->getCalledFunction();
          if(callee) continue; // direct call, not through a function pointer
          // Walk the callees that DSA recorded for this call site.
          for(CompleteBUDataStructures::callee_iterator K = DS.callee_begin(J);
              K != DS.callee_end(J); ++K) {
            if(K->first != J) continue;

            // Found an indirect call: add an edge from the caller to the callee.
            CallGraphNode *cgn = getAnalysis<CallGraph>()[f];
            CallGraphNode *calleecgn = getAnalysis<CallGraph>()[K->second];
            cgn->addCalledFunction(calleecgn);
            std::cerr<<"\n indirect call in "<<f->getName()<<*J<<", callee: "
                     <<K->second->getName();
          }
        }
      }
    }
  }
}

But my code does not always work: if the arguments are not pointers,
CompleteBUDataStructures does not record the call. So, if you want to find all
indirect calls, you may have to repair CompleteBUDataStructures. :)

If you do not use BUDataStructures, you can do it yourself: find all load/store
instructions whose destination is of function type.

> But my code does not always work: if the arguments are not pointers,
> CompleteBUDataStructures does not record the call. So, if you want to find all
> indirect calls, you may have to repair CompleteBUDataStructures. :)

Not surprising: CBU is trying to do something entirely different from
what you are doing.

> If you do not use BUDataStructures, you can do it yourself: find all load/store
> instructions whose destination is of function type.

You may want to look at how the call graph builder works. It finds all
indirect call sites, and also finds all functions whose address escapes
(that is, functions that may be called indirectly).

Finding indirect calls is actually easy: just check whether Op(0) of the
call (or invoke) instruction is not a Function, i.e. !isa<Function>.

Andrew
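
The check described above can be sketched in isolation. What follows is a minimal, self-contained C++ model rather than LLVM code: `Value`, `CallSite`, `isIndirectCall` and `findIndirectCalls` are hypothetical stand-ins invented for illustration, with the `callee` field playing the role of the call's Op(0).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a value hierarchy (hypothetical, not the LLVM classes).
struct Value {
  enum Kind { FunctionKind, OtherKind };
  Kind kind;
  std::string name;
};

// A call site; `callee` models operand 0 of a call or invoke instruction.
struct CallSite {
  const Value *callee;
};

// Mirrors the spirit of `!isa<Function>(callee)`: the call is indirect
// when the called value is anything other than a function.
inline bool isIndirectCall(const CallSite &cs) {
  assert(cs.callee && "a call site must have a callee operand");
  return cs.callee->kind != Value::FunctionKind;
}

// Collect pointers to the indirect call sites in `sites`.
inline std::vector<const CallSite *>
findIndirectCalls(const std::vector<CallSite> &sites) {
  std::vector<const CallSite *> result;
  for (const CallSite &cs : sites)
    if (isIndirectCall(cs))
      result.push_back(&cs);
  return result;
}
```

In a real pass the same test would be applied to each call and invoke instruction while walking the module, as in the loop nest of the original snippet.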

The BasicCallGraph class only flags the indirect calls (it makes the caller point to the external node),
but does not resolve them using an alias analysis such as DSA.
I think DSA solves this problem for the call sites of interest by finding the corresponding globals (i.e. the functions)
for the call site's DSNode. Maybe 夏一民 just wanted to point out that DSA does not take all call sites into account.
But just as suggested in callgraph.h, "As an extension in the future, there may be multiple nodes with a null
function. These will be used when we can prove (through pointer analysis) that an indirect call site can
call only a specific set of functions."

Maybe Chris can give us more helpful comments.
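
The conservative scheme discussed above (indirect call sites point to an external node, and functions whose address escapes are reachable from it) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the BasicCallGraph implementation: `FuncInfo`, `kExternal` and `buildConservativeCallGraph` are hypothetical names, and functions are identified by plain strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified summary of one function.
struct FuncInfo {
  std::vector<std::string> directCallees; // targets of direct calls
  bool hasIndirectCall = false;           // contains a call through a pointer
  bool addressTaken = false;              // address escapes, may be called indirectly
};

// Name chosen here for the node that stands for "unknown code".
const std::string kExternal = "<external>";

using CallGraph = std::map<std::string, std::set<std::string>>;

CallGraph buildConservativeCallGraph(const std::map<std::string, FuncInfo> &module) {
  assert(module.count(kExternal) == 0 && "the external node name is reserved");
  CallGraph cg;
  cg[kExternal]; // make sure the external node exists even if it has no edges
  for (const auto &entry : module) {
    const std::string &name = entry.first;
    const FuncInfo &info = entry.second;
    std::set<std::string> &edges = cg[name];
    edges.insert(info.directCallees.begin(), info.directCallees.end());
    // An unresolved indirect call may reach anything: point at the external node.
    if (info.hasIndirectCall)
      edges.insert(kExternal);
    // A function whose address escapes may be called from unknown code.
    if (info.addressTaken)
      cg[kExternal].insert(name);
  }
  return cg;
}
```

A pointer analysis would refine this picture by replacing the single edge to the external node with edges to the specific functions a call site can reach.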

Andrew (and Dinakar, and perhaps others) are the current maintainers of DSA.

-Chris

Oh, I should apologize to both you and Andrew.
I had thought you were the maintainer...

So I am currently thinking that it may not be hard to make the call graph more accurate,
in a way similar to how DSA deals with indirect call sites.
I am just curious why this is not already in LLVM.
I think identifying the set of possible targets of an indirect call is a worthwhile job,
for example if we want to check the correctness of a program's control flow
when it may be under attack through a buffer overflow.

Sure. More precise call graph analysis can benefit many clients. This is why the CallGraph interface is an abstract one that can be implemented with many different algorithms. If you'd like to work on a new implementation, that would be great.

-Chris
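
As a rough illustration of one abstract interface backed by different algorithms, here is a small sketch. It is not the LLVM CallGraph API: `IndirectCallResolver`, `ConservativeResolver` and `PointsToResolver` are hypothetical names, call sites and functions are plain strings, and the points-to sets are assumed to come from some pointer analysis that is not shown.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

using FunctionSet = std::set<std::string>;

// Hypothetical interface: which functions may a given indirect call site invoke?
class IndirectCallResolver {
public:
  virtual ~IndirectCallResolver() = default;
  virtual FunctionSet targets(const std::string &callSite) const = 0;
};

// Conservative answer: any function whose address is taken.
class ConservativeResolver : public IndirectCallResolver {
public:
  explicit ConservativeResolver(FunctionSet addressTaken)
      : addressTaken_(std::move(addressTaken)) {}
  FunctionSet targets(const std::string &) const override { return addressTaken_; }

private:
  FunctionSet addressTaken_;
};

// Refined answer: per-call-site sets supplied by a pointer analysis,
// falling back to the conservative set for sites the analysis did not cover.
class PointsToResolver : public IndirectCallResolver {
public:
  PointsToResolver(std::map<std::string, FunctionSet> pointsTo, FunctionSet fallback)
      : pointsTo_(std::move(pointsTo)), fallback_(std::move(fallback)) {}
  FunctionSet targets(const std::string &callSite) const override {
    auto it = pointsTo_.find(callSite);
    return it != pointsTo_.end() ? it->second : fallback_;
  }

private:
  std::map<std::string, FunctionSet> pointsTo_;
  FunctionSet fallback_;
};
```

Clients written against the base class keep working when a more precise resolver is substituted, which is the benefit of keeping the interface abstract.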

I think I will give it a try. :)

First, if you want call site information, you need TD not BU. Second,
TD still isn't perfect. I have a series of patches that improve DSA's
indirect call handling, but they are ugly and not yet ready for
incorporation into mainline. If you really want to try them I can send
you patches, or you can check out my monotone tree.

I also have some clients of the TD call graph that make use of it (well,
they make use of the Global list in the function pointer's DSNode), such as a
devirtualizer. Again, I can send you a copy or you can check out my
tree.

Andrew
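
To make the devirtualizer idea concrete, here is a toy sketch of the decision such a client might make. It does not reflect Andrew's patches or the DSA data structures: `CallSiteInfo` and `devirtualize` are hypothetical, `possibleCallees` merely models the list of function globals in the function pointer's DSNode, and the `calleeSetComplete` flag is an assumption standing in for whatever guarantee the analysis gives that no target is missing.

```cpp
#include <set>
#include <string>

// Hypothetical summary of one call site after pointer analysis.
struct CallSiteInfo {
  bool isIndirect = false;
  std::string directCallee;               // meaningful only when !isIndirect
  std::set<std::string> possibleCallees;  // functions the pointer may refer to
  bool calleeSetComplete = false;         // assumed: analysis saw every target
};

// Turn an indirect call into a direct one when exactly one target is
// possible and the target set is known to be complete.
// Returns true if the call site was rewritten.
bool devirtualize(CallSiteInfo &cs) {
  if (!cs.isIndirect)
    return false; // already a direct call
  if (!cs.calleeSetComplete || cs.possibleCallees.size() != 1)
    return false; // unknown or multiple targets: leave the call alone
  cs.directCallee = *cs.possibleCallees.begin();
  cs.isIndirect = false;
  return true;
}
```

Call sites with several possible targets are left untouched in this sketch.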

That would be great! Unfortunately, I could not find your personal CVS repository on your homepage.
Could you tell me how to access it, and briefly point out the code of interest?
Of course, it is also fine to send the patches to this email address.
Thank you very much!