DSA / poolalloc: incorrect callgraph for indirect call

Hello,

I am trying to apply DSA (from the poolalloc project; I'm on LLVM 3.2)
to the following C program, and I found that the generated callgraph
over-approximates the callees of a simple indirect call.

#include <stdio.h>
__attribute__((noinline)) static int f1(int arg1, int arg2) {
    return arg1 + arg2;
}
__attribute__((noinline)) static int run_func(int (*fptr)(int, int), int arg1, int arg2) {
    return (*fptr)(arg1, arg2);
}
__attribute__((noinline)) static int foo() {
    return run_func(&f1, 1, 2);
}
int main(int argc, char *argv[]) {
    printf("Main: %p\n", &main);
    printf("Sum: %d\n", foo());
}

Using TDDataStructures, I would expect the callgraph of the above
program to show that run_func can only call f1. However, DSA seems to
fall back to an address-taken approach and reports that run_func can
also call main.
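
To make the gap concrete, here is a toy model in C. It is not DSA code and says nothing about how DSA computes its result: the table of address uses is written by hand from the program above, and the names addr_use, address_taken_targets and flow_targets are made up for illustration. It only contrasts the two possible answers for the indirect call in run_func.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TARGETS 8

/* One place where a function's address is taken. `passed_to` names the
   function that receives the address as a function-pointer argument, or is
   NULL if the address never reaches an indirect call (e.g. it is only printed). */
struct addr_use {
    const char *function;
    const char *passed_to;
};

/* Hand-written facts for the example program (an assumption of this sketch,
   not output of any tool). */
static const struct addr_use uses[] = {
    { "f1",   "run_func" },  /* foo:  run_func(&f1, 1, 2)        */
    { "main", NULL },        /* main: printf("Main: %p\n", &main) */
};
static const size_t n_uses = sizeof uses / sizeof uses[0];

/* Address-taken approximation: any function whose address is taken anywhere
   is considered a possible target of every indirect call. */
static size_t address_taken_targets(const char *out[MAX_TARGETS]) {
    size_t n = 0;
    for (size_t i = 0; i < n_uses && n < MAX_TARGETS; i++)
        out[n++] = uses[i].function;
    return n;
}

/* Flow-based answer: only addresses that actually reach the function-pointer
   parameter of `caller` are possible targets of its indirect call. */
static size_t flow_targets(const char *caller, const char *out[MAX_TARGETS]) {
    size_t n = 0;
    for (size_t i = 0; i < n_uses && n < MAX_TARGETS; i++)
        if (uses[i].passed_to != NULL && strcmp(uses[i].passed_to, caller) == 0)
            out[n++] = uses[i].function;
    return n;
}
```

With this table, address_taken_targets returns f1 and main, which matches what the pass reports, while flow_targets("run_func", ...) returns only f1, which is the result expected from TDDataStructures.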

I attached the bitcode of the above C program, as well as the LLVM pass
that generates the callgraph. I'd be grateful for any clues you can provide.

Thanks,
Victor

test.ll (2.2 KB)

HelloPass.cpp (3.9 KB)

Looking at your code, you're using EQTDDataStructures (EQTD). Try using TDDataStructures (TD) instead and see if you get a more accurate result. You only need EQTD if you need every target of an indirect call to have the same DSGraph, and you don't need that if all you need is a call graph.

Also, I recently discovered that someone had updated the DSA code to build with LLVM mainline. I took a snapshot of that and put it up at https://github.com/jtcriswell/llvm-dsa.

Regards,

John Criswell

It seems I accidentally attached the wrong source file: the results with
TDDataStructures were actually the same. I then tried the other DSA data
structure passes, all to no avail. Does this mean that the current DSA
implementation cannot handle the above case? Do you know why?

With kind regards,

Victor van der Veen