Walking thru CallGraph bottom up

Hi all,

I would like to create a Pass that given an IR instruction walks starting from that instruction up to the main function
to identify all the functions call that have been made to call that instruction.

Is it possible? What kind of Pass should I create?

Thanks
Best,
Simone

Simone Atzeni
simone.at@gmail.com
+1 (801) 696-8373

Hi all,

I would like to create a Pass that given an IR instruction walks starting from that instruction up to the main function
to identify all the functions call that have been made to call that instruction.

Is it possible? What kind of Pass should I create?

Yes, it is possible. I think a ModulePass would be most appropriate, though a FunctionPass may be alright.

To get the call graph, you can use LLVM's CallGraph analysis. If you need to handle function pointers more accurately than LLVM's internal CallGraph analysis does, you can use DSA's CallGraph analysis (which has the same interface but may only work with LLVM 3.2 and earlier LLVM releases).

-- John T.

Thanks John.

I guess I will use a ModulePass, so when I am implementing the “runOnModule” function,
do I have to loop through all the functions, for each functions all the BasicBlocks and for each BasicBlock all the instructions
or given the Module I have to call the CallGraph directly?
Is there an example out there? I can’t find anything.

Thanks.
Simone

Thanks John.

I guess I will use a ModulePass, so when I am implementing the “runOnModule” function,
do I have to loop through all the functions, for each functions all the BasicBlocks and for each BasicBlock all the instructions

If you know the Instruction, you can get it's basic block using Instruction::getParent(), and then get its enclosing function using BasicBlock::getParent().

Once you know the enclosing function, you can use the CallGraph pass to find which functions call it, and then repeat the procedures for functions calling that function, etc.

or given the Module I have to call the CallGraph directly?
Is there an example out there? I can’t find anything.

It uses the DSA CallGraph pass, but http://llvm.org/viewvc/llvm-project/safecode/branches/release_32/lib/InsertPoolChecks/CFIChecks.cpp?revision=189030&view=markup might provide a decent example.

Regards,

John Criswell

I think I got it and the example is pretty, however for what I understand, I can get the CallGraphNode for a given function F that has a list that represents all the functions that F is calling, but how can I get the list of the functions that are calling F? There is not a sort a similar list?

Thanks.
Simone

Hi Herbie,

thanks for you answer and explanation.

Also, if any of the functions are external, you are completely stuck (unless you put everything together with lld).

I am indeed having a problem regarding external function.
I my program is just one file everything work and I can access all the functions.
However, if it has multiple files I have a lot of unresolved pointers to external functions.
I am creating separately all the byte code files for each file of my program.
Then I link all of them together with llvm-link, I thought it was enough but actually I can not access the functions that are in other files even if I am link all of them together.

Do you have any idea/suggestion how to solve this problem?

Thanks!
Best,
Simone

Hi Herbie,

thanks for you answer and explanation.

Also, if any of the functions are external, you are completely stuck (unless you put everything together with lld).

I am indeed having a problem regarding external function.
I my program is just one file everything work and I can access all the functions.
However, if it has multiple files I have a lot of unresolved pointers to external functions.
I am creating separately all the byte code files for each file of my program.
Then I link all of them together with llvm-link, I thought it was enough but actually I can not access the functions that are in other files even if I am link all of them together.

Do you have any idea/suggestion how to solve this problem?

Obviously, if you want to analyze the call graph of an entire program, you need to collect the information from across compilation units. In LLVM, the best way to do that is to link the bitcode files together into a single bitcode file. This can either by done using llvm-link (which requires changing the Makefiles of most programs) or by adding your passes into libLTO (which is more work but allows you to do whole-program analysis with little/no modification to program Makefiles).

Of course, if you have external library code (e.g., libc) that is not in LLVM bitcode format, you can't analyze it using an LLVM pass. One option is to know what these functions do (i.e., treat them as a special case). Another option would be to try to convert them into LLVM bitcode format using Revgen or BAP, but I'm not sure how well that works.

Most analyses understand the semantics of the C library function so that they can treat them specially and then assume that any function that has its address taken or has externally visible linkage can be called by external library code. The "external node" in the CallGraph analysis should state that it can call any function with externally visible linkage.

Regards,

John Criswell

Obviously, if you want to analyze the call graph of an entire program, you need to collect the information from across compilation units. In LLVM, the best way to do that is to link the bitcode files together into a single bitcode file. This can either by done using llvm-link (which requires changing the Makefiles of most programs) or by adding your passes into libLTO (which is more work but allows you to do whole-program analysis with little/no modification to program Makefiles).

Hi John,

this is actually what I am doing.
I have a simple program that has a main that call 2 functions.
Those 2 functions are in two different .c files.
So I create the BC file with clang for the 3 files.
I link them together with llvm-link and then I disassemble it.
However when I run my pass it still can’t see the external calls.

Why is that?

Thanks.
Simone