Finding the entry point function in LLVM IR

Hi,

Given the following LLVM IR:

define i32 @foo(i32 %l) #0 {
entry:
%add = add nsw i32 %l, 3
ret i32 %add
}

; Function Attrs: nounwind ssp uwtable
define i32 @boo(i32 %k) #0 {
entry:
%call = call i32 @foo(i32 %k)
ret i32 %call
}

; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
entry:
%add = add nsw i32 1, 2
%add2 = add nsw i32 2, %add
ret i32 %add2
}

I want to be able to find out that main is the entry point function of the program.
Neither main nor boo has any predecessors or successors, so I cannot build a CFG to figure out who is calling whom.

Is there a way I can achieve this?

Thanks a ton for the help!!

Thanks

The fact that main is the entry point is not known to LLVM (except in a couple of places that special-case main, such as the internalise pass), because it is an artefact of C/C++, not a generic property. On most *NIX platforms, the real entry point for a program is something like __start or _start, which then calls main. Most compilation units have no single entry point, because they do not contain the program entry point and so can be entered through any externally visible function.

It might help if you explained why you need this.

David

If you want to know which functions are (or may be) called from where, in the entire program, then you will need to do some sort of “LLVM-IR Linking” (there are tools that will do that for you, such as “llvm-link”).

Of course, even then, there are cases where it's impossible to know whether a function is ACTUALLY called until runtime - function pointers, including those in vtables, may or may not actually get called, depending on the exact dynamic behaviour of the code.

Then there will be functions implemented outside of the LLVM-IR for the program anyway. atexit is a good example of a function that takes a function pointer. Your code will not know what (if anything) atexit does with that function pointer, or if/when that function gets called. Of course, we, as humans, know how atexit works and when the function gets called, but some code will not know that unless you write code to understand its behaviour - and there are many types of functions that take a pointer to a function, some of which are much more complex than atexit - for example signal handlers or callbacks that get called on errors. Imagine a function being called only when a memory allocation fails…
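To make the point about function pointers concrete, here is a rough sketch in Python (a toy model, not the LLVM API). It assumes the call graph has already been extracted into two hypothetical tables: direct calls, and functions whose address is taken. The name cleanup is invented for illustration. A conservative analysis has to treat every taken address as a possible call:

```python
from collections import deque

def possibly_reachable(direct_calls, address_taken, roots):
    """Return every function that may run when starting from `roots`.

    direct_calls:  caller -> set of callees named in direct calls
    address_taken: function -> set of functions whose address it takes
                   (for example to pass to atexit or store in a table)
    Both tables are hypothetical inputs for this toy model.
    """
    seen = set(roots)
    work = deque(roots)
    while work:
        fn = work.popleft()
        # A direct call is a definite edge. A taken address is only a
        # possible edge, but a conservative analysis must follow it too.
        targets = direct_calls.get(fn, set()) | address_taken.get(fn, set())
        for target in targets:
            if target not in seen:
                seen.add(target)
                work.append(target)
    return seen

# main registers a (hypothetical) cleanup handler with atexit.
direct_calls = {"main": {"atexit"}, "boo": {"foo"}}
address_taken = {"main": {"cleanup"}}
reachable = possibly_reachable(direct_calls, address_taken, ["main"])
```

Here cleanup ends up in the reachable set even though nothing in the module calls it directly, which is exactly the situation where counting direct calls gives the wrong answer.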

As David says, exactly what approach you should take (or whether any approach is meaningful at all) depends heavily on what you are trying to achieve.

Thank you, David and Mats, for the replies.

The reason I need to know that main is the entry point is as follows:

I have a dead code elimination pass that removes the call to boo. boo was initially called from main, but since the return value of main does not depend on boo, the call to boo is removed.

Now I want to remove the definitions of functions that are never called. Notice that neither boo nor main has any calls to it.
So, if I traverse all the functions in the module and use F->use_empty() to check for calls, both boo and main return true and would be deleted. I want to keep the definition of main from being deleted.

As you correctly mentioned above, there could be more functions like main that can be called externally, and I do not want them to be deleted. I am looking for a solution that tells me that a particular function is an entry point to the program and will have no calls to it.

Thanks

This is not sound. boo has external linkage, so it is visible outside this module, and there is no guarantee that it will not be called from another compilation unit. If boo had internal or private linkage, then you would be safe to delete it as soon as its use count dropped to zero (and LLVM's global dead code elimination pass, GlobalDCE, will do exactly that).
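As a sketch of that rule (plain Python, a toy model rather than the LLVM API), the check needs both the linkage and the use count. The Function record and the helper entry below are hypothetical and exist only for illustration:

```python
from dataclasses import dataclass

@dataclass
class Function:
    name: str
    linkage: str   # "external", "internal" or "private" in this toy model
    num_uses: int  # remaining references to the function inside the module

def safe_to_delete(fn):
    # Only local linkage guarantees that no other compilation unit can
    # call the function, so an empty use list alone is not enough.
    return fn.linkage in ("internal", "private") and fn.num_uses == 0

module = [
    Function("foo", "external", 1),     # still called by boo
    Function("boo", "external", 0),     # no uses, but externally visible
    Function("main", "external", 0),    # no uses, but externally visible
    Function("helper", "internal", 0),  # hypothetical local function
]
deletable = [f.name for f in module if safe_to_delete(f)]
```

With this check, boo and main both survive despite having no uses, and only the local, unused helper is reported as deletable.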

If you run the Internalize pass first, it will mark every function other than main as internal, and global DCE can then delete the ones that are no longer reachable from main.
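The Internalize-then-DCE sequence can be modelled roughly as follows. This is a simplified Python sketch, not the real passes: it assumes internalization gives internal linkage to everything outside a preserved list (here just main), and that global DCE keeps whatever is reachable from the functions that are still externally visible. The real passes handle many more cases than this.

```python
def internalize(linkage, preserve=("main",)):
    # Everything outside the preserved list loses external visibility.
    return {name: ("external" if name in preserve else "internal")
            for name in linkage}

def global_dce(linkage, calls):
    # Externally visible functions are roots; keep what they can reach.
    roots = [name for name, kind in linkage.items() if kind != "internal"]
    live = set(roots)
    work = list(roots)
    while work:
        fn = work.pop()
        for callee in calls.get(fn, ()):
            if callee in linkage and callee not in live:
                live.add(callee)
                work.append(callee)
    return live

# The module from the question, after the call to boo was removed from main.
linkage = {"foo": "external", "boo": "external", "main": "external"}
calls = {"boo": {"foo"}, "main": set()}

live_before = global_dce(linkage, calls)
live_after = global_dce(internalize(linkage), calls)
```

Before internalization nothing can be removed, because all three functions are possible entry points. Afterwards only main is a root, so foo and boo drop out of the live set.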

David

Thanks, David!
I appreciate the guidance. :)

I'll go through the Internalize pass; hopefully that will also give me an idea of how to find the entry point.

Thanks