Identifying classes and their member functions

Hi,

I am new to LLVM and am trying to identify all the member functions of a class.

Currently, I am converting the source code into IR using an LLVM front end. I have written a module pass that takes the generated IR and iterates over the types (classes/structs) defined in the source. For each such type, I iterate over its subtypes (the types of its members). I was expecting to see a member of type FunctionTyID indicating a member function, but this is not the case. Hence, I am not able to iterate over the member functions.

Looking at the IR (dumped in human-readable form), I see that the class defined in the source code has been stripped of its member functions, and only a structure containing the member variables is created. The member functions are emitted separately under their mangled names. Hence, I doubt whether it is possible to iterate over only the member functions of a particular class.

Kindly let me know if my approach is correct when it comes to identifying only the member functions of a class. If not, please suggest the best method of doing so. Please let me know if you need any further information/clarification.

Regards,
Sandeep

Indeed, as you’ve seen (& so far as I know), LLVM IR is too low level for your task - there are no member functions at this level, they’ve all been transformed into free functions and structs.
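To make the lowering concrete, here is a rough C++ sketch of what happens conceptually. The names `CounterData` and `Counter_add` are made up for illustration; a real front end emits a mangled symbol (for the Itanium ABI, `_ZN7Counter3addEi` for `Counter::add(int)`) and passes the object as an explicit first argument.

```cpp
#include <cassert>

// What the programmer writes: a class with a member function.
class Counter {
public:
    int add(int n) { total += n; return total; }
private:
    int total = 0;
};

// Roughly what survives lowering: a plain struct holding only the data
// members, plus a free function that receives the object as an explicit
// first argument (the former `this`). Both names are hypothetical.
struct CounterData {
    int total = 0;
};

int Counter_add(CounterData *self, int n) {
    self->total += n;
    return self->total;
}

// Both forms compute the same running total.
void demo() {
    Counter c;
    CounterData d;
    assert(c.add(2) == Counter_add(&d, 2));
    assert(c.add(3) == Counter_add(&d, 3));
}
```

Nothing in the struct refers back to the function, which is why walking the members of the struct type never turns up a function type.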

What kind of optimization were you hoping to implement based on this information?

If you aren’t trying to implement an optimization, but are instead trying to process source code in some way (indexing, source-to-source transformations, etc.), then you might want to use clang instead.

We have built a tool that takes a CDFG (in a particular format) and performs design automation based on it. The effort is to extend this tool to accept input in the form of SystemC (an extension of C++) code. The IR is used to generate the CDFG of the source code.

But the requirement is to generate the CDFG of only a specific function, and hence I am looking to iterate over the member functions only. Currently, I achieve this by filtering functions by name, but this is evidently inefficient.

Will I be able to achieve this (iterating over member functions alone and generating a CDFG for them) using clang?

Regards,
Sandeep

If you just need a specific function, why are you searching through only member functions? You could keep this general by just accepting a mangled name (or an unmangled one, if you include argument types & all the rest of the stuff that goes into mangling - I expect there’s a demangling API somewhere in clang) & analyzing that uniquely identified function.
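As a minimal sketch of the name-based approach, assuming a toolchain that uses the Itanium C++ ABI and provides `abi::__cxa_demangle` in `<cxxabi.h>` (GCC and Clang on most Unix-like systems do), something like the following could demangle a symbol and check which class it belongs to. The helpers `demangle` and `looksLikeMemberOf` are hypothetical names for this example, not part of LLVM or clang.

```cpp
#include <cxxabi.h>
#include <cassert>
#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI symbol name. Returns an empty string if the
// input is not a valid mangled name.
std::string demangle(const std::string &mangled) {
    int status = 0;
    char *raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    std::string result = (status == 0 && raw != nullptr) ? raw : "";
    std::free(raw);  // __cxa_demangle allocates with malloc; free(nullptr) is safe
    return result;
}

// Hypothetical helper: true if the demangled name starts with "ClassName::".
// This is a plain prefix test, so it does not handle enclosing namespaces,
// or template functions whose demangled form begins with a return type.
bool looksLikeMemberOf(const std::string &mangled, const std::string &className) {
    const std::string prefix = className + "::";
    return demangle(mangled).compare(0, prefix.size(), prefix) == 0;
}
```

For example, `demangle("_ZN7Counter3addEi")` gives `Counter::add(int)`, so `looksLikeMemberOf("_ZN7Counter3addEi", "Counter")` is true, while the free function `_Z3fooi` (that is, `foo(int)`) is rejected. In a pass, the mangled string would come from each function's name in the module.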

You could even separate this function out from the LLVM bitcode so you can run the optimizers on the function alone, potentially.

But no, I don’t think Clang will give you a CDFG - either it’ll compile C, C++, ObjC* to LLVM bitcode, or it has an API over the AST to do source analysis, but it’s not really designed for much in between, so far as I know.