Enumerating machine functions

Hi,

I have a question about running passes on machine code. We are
implementing a transformation on machine code that consists of
analyzing a series of functions, extracting some aggregate properties,
and then using the extracted information to optimize each function.

I am not familiar with LLVM internals, so I am not sure how to
implement each step. From the documentation, it seems that the only
way to work on the machine-dependent representation is to write a
MachineFunctionPass. However, such a pass would be run separately on
every function, while what we need is pretty much a ModulePass running
on machine code, or some way to enumerate all the MachineFunction
objects. Is there any way to achieve this?

Thanks,
Lorenzo

Hi Lorenzo,

one of the design goals of LLVM codegen is to be able to codegen one function
at a time, so the front-end can generate a function, codegen it, then throw
away the function before moving on to the next function.

Ciao, Duncan.

Hi Duncan,

thanks for your reply. So, if I understand correctly, MachineFunction
objects are converted to machine code one at a time, and each object
is "thrown away" after having been converted. It seems that the only way
to achieve what I have in mind is to write the machine code for all
the functions to a file, and then load it and process it in a separate
tool (possibly based on llvm-mc). Is this correct?

Thanks,
Lorenzo

Hi Lorenzo,

> thanks for your reply. So if I understand correctly MachineFunction
> objects are converted to machine code one at a time, and each object
> is "thrown away" after having been converted.

actually that's not quite what I meant. I was saying that the code
generators must not assume that the whole module is available. For
example, it would be wrong when codegening a function to examine the
bodies of any functions it calls in the same module, or to examine
the initial values of any global variables it uses, etc. The reason
that this would be wrong is that it won't work if a front-end creates
a module which only contains declarations, then successively inserts
a body for each function into the IR, codegens that function, then
deletes the body for that function before moving on to the next one.
llvm-gcc has code for doing this, for example, though it is currently
turned off.
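
To make the constraint concrete, here is a small self-contained C++ sketch
of the workflow described above. It deliberately does not use the LLVM API:
StreamedFunction, StreamedModule, streamCodegen and the "codegen" step are
all hypothetical stand-ins. The sketch only records how many function bodies
are visible in the module while each function is being processed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an IR function (hypothetical, not an LLVM type).
// A function without a body is only a declaration.
struct StreamedFunction {
  std::string name;
  bool hasBody;
  std::string body;
};

// Toy stand-in for a module: just a list of functions.
struct StreamedModule {
  std::vector<StreamedFunction> functions;
};

// Counts how many functions currently have a body in the module.
std::size_t countBodies(const StreamedModule &m) {
  std::size_t n = 0;
  for (const StreamedFunction &f : m.functions)
    if (f.hasBody) ++n;
  return n;
}

// What the pretend code generator could see when it ran on a function.
struct CodegenRecord {
  std::string name;
  std::size_t bodiesVisible;
};

// Streams bodies through a declarations-only module, one at a time:
// insert a body, "codegen" the function, then delete the body again.
std::vector<CodegenRecord>
streamCodegen(StreamedModule &m,
              const std::map<std::string, std::string> &sources) {
  std::vector<CodegenRecord> log;
  for (StreamedFunction &f : m.functions) {
    std::map<std::string, std::string>::const_iterator it =
        sources.find(f.name);
    assert(it != sources.end() && "every declaration needs a source body");
    f.hasBody = true;                         // insert the body
    f.body = it->second;
    log.push_back({f.name, countBodies(m)});  // pretend codegen happens here
    f.hasBody = false;                        // delete the body again
    f.body.clear();
  }
  return log;
}
```

With two functions f and g, where g calls f, every record reports exactly
one visible body. A codegen step for g that tried to read the body of f
would therefore find only a declaration.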

> Seems that the only way
> to achieve what I have in mind is to write the machine code for all
> the functions to a file, and then load it and process it in a separate
> tool (possibly based on LLVM-mc). Is this correct?

I think you should do this kind of thing as a module level pass on the
LLVM IR, i.e. before doing code generation.
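
As a rough illustration of the shape such a module-level pass might take,
the sketch below models it in plain C++ without the LLVM API. All names
(ToyFunction, collectCallCounts, runOnToyModule) are hypothetical. The
aggregate property used here (how many call sites target each function) is
only an assumed example, since the thread does not say which properties are
needed. In a real pass, the same two phases would presumably live in the
runOnModule method of a ModulePass.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an IR function (hypothetical, not an LLVM type).
struct ToyFunction {
  std::string name;
  std::vector<std::string> callees;  // one entry per call site
  bool inlineCandidate;              // set by the "optimization" phase
};

typedef std::map<std::string, int> CallCounts;

// Phase 1: walk every function in the module and collect an aggregate
// property, here the number of call sites that target each function.
CallCounts collectCallCounts(const std::vector<ToyFunction> &module) {
  CallCounts counts;
  for (const ToyFunction &f : module)
    for (const std::string &callee : f.callees)
      ++counts[callee];
  return counts;
}

// Phase 2: revisit each function and use the whole-module information.
// Functions called from exactly one site are flagged as inline candidates.
// Returns true if anything changed, mirroring the usual pass convention.
bool runOnToyModule(std::vector<ToyFunction> &module) {
  const CallCounts counts = collectCallCounts(module);
  bool changed = false;
  for (ToyFunction &f : module) {
    CallCounts::const_iterator it = counts.find(f.name);
    const bool single = (it != counts.end() && it->second == 1);
    if (single != f.inlineCandidate) {
      f.inlineCandidate = single;
      changed = true;
    }
  }
  return changed;
}
```

The point of the design is that phase 1 sees all functions before phase 2
touches any of them, which is what a per-function pass cannot guarantee.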

Ciao, Duncan.

I will consider whether we can run at least part of the pass on the IR
form. Thanks for the input!

Lorenzo